Joining Factor Levels of Two Columns

You want the factors to include all the unique names from both columns.

col1 <- factor(c("Bob", "Tom", "Frank", "Jim", "Tom"))
col2 <- factor(c("John", "Bob", "Jane", "Bob", "Bob"))
mynames <- unique(c(levels(col1), levels(col2)))
fcol1 <- factor(col1, levels = mynames)
fcol2 <- factor(col2, levels = mynames)

EDIT: a little nicer if you replace the third line with this:

mynames <- union(levels(col1), levels(col2))
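To check the result, here is a small sketch that reuses the two vectors above and inspects the shared levels. Nothing new is assumed beyond base R; the level order follows union(), which keeps the levels of col1 first and then appends the ones only found in col2.

```r
col1 <- factor(c("Bob", "Tom", "Frank", "Jim", "Tom"))
col2 <- factor(c("John", "Bob", "Jane", "Bob", "Bob"))

# union() drops duplicates, so "Bob" appears only once in the combined levels
mynames <- union(levels(col1), levels(col2))

# Re-create both factors against the same set of levels
fcol1 <- factor(col1, levels = mynames)
fcol2 <- factor(col2, levels = mynames)

# Both factors now share one level set; the values themselves are unchanged
print(levels(fcol1))
print(identical(levels(fcol1), levels(fcol2)))
print(table(fcol2))  # levels that never occur in col2 show a count of 0
```

Because the two factors share levels, they can now be compared or stacked without labels being mismatched.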

How to group by factor levels from two columns and output new column that shows sum of each level in R?

Instead of grouping by 'RawDate', group by 'ID' and 'YEAR', then take the sum of a logical vector. Summing Renewal == 'WON' counts the rows in each group where the comparison is TRUE.

library(dplyr)
complete_df %>%
  group_by(ID, YEAR) %>%
  mutate(TotalWon = sum(Renewal == 'WON'), TotalLost = sum(Renewal == 'LOST'))

If we need a summarised output (one row per group instead of one row per original record), use summarise instead of mutate.
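A minimal sketch of the summarise variant follows. The data frame here is made up for illustration: only the column names ID, YEAR and Renewal come from the answer, and the values are assumptions.

```r
library(dplyr)

# Hypothetical data; column names follow the answer, the values are invented
complete_df <- data.frame(
  ID      = c(1, 1, 1, 2, 2),
  YEAR    = c(2019, 2019, 2020, 2019, 2019),
  Renewal = c("WON", "LOST", "WON", "LOST", "LOST")
)

# summarise() collapses each (ID, YEAR) group to a single row
summary_df <- complete_df %>%
  group_by(ID, YEAR) %>%
  summarise(
    TotalWon  = sum(Renewal == "WON"),   # TRUE counts as 1, FALSE as 0
    TotalLost = sum(Renewal == "LOST"),
    .groups   = "drop"                   # return an ungrouped result
  )

print(summary_df)
```

With mutate the totals would be repeated on every row of the group; with summarise there is one row per ID and YEAR combination.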

How to collapse/join selected factor levels across two columns in R

First rename a, c, and d to x, then sum value by dimensions and aspects.

Reading the data:

df <- data.frame(dimensions = x, aspects = y, value = z, stringsAsFactors = FALSE)

Base R solution:

# if you read the data my way the following line is unnecessary
# df$aspects <- as.character(df$aspects)
df$aspects[df$aspects %in% c("a", "c", "d")] <- "x"
aggregate(value ~ ., df, sum)

Result:

  dimensions aspects value
1         s1       b     2
2         s2       b     7
3         s3       b    12
4         s1       e     5
5         s2       e    10
6         s3       e    15
7         s1       x     8
8         s2       x    23
9         s3       x    38
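The answer does not show the vectors x, y and z, so the following end-to-end sketch fills them with assumed values. They were chosen only because they reproduce the result above (three dimensions, five aspects, values 1 to 15); the real inputs may have differed.

```r
# Assumed inputs, not taken from the original question
x <- rep(c("s1", "s2", "s3"), each = 5)
y <- rep(c("a", "b", "c", "d", "e"), times = 3)
z <- 1:15

df <- data.frame(dimensions = x, aspects = y, value = z, stringsAsFactors = FALSE)

# Collapse the selected aspects into a single label "x"
df$aspects[df$aspects %in% c("a", "c", "d")] <- "x"

# Sum value over every remaining combination of dimensions and aspects
res <- aggregate(value ~ ., df, sum)
print(res)
```

Keeping aspects as character (stringsAsFactors = FALSE) matters here: assigning "x" to a factor column that lacks that level would produce NA with a warning.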

data.table solution

library(data.table)
DT <- setDT(df)  # converts df to a data.table by reference
DT[aspects %in% c("a", "c", "d"), aspects := "x"]
DT[, sum(value), by = .(dimensions, aspects)]

Results in

   dimensions aspects V1
1:         s1       x  8
2:         s1       b  2
3:         s1       e  5
4:         s2       x 23
5:         s2       b  7
6:         s2       e 10
7:         s3       x 38
8:         s3       b 12
9:         s3       e 15

How to combine two columns of factors into one column without changing the factor levels into numbers

Factors are integer codes that happen to have labels. When you combine factors, you can end up combining those integer codes instead of the labels (this is what c() does on factors in R versions before 4.1.0). This can often trip a person up.

If you want their labels, you must coerce them to strings, using as.character:

student.list <- c(as.character(dataset1[, 2]),
                  as.character(dataset2[, 2]))

If you want to get that back to factors, wrap it all in as.factor (can be all in one line, or split into two lines for easier reading)

student.list <- c(as.character(dataset1[, 2]), as.character(dataset2[, 2]))
student.list <- as.factor(student.list)
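To make the difference between codes and labels concrete, here is a small sketch. dataset1 and dataset2 are hypothetical stand-ins (the originals are not shown in the question); the only assumption carried over is that column 2 holds the factor of interest.

```r
# Hypothetical stand-ins for the two data sets; column 2 is a factor
dataset1 <- data.frame(id = 1:2, student = factor(c("Ann", "Raj")))
dataset2 <- data.frame(id = 3:4, student = factor(c("Zoe", "Ann")))

# The integer codes stored behind the labels
codes1 <- as.integer(dataset1[, 2])
codes2 <- as.integer(dataset2[, 2])
print(codes1)
print(codes2)  # "Zoe" and "Ann" get codes relative to dataset2's own levels

# Going through character keeps the labels, whatever the codes were
student.list <- c(as.character(dataset1[, 2]), as.character(dataset2[, 2]))
student.list <- as.factor(student.list)
print(student.list)
print(levels(student.list))
```

Note that the same code can mean different labels in the two columns, which is why the codes cannot simply be concatenated.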

Is it possible to merge two ordered factor columns in a way that prioritizes the higher level factor?

Since you have ordered factors with the same levels, you can take the element-wise maximum of the two columns with pmax.

library(dplyr)
left_join(x, y, by = "id") %>% transmute(id, var = pmax(var.x, var.y))

#      id var
#   <int> <ord>
# 1     1 whole
# 2     2 whole
# 3     3 half
# 4     4 whole
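The data frames x and y are not shown in the answer, so this sketch builds hypothetical ones. The level order none < half < whole and the individual values are assumptions, picked so that the outcome matches the output above.

```r
library(dplyr)

# Assumed level order, lowest to highest
lv <- c("none", "half", "whole")

# Hypothetical inputs: one ordered factor per data frame, same levels in both
x <- data.frame(id = 1:4,
                var = factor(c("whole", "half", "none", "whole"),
                             levels = lv, ordered = TRUE))
y <- data.frame(id = 1:4,
                var = factor(c("half", "whole", "half", "none"),
                             levels = lv, ordered = TRUE))

# After the join the two columns are named var.x and var.y;
# pmax() keeps, row by row, whichever level ranks higher
merged <- left_join(x, y, by = "id") %>%
  transmute(id, var = pmax(var.x, var.y))

print(merged)
```

This relies on both columns being ordered factors with identical levels; with unordered factors the comparison behind pmax is not meaningful.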

