How to Concatenate Factors, Without Them Being Converted to Integer Level

How to concatenate factors, without them being converted to integer level?

From the R Mailing list:

unlist(list(facs[1 : 3], facs[4 : 5]))
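As a quick check of the mailing-list tip, here is a minimal sketch with made-up data. The contents of facs are an assumption, since the original vector is not shown.

```r
# Hypothetical sample data; 'facs' stands in for the vector used above.
facs <- factor(c("low", "high", "mid", "high", "low"))

# unlist() on a list whose elements are all factors returns a factor
# instead of the underlying integer codes.
combined <- unlist(list(facs[1:3], facs[4:5]))

is.factor(combined)     # the result is still a factor
as.character(combined)  # labels are preserved in their original order
```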

To 'cbind' factors, do

data.frame(facs[1 : 3], facs[4 : 5])

How to combine two columns of factors into one column without changing the factor levels into numbers

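Factors are integer codes that happen to have labels. In R versions before 4.1.0, combining factors with c() combines those integer codes rather than the labels, which often trips people up. (Since R 4.1.0, c() on factors returns a factor whose levels are the union of the inputs' levels.)

If you want the labels, coerce the factors to strings with as.character: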

student.list <- c(as.character(dataset1[, 2]),
                  as.character(dataset2[, 2]))

If you want to turn the result back into a factor, wrap it in as.factor (either all on one line, or split into two lines for easier reading):

student.list <- c(as.character(dataset1[, 2]), as.character(dataset2[, 2]))
student.list <- as.factor(student.list)
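The steps above can be tried end to end with a small sketch. The two data frames and their column names are hypothetical stand-ins for dataset1 and dataset2, which are not shown in the original question.

```r
# Hypothetical data: the second column of each data frame is a factor.
dataset1 <- data.frame(id = 1:2, student = factor(c("Ann", "Bob")))
dataset2 <- data.frame(id = 3:4, student = factor(c("Cy", "Ann")))

# Coerce to character first, concatenate, then rebuild a single factor.
student.list <- c(as.character(dataset1[, 2]), as.character(dataset2[, 2]))
student.list <- as.factor(student.list)

levels(student.list)        # one shared set of levels for both inputs
as.character(student.list)  # labels, not integer codes
```

Going through character makes the result independent of how c() treats factors in the installed R version.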

How to convert factor levels to integer in R

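We can use match with the unique elements of each column:

library(dplyr)
# With dplyr < 1.0.0: mutate_all(funs(match(., unique(.))))
dat %>%
  mutate(across(everything(), ~ match(.x, unique(.x))))
#   ID Season Year Weekday
# 1  1      1    1       1
# 2  2      1    2       2
# 3  3      2    1       1
# 4  4      2    2       3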
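The same match/unique idea also works in base R. The sketch below assumes a made-up dat, since the original data frame is not shown; its values were chosen only so that the result has the same shape as the output above.

```r
# Hypothetical input; the real 'dat' is not part of the original answer.
dat <- data.frame(
  ID      = c("a", "b", "c", "d"),
  Season  = c("Winter", "Winter", "Summer", "Summer"),
  Year    = c(2015, 2016, 2015, 2016),
  Weekday = c("Mon", "Tue", "Mon", "Wed"),
  stringsAsFactors = TRUE
)

# For each column, replace every value by the position of its first
# appearance among the column's unique values.
res <- dat
res[] <- lapply(dat, function(x) match(x, unique(x)))
res
```

Note that match(x, unique(x)) numbers values in order of first appearance, whereas as.integer() on a factor returns codes in level order, so the two can differ.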

Combine multiple factor columns into a single numeric column

Here is another base R method: use sub to extract the numeric part of each column name, then replace every non-blank value in that column with that number (blank values become 0).

df[] <- t(as.integer(sub(".*?(\\d+)", "\\1", names(df))) * t(df != ""))
df
#   q.82 q.77 q.72
# 1    0   77    0
# 2   82    0    0
# 3   82    0    0
# 4    0    0   72
# 5    0    0   72

Then, if you want a single column, sum the values row-wise with rowSums:

df$q <- rowSums(df)
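To make the two steps reproducible, here is a sketch with an assumed input. The original df is not shown, so the "x" markers and blank strings below are hypothetical values chosen to match the printed result.

```r
# Hypothetical input: a non-blank entry marks the column that applies.
df <- data.frame(
  q.82 = c("",  "x", "x", "",  ""),
  q.77 = c("x", "",  "",  "",  ""),
  q.72 = c("",  "",  "",  "x", "x"),
  stringsAsFactors = FALSE
)

# Numeric part of each column name, e.g. "q.82" gives 82.
nums <- as.integer(sub(".*?(\\d+)", "\\1", names(df)))

# t(df != "") has one row per original column, so multiplying by 'nums'
# scales each original column by its own number; t() restores the shape.
df[] <- t(nums * t(df != ""))

# Collapse the columns into a single numeric column.
df$q <- rowSums(df)
df
```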

Concatenating two vectors in R

We need to convert the factors to character first:

c(as.character(a), as.character(b))

We get numbers instead of labels because a factor is stored as an integer vector of level codes. In R versions before 4.1.0, c() drops the factor attributes when concatenating, so only those integer codes remain.
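The storage mode can be inspected directly. In this small sketch, a and b are hypothetical vectors standing in for the ones in the question.

```r
# Hypothetical factors to concatenate.
a <- factor(c("x", "y"))
b <- factor(c("y", "z"))

typeof(a)      # "integer": the underlying storage mode of a factor
as.integer(a)  # the level codes that are stored
levels(a)      # the labels those codes point to

# Concatenating the labels rather than the codes:
ab <- c(as.character(a), as.character(b))
ab
```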

Combining factor levels in R 3.2.1

I've always found it easiest (less typing and less headache) to convert to character and back for these sorts of operations. Keeping with your as.data.frame.table and using replace to do the replacement of the low-frequency levels:

whittle <- function(data, cutoff_val) {
  tab <- as.data.frame.table(table(data))
  factor(replace(as.character(data), data %in% tab$data[tab$Freq < cutoff_val], "Other"))
}

Testing on some sample data:

state <- factor(c("MD", "MD", "MD", "VA", "TX"))
whittle(state, 2)
# [1] MD MD MD Other Other
# Levels: MD Other

How to convert a factor variable to numeric while preserving the numbers in R


dv$ICPSR <- as.numeric(as.character(dv$ICPSR))

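Transform your factor to a character vector before transforming it into a numeric vector. Calling as.numeric directly on the factor would return its internal level codes instead of the numbers shown in the labels.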
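A short sketch shows the difference between the two conversions. The factor f is a made-up example, not the ICPSR column from the question.

```r
# Hypothetical factor whose labels look like numbers.
f <- factor(c("10", "20", "10", "5"))

# Direct conversion returns the level codes, not the labelled values.
codes <- as.numeric(f)

# Going through character recovers the numbers in the labels.
values <- as.numeric(as.character(f))

# Equivalent idiom: convert the levels once, then index by the factor.
values2 <- as.numeric(levels(f))[f]

codes
values
```

The levels(f) variant converts each distinct level only once, which can matter for long vectors with few levels.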

Convert factor to integer

You can combine the two functions; coerce to characters thence to numerics:

> fac <- factor(c("1","2","1","2"))
> as.numeric(as.character(fac))
[1] 1 2 1 2

Converting factor variable to numeric, and from numeric back to factor

Before coercing the factors to numeric, create a lookup table of numeric-factor label pairs. At the end of your workflow, merge the factor labels back into your data.

library(dplyr)
data(warpbreaks)
original <- warpbreaks

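# Lookup table: one row per distinct (label, numeric code) combination
value_label_map <- warpbreaks %>%
  select(wool, tension) %>%
  mutate(wool_num = as.numeric(wool), tension_num = as.numeric(tension)) %>%
  distinct()

# Work with the numeric codes
warpbreaks <- warpbreaks %>%
  mutate(wool = as.numeric(wool), tension = as.numeric(tension))

# Merge the factor labels back in; the restored factors get the .y suffix
warpbreaks <- left_join(warpbreaks, value_label_map,
                        by = c("wool" = "wool_num", "tension" = "tension_num"))

# Check that the restored factors match the originals
identical(original$wool, warpbreaks$wool.y)
identical(original$tension, warpbreaks$tension.y)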
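For a single column, a lighter alternative to the lookup-table join is to keep the level labels and index them with the codes. This is a base R sketch of a different technique, not the method from the answer above; the names wool_levels, wool_codes, and wool_restored are illustrative.

```r
# Built-in dataset; 'wool' is a factor column.
data(warpbreaks)

# Save the labels before converting to numeric codes.
wool_levels <- levels(warpbreaks$wool)
wool_codes  <- as.integer(warpbreaks$wool)

# ... numeric work on wool_codes would happen here ...

# Rebuild the factor by indexing the saved labels with the codes.
wool_restored <- factor(wool_levels[wool_codes], levels = wool_levels)

all(wool_restored == warpbreaks$wool)
```

Passing levels = wool_levels keeps the original level order even if some levels no longer occur in the data.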

