Dummy Variables to Single Categorical Variable (Factor) in R

A quick solution would be something like

Res <- cbind(df[1], VALUE = factor(max.col(df[-1]), ordered = TRUE))
Res
#   Pre VALUE
# 1   1     6
# 2   1     5
# 3   1     5
# 4   1     5

str(Res)
# 'data.frame':  4 obs. of  2 variables:
# $ Pre  : int  1 1 1 1
# $ VALUE: Ord.factor w/ 2 levels "5"<"6": 2 1 1 1

OR, if you want the actual names of the columns (as pointed out by @BondedDust), you can use the same approach to extract them

factor(names(df)[1 + max.col(df[-1])], ordered = TRUE)
# [1] VALUE_6 VALUE_5 VALUE_5 VALUE_5
# Levels: VALUE_5 < VALUE_6

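OR you can use your own which strategy in the following way (btw, which is vectorized, so there is no need to use apply with a margin of 1 on it). Note that which reports matches in column-major order, so sort them by row first to keep the result aligned with the rows of df

w <- which(df[-1] == 1, arr.ind = TRUE)
cbind(df[1], VALUE = factor(w[order(w[, "row"]), "col"], ordered = TRUE))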

OR you can do matrix multiplication (contributed by @akrun)

cbind(df[1], VALUE = factor(as.matrix(df[-1]) %*% seq_along(df[-1]), ordered = TRUE))
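As a quick sanity check, the three index-based approaches above can be compared on a small data frame. This is only a sketch: the original df is not shown, so the data below (a Pre column plus three one-hot columns named VALUE_1 to VALUE_3) is made up, and it assumes exactly one 1 per row.

```r
# Hypothetical one-hot data: exactly one 1 per row across the dummy columns.
df <- data.frame(
  Pre     = c(1, 1, 1, 1),
  VALUE_1 = c(0, 1, 0, 0),
  VALUE_2 = c(1, 0, 0, 1),
  VALUE_3 = c(0, 0, 1, 0)
)
dummies <- df[-1]

# 1. max.col: index of the largest value in each row.
idx_maxcol <- max.col(dummies, ties.method = "first")

# 2. which: matches come back in column-major order, so reorder by row.
w <- which(dummies == 1, arr.ind = TRUE)
idx_which <- unname(w[order(w[, "row"]), "col"])

# 3. Matrix multiplication: each row picks out the position of its single 1.
idx_matmul <- as.vector(as.matrix(dummies) %*% seq_along(dummies))

# All three give the same column positions, which map back to column names.
labels <- names(dummies)[idx_maxcol]
labels
# [1] "VALUE_2" "VALUE_1" "VALUE_3" "VALUE_2"
```

If a row could contain more than one 1, or none at all, these approaches stop agreeing, so it is worth checking rowSums(dummies) == 1 before relying on any of them.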

Convert dummy variables to single categorical in R?

Loop over the selected columns by row (MARGIN = 1), subset the column names where the value is 1 and paste them together

df$z <-  apply(df[c('a', 'b', 'c')], 1, function(x) toString(names(x)[x ==1]))
df$z
#[1] "b"       "b, c"    "b"       "a, b, c" "a"       ""        "b"       ""        "a"       ""

If we want to change the "" to '0'

df$z[df$z == ''] <- '0'
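The two steps above can be run end to end on a small example. The data frame here is made up for illustration (the original df is not shown); the column names a, b and c follow the answer.

```r
# Hypothetical multi-hot data: a row may have zero, one, or several 1s.
df <- data.frame(
  a = c(0, 1, 1, 0),
  b = c(1, 1, 0, 0),
  c = c(0, 1, 0, 0)
)
cols <- c("a", "b", "c")

# For each row, keep the names of the columns equal to 1 and join them.
df$z <- apply(df[cols], 1, function(x) toString(names(x)[x == 1]))

# Rows with no 1 at all produce an empty string; recode those as "0".
df$z[df$z == ""] <- "0"

df$z
# [1] "b"       "a, b, c" "a"       "0"
```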

For a solution with purrr and dplyr (load both packages first):

df %>% mutate(z = pmap_chr(select(., a, b, c), ~  {v1 <- c(...); toString(names(v1)[v1 == 1])}))

Convert various dummy/logical variables into a single categorical variable/factor from their name in R

Try:

library(dplyr)
library(tidyr)

df %>% gather(type, value, -id) %>% na.omit() %>% select(-value) %>% arrange(id)

Which gives:

#  id       type
#1  1 conditionA
#2  2 conditionB
#3  3 conditionC
#4  4 conditionD
#5  5 conditionA

Update

To handle the case you detailed in the comments, you could do the operation on the desired portion of the data frame and then left_join() the other columns:

df %>% 
  select(starts_with("condition"), id) %>% 
  gather(type, value, -id) %>% 
  na.omit() %>% 
  select(-value) %>% 
  left_join(., df %>% select(-starts_with("condition")), by = "id") %>%
  arrange(id)
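In current versions of tidyr, gather() is superseded by pivot_longer(), which leaves columns that are not pivoted in place, so the left_join() step may not be needed. The following is a sketch under assumptions: the data frame and its extra age column are invented for illustration, and each id is assumed to have exactly one non-NA condition column.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: one non-NA condition flag per id, plus an extra column.
df <- data.frame(
  id         = 1:3,
  age        = c(30, 41, 25),
  conditionA = c(1, NA, NA),
  conditionB = c(NA, 1, NA),
  conditionC = c(NA, NA, 1)
)

res <- df %>%
  # Pivot only the condition columns; id and age are carried along as-is.
  pivot_longer(starts_with("condition"),
               names_to = "type", values_to = "value",
               values_drop_na = TRUE) %>%
  select(-value) %>%
  arrange(id)

res
```

The values_drop_na = TRUE argument plays the role of na.omit() in the gather() version.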

Gathering multiple dummy variables as one categorical variable in R

If there is always a 1 and it is not repeated in a single row, then use max.col to return the index of the max value in the row and with that index, subset the names of the dataset

df$Category <- names(df)[-1][max.col(df[-1])]
df$Category
#[1] "Groceries"      "Utilities"      "Consumables"    "Transportation" "Entertainment"  "Misc"
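When the "exactly one 1 per row" condition is not guaranteed, two details matter: max.col breaks ties at random by default, and a row of all zeros still gets an index. A minimal sketch of a more defensive variant follows; the data frame and the Item column are hypothetical.

```r
# Hypothetical data: the last row has no category flagged at all.
df <- data.frame(
  Item           = c("milk", "power", "bus", "unknown"),
  Groceries      = c(1, 0, 0, 0),
  Utilities      = c(0, 1, 0, 0),
  Transportation = c(0, 0, 1, 0)
)
dummies <- df[-1]

# ties.method = "first" makes the result deterministic.
idx <- max.col(dummies, ties.method = "first")
df$Category <- names(dummies)[idx]

# Rows without exactly one 1 have no well-defined category; mark them NA.
df$Category[rowSums(dummies) != 1] <- NA

df$Category
# [1] "Groceries"      "Utilities"      "Transportation" NA
```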

Transform dummy variable into categorical variable

With the tidyverse you could also do:

data %>% 
  pivot_longer(-ID) %>%
  group_by(ID) %>%
  slice(which.max(as.integer(factor(name))*value))%>%
  mutate(name = if_else(value == 0, 'other',name), value= NULL)
# # A tibble: 8 x 2
# # Groups:   ID [8]
#      ID name
#   <int> <chr>
# 1     1 Diag1
# 2     2 Diag2
# 3     3 Multiple.Diag
# 4     4 Multiple.Diag
# 5     5 Diag1
# 6     6 Diag3
# 7     7 Multiple.Diag
# 8     8 other

Reconstruct a categorical variable from dummies in R

You can do this with data.table

id_cols = c("x1", "x2") 
data.table::melt.data.table(data = dt, id.vars = id_cols, 
                            na.rm = TRUE, 
                            measure = patterns("dummy"))

Example:

library(data.table)
# Named example_dt rather than t, which would mask base::t()
example_dt = data.table(dummy_a = c(1, 0, 0), dummy_b = c(0, 1, 0), dummy_c = c(0, 0, 1), id = c(1, 2, 3))
data.table::melt.data.table(data = example_dt, 
                            id.vars = "id", 
                            measure = patterns("dummy_"), 
                            na.rm = TRUE)[value == 1, .(id, variable)]

Output

   id variable
1:  1  dummy_a
2:  2  dummy_b
3:  3  dummy_c

It's even easier if you replace 0 with NA: na.rm = TRUE in melt will then drop every row with NA, so the value == 1 filter is no longer needed
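The NA variant can be sketched as follows. The table and column names are made up, and using replace() for the recoding step is just one possible choice, not something taken from the original answer.

```r
library(data.table)

# Hypothetical one-hot table with 0/1 dummies.
dt <- data.table(id = c(1, 2, 3),
                 dummy_a = c(1, 0, 0),
                 dummy_b = c(0, 1, 0),
                 dummy_c = c(0, 0, 1))
dummy_cols <- grep("^dummy_", names(dt), value = TRUE)

# Recode 0 as NA in the dummy columns (modifies dt by reference).
dt[, (dummy_cols) := lapply(.SD, function(x) replace(x, x == 0, NA)),
   .SDcols = dummy_cols]

# na.rm = TRUE now drops the former zeros, leaving one row per id.
long <- melt(dt, id.vars = "id", measure.vars = dummy_cols, na.rm = TRUE)
res <- long[order(id), .(id, variable)]
res
```

The variable column comes back as a factor whose levels are the dummy column names, which is often exactly the categorical variable being reconstructed.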

How to create dummy variables from a factor?

Just add another condition

four_six <- ifelse(cyl == 4, 0, ifelse(cyl==6, 1, NA))

or use dplyr::case_when

four_six <- dplyr::case_when(cyl==4 ~ 0, cyl==6 ~ 1)
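To see what the nested ifelse produces, it can be tried on the built-in mtcars data, assuming cyl refers to mtcars$cyl (the original does not say where cyl comes from). The named-vector lookup shown alongside is an alternative added here for comparison, not part of the original answer.

```r
cyl <- mtcars$cyl

# Nested ifelse: 4 -> 0, 6 -> 1, anything else (here 8) -> NA.
four_six <- ifelse(cyl == 4, 0, ifelse(cyl == 6, 1, NA))

# Alternative sketch: look the codes up in a named vector;
# values with no matching name come back as NA.
lookup <- c(`4` = 0, `6` = 1)
four_six_lookup <- unname(lookup[as.character(cyl)])

# Counts per code, including the NAs from the 8-cylinder cars.
table(four_six, useNA = "ifany")
```

Both versions leave the 8-cylinder cars as NA, so they drop out of any model that uses four_six as a predictor.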
