Dummy variables to single categorical variable (factor) in R
A quick solution would be something like
Res <- cbind(df[1], VALUE = factor(max.col(df[-1]), ordered = TRUE))
Res
# Pre VALUE
# 1 1 6
# 2 1 5
# 3 1 5
# 4 1 5
str(Res)
# 'data.frame': 4 obs. of 2 variables:
# $ Pre : int 1 1 1 1
# $ VALUE: Ord.factor w/ 2 levels "5"<"6": 2 1 1 1
Or, if you want the actual names of the columns (as pointed out by @BondedDust), you can use the same approach to extract them:
factor(names(df)[1 + max.col(df[-1])], ordered = TRUE)
# [1] VALUE_6 VALUE_5 VALUE_5 VALUE_5
# Levels: VALUE_5 < VALUE_6
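Or you can use your own which strategy in the following way (by the way, which is vectorized, so there is no need to use apply with a margin of 1 on it):
# which(..., arr.ind = TRUE) lists matches column by column, so transpose first to get them in row order
cbind(df[1], VALUE = factor(which(t(df[-1] == 1), arr.ind = TRUE)[, 1], ordered = TRUE))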
Or you can do matrix multiplication (contributed by @akrun):
cbind(df[1], VALUE = factor(as.matrix(df[-1]) %*% seq_along(df[-1]), ordered = TRUE))
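As a quick sanity check, here is a minimal sketch on a made-up data frame showing that the three approaches give the same column index per row. The df below and its column names are assumptions for illustration, not the asker's data, and the sketch assumes each row contains exactly one 1. ties.method = "first" is added so that max.col does not break ties at random.

```r
# Hypothetical one-hot data; names and values are illustrative only
df <- data.frame(Pre = c(1, 1, 1, 1),
                 VALUE_1 = c(0, 1, 0, 0),
                 VALUE_2 = c(1, 0, 0, 1),
                 VALUE_3 = c(0, 0, 1, 0))

# Index of the largest value in each row
by_max_col <- max.col(df[-1], ties.method = "first")

# Column index of each 1, taken row by row (note the transpose)
by_which <- unname(which(t(df[-1] == 1), arr.ind = TRUE)[, 1])

# Each 0/1 row times 1:k leaves only the position of the 1
by_matmul <- as.vector(as.matrix(df[-1]) %*% seq_along(df[-1]))

by_max_col
names(df)[1 + by_max_col]
```

which(..., arr.ind = TRUE) reports matches column by column, which is why the sketch transposes the logical matrix before reading off the indices.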
Convert dummy variables to single categorical in R?
Loop over the selected columns by row (MARGIN = 1), subset the column names where the value is 1, and paste them together:
df$z <- apply(df[c('a', 'b', 'c')], 1, function(x) toString(names(x)[x ==1]))
df$z
#[1] "b" "b, c" "b" "a, b, c" "a" "" "b" "" "a" ""
If we want to change the "" to '0':
df$z[df$z == ''] <- '0'
For a solution with purrr and dplyr:
df %>% mutate(z = pmap_chr(select(., a, b, c), ~ {v1 <- c(...); toString(names(v1)[v1 == 1])}))
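To see the base R steps end to end, here is a small sketch on invented data. The three-row df is an assumption chosen to cover the single-match, multi-match, and no-match cases.

```r
# Hypothetical data: one match, several matches, no match
df <- data.frame(a = c(0, 1, 0),
                 b = c(1, 1, 0),
                 c = c(0, 1, 0))

# For each row, keep the names of the columns equal to 1 and join them
df$z <- apply(df[c("a", "b", "c")], 1,
              function(x) toString(names(x)[x == 1]))

# Rows with no 1 at all come back as "", so recode them
df$z[df$z == ""] <- "0"
df$z
```

toString() joins with ", " and returns an empty string for a zero-length input, which is what makes the final recoding step necessary.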
Convert various dummy/logical variables into a single categorical variable/factor from their name in R
Try:
library(dplyr)
library(tidyr)
df %>% gather(type, value, -id) %>% na.omit() %>% select(-value) %>% arrange(id)
Which gives:
# id type
#1 1 conditionA
#2 2 conditionB
#3 3 conditionC
#4 4 conditionD
#5 5 conditionA
Update
To handle the case you detailed in the comments, you could do the operation on the desired portion of the data frame and then left_join() the other columns:
df %>%
select(starts_with("condition"), id) %>%
gather(type, value, -id) %>%
na.omit() %>%
select(-value) %>%
left_join(., df %>% select(-starts_with("condition"))) %>%
arrange(id)
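If tidyr is not available, the same result can be sketched in base R. This swaps in a different technique (max.col over a non-NA indicator matrix) rather than reproducing gather(), and it assumes that every row has exactly one non-NA condition column. The data and the helper names cond_cols, present, and res are made up for illustration.

```r
# Hypothetical data: each id has exactly one non-NA condition column
df <- data.frame(id = 1:3,
                 conditionA = c(TRUE, NA, NA),
                 conditionB = c(NA, TRUE, NA),
                 conditionC = c(NA, NA, TRUE))

cond_cols <- grep("^condition", names(df), value = TRUE)

# 1 where a value is present, 0 where it is NA
present <- (!is.na(df[cond_cols])) * 1

# Guard the assumption of exactly one condition per row
stopifnot(all(rowSums(present) == 1))

res <- data.frame(id = df$id,
                  type = cond_cols[max.col(present, ties.method = "first")],
                  stringsAsFactors = FALSE)
res
```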
Gathering multiple dummy variables as one categorical variable in R
If there is always exactly one 1 in each row, use max.col to return the index of the max value in each row, then use that index to subset the names of the dataset:
df$Category <- names(df)[-1][max.col(df[-1])]
df$Category
#[1] "Groceries" "Utilities" "Consumables" "Transportation" "Entertainment" "Misc"
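Because this only works when each row has exactly one 1, it can help to check that before trusting the result. A minimal sketch, using invented data (the Amount column and the three categories are assumptions):

```r
# Hypothetical expenses table with one-hot category columns
df <- data.frame(Amount = c(10, 20, 30),
                 Groceries = c(1, 0, 0),
                 Utilities = c(0, 1, 0),
                 Misc = c(0, 0, 1))

dummies <- df[-1]

# Fail early if a row has no 1 or more than one 1
stopifnot(all(rowSums(dummies) == 1))

# ties.method = "first" avoids max.col's default random tie-breaking
df$Category <- names(dummies)[max.col(dummies, ties.method = "first")]
df$Category
```

max.col defaults to ties.method = "random", so on rows with ties (for example all zeros) the default can return a different column on each run.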
Transform dummy variable into categorical variable
With tidyverse you could also do:
data %>%
pivot_longer(-ID) %>%
group_by(ID) %>%
slice(which.max(as.integer(factor(name))*value))%>%
mutate(name = if_else(value == 0, 'other',name), value= NULL)
# A tibble: 8 x 2
# Groups: ID [8]
#      ID name
#   <int> <chr>
# 1     1 Diag1
# 2     2 Diag2
# 3     3 Multiple.Diag
# 4     4 Multiple.Diag
# 5     5 Diag1
# 6     6 Diag3
# 7     7 Multiple.Diag
# 8     8 other
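The pipeline above picks, per ID, the set column whose name sorts last and falls back to 'other' when nothing is set. A rough base R sketch of that logic follows. It uses invented data and assumes the dummy columns are already in alphabetical order, so that column position stands in for the factor level used above.

```r
# Hypothetical data; column names mirror the output shown above
data <- data.frame(ID = 1:4,
                   Diag1 = c(1, 0, 1, 0),
                   Diag2 = c(0, 1, 0, 0),
                   Multiple.Diag = c(0, 0, 1, 0))

dummies <- as.matrix(data[-1])

# Weight each column by its position so the right-most 1 wins
weighted <- sweep(dummies, 2, seq_len(ncol(dummies)), `*`)
pick <- max.col(weighted, ties.method = "first")

# Rows with no 1 at all are labelled "other"
data$name <- ifelse(rowSums(dummies) == 0, "other", colnames(dummies)[pick])
data$name
```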
Reconstruct a categorical variable from dummies in R
You can do this with data.table
id_cols = c("x1", "x2")
data.table::melt.data.table(data = dt, id.vars = id_cols,
na.rm = TRUE,
measure = patterns("dummy"))
Example:
library(data.table)
t = data.table(dummy_a = c(1, 0, 0), dummy_b = c(0, 1, 0), dummy_c = c(0, 0, 1), id = c(1, 2, 3))
data.table::melt.data.table(data = t,
id.vars = "id",
measure = patterns("dummy_"),
na.rm = T)[value == 1, .(id, variable)]
Output
id variable
1: 1 dummy_a
2: 2 dummy_b
3: 3 dummy_c
It's even easier if you replace 0 with NA, so that na.rm = TRUE in melt drops every row with NA.
How to create dummy variables from a factor?
Just add another condition
four_six <- ifelse(cyl == 4, 0, ifelse(cyl==6, 1, NA))
or use dplyr::case_when
four_six <- dplyr::case_when(cyl==4 ~ 0, cyl==6 ~ 1)
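For a runnable version of the nested ifelse, here is a small sketch that assumes cyl is the cylinder column of the built-in mtcars data (the question's actual data is not shown):

```r
# Assumption: cyl comes from the built-in mtcars dataset
cyl <- mtcars$cyl

# 0 for four cylinders, 1 for six, NA for anything else
four_six <- ifelse(cyl == 4, 0, ifelse(cyl == 6, 1, NA))

# Count each outcome, including the NAs
table(four_six, useNA = "ifany")
```

Any value that matches neither condition (here, the 8-cylinder cars) ends up as NA.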