How to Make a Dummy Variable in R

Simple way of creating a dummy variable in R

We can create a logical vector with df$Z < 0 and then coerce it to 0/1 by wrapping it in a unary +.

df$D <- +(df$Z < 0)

Or as @BenBolker mentioned, the canonical options would be

as.numeric(df$Z < 0)

or

as.integer(df$Z < 0)
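The three forms above give the same 0/1 values but not the same storage type. A minimal sketch, using a made-up vector z (not from the original question), makes the difference visible:

```r
# Hypothetical example values (not from the original post)
z <- c(-1.2, 0.5, -0.3, 2.0)

plus_form    <- +(z < 0)            # unary plus: logical -> integer
integer_form <- as.integer(z < 0)   # explicit integer
numeric_form <- as.numeric(z < 0)   # explicit double

print(plus_form)             # 1 0 1 0
print(typeof(plus_form))     # "integer"
print(typeof(numeric_form))  # "double"
```

For most modelling purposes the distinction does not matter, but identical() will treat the integer and double versions as different.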

Benchmarks

set.seed(42)
Z <- rnorm(1e7)
library(microbenchmark)
microbenchmark(akrun= +(Z < 0), etienne = ifelse(Z < 0, 1, 0),
times= 20L, unit='relative')
# Unit: relative
#     expr      min       lq     mean   median      uq      max neval
#    akrun  1.00000  1.00000 1.000000  1.00000 1.00000 1.000000    20
#  etienne 12.20975 10.36044 9.926074 10.66976 9.32328 7.830117    20

How to create a dummy variable in R by comparing the string values in a column

Your comment leads me to believe that you have a factor variable, so you should first convert it to a character vector and then to numeric. The "random values" you are seeing are the integer codes that index into the factor's levels attribute:

dfrm$newcol <- (as.numeric(as.character(dfrm$oldcol)) > 55) + 0

The "+ 0" converts the logical result to numeric. The parentheses around the comparison are required because + binds more tightly than >; without them the expression is read as ... > (55 + 0) and the result stays logical. You could also wrap the whole comparison in as.integer() or as.numeric().
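To see why the detour through as.character() matters, here is a small sketch with a made-up factor; the name oldcol mirrors the column above, but the values are assumptions for illustration only:

```r
# Hypothetical factor column whose labels are numbers
oldcol <- factor(c("60", "50", "70"))

print(as.numeric(oldcol))                # 2 1 3: level codes, not the values
print(as.numeric(as.character(oldcol)))  # 60 50 70: the actual values

# Comparison in parentheses first, then + 0 to get numeric 0/1
newcol <- (as.numeric(as.character(oldcol)) > 55) + 0
print(newcol)                            # 1 0 1
```

The level codes come out as 2 1 3 because the levels are sorted as "50", "60", "70".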

Create Dummy Variable if NA 1 else -1

If it is a real NA, we can use is.na() to detect the NA elements. It returns a logical vector that is TRUE for every NA and FALSE otherwise, which can be passed to ifelse() to set the values:

ifelse(is.na(Cox.Reg$active_task_avg_depth), 1, -1)

Another option is to turn the logical vector into a numeric index (FALSE + 1 is 1, TRUE + 1 is 2) and use it to pick from c(-1, 1):

c(-1, 1)[is.na(Cox.Reg$active_task_avg_depth) + 1]
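Both approaches can be checked side by side on a short vector. The vector depth below is a made-up stand-in for Cox.Reg$active_task_avg_depth:

```r
# Hypothetical stand-in for the column in the question
depth <- c(2.5, NA, 1.0, NA)

via_ifelse <- ifelse(is.na(depth), 1, -1)

# is.na(depth) + 1 is 1 for non-NA and 2 for NA,
# so it selects -1 or 1 from the lookup vector
via_index <- c(-1, 1)[is.na(depth) + 1]

print(via_ifelse)  # -1  1 -1  1
print(via_index)   # -1  1 -1  1
```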

Using model.matrix() to create dummy variables

We could also convert x1 to character, so that model.matrix() treats it as categorical and creates one column per distinct value:

> dataframe1$x1 <- as.character(dataframe1$x1)
> model.matrix(~x1 - 1, dataframe1)
  x11 x12 x13 x14 x15
1   1   0   0   0   0
2   0   1   0   0   0
3   0   0   1   0   0
4   0   0   0   1   0
5   0   0   0   0   1
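The effect of the conversion is easiest to see by comparing the design matrix before and after. This sketch assumes dataframe1 holds a numeric x1 with the values 1 to 5, which matches the output above but is not stated in the original:

```r
# Assumed contents of dataframe1 (not given in the original post)
dataframe1 <- data.frame(x1 = 1:5)

# Numeric x1: a single continuous column
numeric_mm <- model.matrix(~ x1 - 1, dataframe1)

# Character x1: one indicator column per distinct value
dataframe1$x1 <- as.character(dataframe1$x1)
character_mm <- model.matrix(~ x1 - 1, dataframe1)

print(ncol(numeric_mm))        # 1
print(colnames(character_mm))  # "x11" "x12" "x13" "x14" "x15"
```

Using factor() instead of as.character() should give the same columns, and lets you control the level order explicitly.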

How do I make a dummy variable in R?

With most of R's modelling tools that have a formula interface, you don't need to create dummy variables; the underlying code that handles and interprets the formula will do this for you. If you want a dummy variable for some other reason, there are several options. The easiest (IMHO) is to use model.matrix():

set.seed(1)
dat <- data.frame(sex = sample(c("male","female"), 10, replace = TRUE))

model.matrix( ~ sex - 1, data = dat)

which gives:

> dummy <- model.matrix( ~ sex - 1, data = dat)
> dummy
   sexfemale sexmale
1          0       1
2          0       1
3          1       0
4          1       0
5          0       1
6          1       0
7          1       0
8          1       0
9          1       0
10         0       1
attr(,"assign")
[1] 1 1
attr(,"contrasts")
attr(,"contrasts")$sex
[1] "contr.treatment"

> dummy[,1]
1 2 3 4 5 6 7 8 9 10
0 0 1 1 0 1 1 1 1 0

You can use either column of dummy as a numeric dummy variable; choose the column for whichever level you want coded as 1. dummy[,1] codes the female class as 1 and dummy[,2] codes the male class as 1.

Cast this as a factor if you want it to be interpreted as a categorical object:

> factor(dummy[, 1])
1 2 3 4 5 6 7 8 9 10
0 0 1 1 0 1 1 1 1 0
Levels: 0 1

But that defeats the purpose of a factor: what does 0 stand for again?
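As a sketch of the opening point that formula interfaces build the dummy columns themselves, the following fits a linear model directly on a character column. The response y and the data values are made up for illustration:

```r
# Hypothetical data: sex as plain character, y as a made-up response
set.seed(1)
dat <- data.frame(
  sex = rep(c("male", "female"), each = 5),
  y   = rnorm(10)
)

# No manual dummy needed: lm() expands sex via model.matrix() internally
fit <- lm(y ~ sex, data = dat)

print(names(coef(fit)))  # "(Intercept)" "sexmale"
```

The coefficient is named sexmale because female comes first alphabetically and is therefore used as the reference level.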

How to create a dummy variable in R using ifelse() command

Assuming your data frame is called df, you can create your dummy variable (Vegan) using:

df$Vegan <- ifelse(df$type == "Vegan", 1, 0) # type holds the type of restaurant

However, note that if type is stored as a factor, you can also get a coefficient for each type of restaurant (relative to the reference level) by fitting y = b0 + b1(reviews_number) + b2(type), i.e. y ~ reviews + type, as pointed out by @mlt.
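Both routes can be tried on a toy data set. Everything in the data frame below (the values of y, reviews, and type) is invented for illustration; only the column names follow the answer:

```r
# Hypothetical restaurant data (values are made up)
df <- data.frame(
  y       = c(3.5, 4.0, 4.5, 3.0, 4.2, 3.8),
  reviews = c(10, 25, 40, 5, 30, 18),
  type    = factor(c("Vegan", "Italian", "Vegan", "Thai", "Italian", "Thai"))
)

# Route 1: an explicit 0/1 dummy for one category
df$Vegan <- ifelse(df$type == "Vegan", 1, 0)
print(df$Vegan)          # 1 0 1 0 0 0

# Route 2: let the formula expand the factor; "Italian" is the reference level
fit <- lm(y ~ reviews + type, data = df)
print(names(coef(fit)))  # "(Intercept)" "reviews" "typeThai" "typeVegan"
```

If a different baseline is wanted, relevel(df$type, ref = "Vegan") changes the reference level before fitting.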

Loop over data.frame columns to generate dummy variable in R

dt[, 69:135] == 1 returns TRUE wherever a value in columns 69 to 135 is 1 and FALSE otherwise.

dt[, 178:244] == 2 returns TRUE wherever a value in columns 178 to 244 is 2 and FALSE otherwise.

You can combine the two logical matrices with an elementwise AND (&), which pairs column 69 with column 178, column 70 with column 179, and so on. Then take the row-wise sum and mark the row as 'yes' if it contains at least one TRUE.

dt$left_region <- ifelse(rowSums(dt[, 69:135] == 1 & dt[, 178:244] == 2) > 0, 'yes', 'no')
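The same logic can be traced on a tiny data frame. The column names and values below are invented, and the two column blocks are 1:2 and 3:4 instead of 69:135 and 178:244:

```r
# Hypothetical miniature version of dt: two columns per block
dt <- data.frame(
  a1 = c(1, 0, 1), a2 = c(0, 0, 1),   # first block, tested for == 1
  b1 = c(2, 2, 0), b2 = c(0, 2, 2)    # second block, tested for == 2
)

# Elementwise AND pairs a1 with b1 and a2 with b2
hit <- dt[, 1:2] == 1 & dt[, 3:4] == 2

# A row is 'yes' if at least one pair matched
dt$left_region <- ifelse(rowSums(hit) > 0, 'yes', 'no')
print(dt$left_region)  # "yes" "no" "yes"
```

Both blocks must have the same number of columns for the pairing to line up.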

