How to Change a Value Coded as "Yes" to a Value of 1 in R

How to change yes/no in a column to 1 and 0

To map every spelling of no ("N", "n", "no", "NO") to 0 and everything else (treated as yes) to 1, take the first character with substr(), convert it to upper case with toupper(), compare it with "N" using !=, and coerce the logical result to an integer with as.integer():

library(dplyr)
clean %>%
  mutate(flight = as.integer(toupper(substr(flight, 1, 1)) != "N"))

NOTE: This assumes the column holds only spellings of yes and no (such as "Yes", "Y", "NO", "no", "N", "n"). Any value that does not start with "N" or "n" becomes 1.

data

clean <- tibble(flight = c("No", "Yes", "YES", "Y", "no",
                           "No", "NO", "Y", "n", "y", "No"))
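
The same first-letter rule can be sketched in base R without dplyr. The helper name yes_no_to_int is hypothetical (it is not part of the original answer), and the input vector simply repeats the sample values above:

```r
# Hypothetical helper: code every spelling of "no" as 0 and anything else as 1.
yes_no_to_int <- function(x) {
  first <- toupper(substr(x, 1, 1))  # "N" for "No", "no", "NO", "n", ...
  as.integer(first != "N")           # TRUE -> 1, FALSE -> 0
}

flight <- c("No", "Yes", "YES", "Y", "no",
            "No", "NO", "Y", "n", "y", "No")

yes_no_to_int(flight)
# [1] 0 1 1 1 0 0 0 1 0 1 0
```

Because the rule only checks the first letter, an unexpected value such as "unknown" would also be coded as 1, so it is worth inspecting unique(flight) first.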

R program: changing a yes/no variable to 1/0 - variable 'medal' is not a factor

Apart from the obvious typo...

How can I change the yes/no to 0/1?

You need

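sport$medal <- factor(sport$medal, levels = c("no", "yes"))

This makes "no" the first level and "yes" the second, so modelling functions such as glm() code "no" as 0 and "yes" as 1. It is also what factor() does by default, because levels are sorted alphabetically and "n" comes ahead of "y".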
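
A small sketch of how the level order decides which value ends up as 0. The medal vector here is made up for illustration; the internal codes of a factor start at 1, so subtracting 1 gives 0/1:

```r
medal <- c("yes", "no", "no", "yes")  # made-up values, not the asker's data

# "no" listed first: "no" -> 0, "yes" -> 1
no_first <- factor(medal, levels = c("no", "yes"))
as.integer(no_first) - 1L
# [1] 1 0 0 1

# "yes" listed first: the coding is reversed
yes_first <- factor(medal, levels = c("yes", "no"))
as.integer(yes_first) - 1L
# [1] 0 1 1 0

# If only a plain 0/1 integer column is needed, a comparison is enough
medal_01 <- as.integer(medal == "yes")
```

Whichever level is listed first becomes the reference level, so it is worth checking levels() before fitting a model.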

How to translate values in a column to yes and no values for a multiple regression in R

I would create a new column - see two options below.

(NB: in lm() you don't have to prefix every covariate with SB_xlsx13$ if you pass the data frame once as the data = argument. This also makes the output easier to read.)

Tidyverse approach: mutate and case_when:

library(dplyr)
SB_xlsx13 <- SB_xlsx13 %>%
  mutate(dnr_d3 = case_when(dnrday <= 3 ~ "yes",
                            dnrday > 3 ~ "no",
                            TRUE ~ NA_character_))

MLR_3 <- lm(hospdead ~ dzclass + age + sex + num.co + sps + dnr_d3,
            data = SB_xlsx13)

Base R approach:

# Start from NA so rows with a missing dnrday stay NA
SB_xlsx13$dnr_d3 <- NA_character_
SB_xlsx13$dnr_d3[SB_xlsx13$dnrday <= 3] <- "yes"
SB_xlsx13$dnr_d3[SB_xlsx13$dnrday > 3] <- "no"
MLR_4 <- lm(hospdead ~ dzclass + age + sex + num.co + sps + dnr_d3,
            data = SB_xlsx13)
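
As a further sketch, the same recoding can be written in one line with ifelse(). The data frame df and its values are made up for illustration; only the dnrday and dnr_d3 column names come from the answer above:

```r
# Made-up stand-in for SB_xlsx13, including one missing value
df <- data.frame(dnrday = c(1, 3, 5, NA))

# ifelse() returns NA wherever the condition itself is NA
df$dnr_d3 <- ifelse(df$dnrday <= 3, "yes", "no")

df$dnr_d3
# [1] "yes" "yes" "no"  NA
```

lm() converts a character column like this to a factor, with the alphabetically first value ("no") as the reference level.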

Need to change data.table columns' values from Yes, No to 1, 0

You could do this with the set functionality in data.table:

1: Create a vector of the column names in which you want to change Yes to 1 and No to 0 (as @Frank said in the comments):

cols <- grep("^HasProduct", names(DT), value = TRUE)

2: Change the values with the following for(...) set(...) implementation (as rightly pointed out by @Arun in the comments, you can also use as.integer() instead of the unary +):

for (col in cols) set(DT, j = col, value = +(DT[[col]] == "Yes"))

This results in:

> DT
        x HasProduct1 HasProduct2 HasProduct3 HasProduct4 HasProduct5 HasProduct6 HasProduct7 HasProduct8 HasProduct9 HasProduct10
    1: 23           0           1           0           1           0           0           1           0           0            0
    2: 74           1           0           1           1           0           1           1           1           1            1
    3: 35           1           1           0           0           0           1           1           1           0            1
    4:  7           1           1           1           1           0           1           1           0           0            1
    5: 92           0           1           1           1           1           1           0           1           1            0
   ---
 9996: 56           0           0           1           0           1           0           0           0           1            0
 9997: 59           1           0           1           1           0           1           1           1           1            0
 9998: 85           0           1           0           1           1           1           1           1           1            1
 9999: 93           1           0           0           0           0           0           0           0           1            1
10000: 29           0           1           1           0           0           1           0           1           1            1

Timings:

   user  system elapsed
  0.007   0.000   0.007

Used data:

library(data.table)

set.seed(654)
product <- c("HasProduct1", "HasProduct2", "HasProduct3", "HasProduct4", "HasProduct5",
             "HasProduct6", "HasProduct7", "HasProduct8", "HasProduct9", "HasProduct10")
DT <- as.data.table(data.frame(
  x = sample(1:100),  # 100 values, recycled to fill the 10000 rows
  sapply(product, function(x) sample(c("Yes", "No"), 10000, replace = TRUE))
))
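
For a quick check of the loop without generating 10000 rows, here is a minimal sketch on a toy table. The table toy and its column names are made up for illustration, and the data.table package is assumed to be installed:

```r
library(data.table)

# Toy table; names and values are illustrative only
toy <- data.table(x = 1:3,
                  HasProductA = c("Yes", "No", "Yes"),
                  HasProductB = c("No", "No", "Yes"))

# Pick the columns to recode by name pattern
cols <- grep("^HasProduct", names(toy), value = TRUE)

# set() updates the table by reference, so no reassignment is needed
for (col in cols) set(toy, j = col, value = as.integer(toy[[col]] == "Yes"))

toy$HasProductA
# [1] 1 0 1
toy$HasProductB
# [1] 0 0 1
```

Here as.integer() is used in place of the unary +; both turn TRUE/FALSE into 1/0.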

