How to change yes/no in a column to 1 and 0
To map "N", "n", "no", and "NO" to 0 and every other value (such as "Yes") to 1, take the first character with substr(), convert it to upper case with toupper(), compare it with "N" using !=, and coerce the logical result to an integer with as.integer():
library(dplyr)
clean %>%
  mutate(flight = as.integer(toupper(substr(flight, 1, 1)) != "N"))
NOTE: This assumes the column holds only yes/no-style values such as "Yes", "NO", "no", "N", "n". Any non-missing value that does not start with "N" or "n" is coded as 1.
Data:
clean <- tibble(flight = c("No", "Yes", "YES", "Y", "no",
                           "No", "NO", "Y", "n", "y", "No"))
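As a quick sanity check, the same first-letter rule can be run in base R on the sample values. This is only a sketch: the vector name flight and the helper yes_no_to_int() are illustrative, and the stricter variant (returning NA for anything that is not recognisably yes or no) is an assumption about what you might want, not part of the original answer.

```r
# Sample values from the data above, as a plain character vector
flight <- c("No", "Yes", "YES", "Y", "no",
            "No", "NO", "Y", "n", "y", "No")

# Same rule as the mutate() call: first letter "N"/"n" -> 0, anything else -> 1
flight01 <- as.integer(toupper(substr(flight, 1, 1)) != "N")
print(flight01)

# Hypothetical stricter helper: values starting with neither Y nor N become NA
yes_no_to_int <- function(x) {
  first <- toupper(substr(trimws(x), 1, 1))
  out <- rep(NA_integer_, length(x))
  out[first %in% "Y"] <- 1L
  out[first %in% "N"] <- 0L
  out
}
print(yes_no_to_int(c("yes", "No", "maybe", NA)))
```

The stricter helper may be preferable when the column could contain stray entries, since the one-line rule silently codes every unexpected non-missing value as 1.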
r program changing yes/no variable to 1/0 - variable 'medal' is not a factor
Apart from the obvious typo...
How can I change the yes/no to 0/1?
You need to convert medal to a factor:
sport$medal <- factor(sport$medal, levels = c("yes", "no"))
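Here levels = c("yes", "no") makes "yes" the first (reference) level. If you omit levels, the default ordering is alphabetical: "n" comes ahead of "y", so "no" is the first level, and the dummy coding used by modelling functions gives 0 for "no" and 1 for "yes".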
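The effect of level order can be checked directly. A minimal sketch, assuming a made-up vector medal in place of sport$medal; subtracting 1 from the integer codes is just one way to view the resulting 0/1 coding.

```r
# Hypothetical example values standing in for sport$medal
medal <- c("yes", "no", "no", "yes")

# Default: levels are sorted alphabetically, so "no" is level 1 and "yes" is level 2
f_default <- factor(medal)
print(levels(f_default))
default01 <- as.integer(f_default) - 1L    # "no" -> 0, "yes" -> 1

# Explicit levels c("yes", "no"): "yes" is level 1 and "no" is level 2
f_explicit <- factor(medal, levels = c("yes", "no"))
explicit01 <- as.integer(f_explicit) - 1L  # "yes" -> 0, "no" -> 1

print(data.frame(medal, default01, explicit01))
```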
How to translate values in a column to yes and no values for a multiple regression in R
I would create a new column - see two options below.
(NB: in lm() you don't have to write SB_xlsx13$ each time you add a covariate if you pass the data frame once as the data = argument. This will also make your output easier to read.)
Tidyverse approach, using mutate() and case_when():
library(dplyr)
SB_xlsx13 <- SB_xlsx13 %>%
  mutate(dnr_d3 = case_when(dnrday <= 3 ~ "yes",
                            dnrday > 3 ~ "no",
                            TRUE ~ NA_character_))
MLR_3 <- lm(hospdead ~ dzclass + age + sex + num.co + sps + dnr_d3,
            data = SB_xlsx13)
Base R approach:
# Create the column first so rows with a missing dnrday stay NA
SB_xlsx13$dnr_d3 <- NA_character_
SB_xlsx13$dnr_d3[SB_xlsx13$dnrday <= 3] <- "yes"
SB_xlsx13$dnr_d3[SB_xlsx13$dnrday > 3] <- "no"
MLR_4 <- lm(hospdead ~ dzclass + age + sex + num.co + sps + dnr_d3,
            data = SB_xlsx13)
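The recoding rule can be tried on a tiny data frame before touching the real data. This is a sketch under stated assumptions: toy is a made-up stand-in for SB_xlsx13 with only the dnrday column, and ifelse() is used here as a compact base R alternative to the two approaches above.

```r
# Hypothetical stand-in for SB_xlsx13 with only the column needed here
toy <- data.frame(dnrday = c(1, 3, 5, NA))

# dnrday <= 3 -> "yes", dnrday > 3 -> "no", missing dnrday -> NA
toy$dnr_d3 <- ifelse(toy$dnrday <= 3, "yes", "no")

print(toy)
print(table(toy$dnr_d3, useNA = "ifany"))
```

Tabulating the new column with useNA = "ifany" makes it easy to see how many rows ended up without a value.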
Need to change data.table columns' value from Yes ,No to 1,0
You could do this with the set() function in data.table:
1: Create a vector of the column names in which you want to change Yes to 1 and No to 0 (as @Frank said in the comments):
cols <- grep("^HasProduct", names(DT), value = TRUE)
2: Change the values with the following for (...) set(...) implementation (as rightly pointed out by @Arun in the comments, you can also use as.integer() instead of the unary +):
for (col in cols) set(DT, j = col, value = +(DT[[col]] == "Yes"))
This results in:
> DT
        x HasProduct1 HasProduct2 HasProduct3 HasProduct4 HasProduct5 HasProduct6 HasProduct7 HasProduct8 HasProduct9 HasProduct10
    1: 23           0           1           0           1           0           0           1           0           0            0
    2: 74           1           0           1           1           0           1           1           1           1            1
    3: 35           1           1           0           0           0           1           1           1           0            1
    4:  7           1           1           1           1           0           1           1           0           0            1
    5: 92           0           1           1           1           1           1           0           1           1            0
   ---
 9996: 56           0           0           1           0           1           0           0           0           1            0
 9997: 59           1           0           1           1           0           1           1           1           1            0
 9998: 85           0           1           0           1           1           1           1           1           1            1
 9999: 93           1           0           0           0           0           0           0           0           1            1
10000: 29           0           1           1           0           0           1           0           1           1            1
Timings:
user system elapsed
0.007 0.000 0.007
Used data:
library(data.table)
set.seed(654)
product <- paste0("HasProduct", 1:10)
# x (100 values) is recycled to match the 10,000 rows of the Yes/No columns
DT <- as.data.table(data.frame(x = sample(1:100),
  sapply(product, function(p) sample(c("Yes", "No"), 10000, replace = TRUE))))
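For a version that is easy to inspect by eye, the two steps can be run on a three-row table. This is a minimal sketch, assuming the data.table package is installed; DT_small and its values are made up for illustration, and as.integer() is used in place of the unary + as suggested in the comments.

```r
library(data.table)

# Hypothetical small table; column names follow the HasProduct pattern above
DT_small <- data.table(
  x = c(23L, 74L, 35L),
  HasProduct1 = c("No", "Yes", "Yes"),
  HasProduct2 = c("Yes", "No", "Yes")
)

# Step 1: select the columns to recode by name pattern
cols <- grep("^HasProduct", names(DT_small), value = TRUE)

# Step 2: overwrite each column by reference:
# "Yes" -> 1L, any other non-missing value -> 0L
for (col in cols) {
  set(DT_small, j = col, value = as.integer(DT_small[[col]] == "Yes"))
}

print(DT_small)
```

Because set() modifies the table by reference, no reassignment of DT_small is needed after the loop.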