Change Rows into Columns in R With Values Yes/No (1/0)

Change rows into columns in R with values yes/no (1/0)

This solution works for your example...

library(tidyverse)
library(sjmisc)

d <- tibble(
IDBILL = c(111, 111, 112, 113, 113),
IDPRODUCT = c("ABC123", "ABC124", "BCH134", "ABC123", "GDF345"),
)

d %>%
tidyr::spread(key = IDPRODUCT, value = IDPRODUCT) %>%
sjmisc::rec_if(is.character, rec = "NA=0;else=1", append = F) %>%
tibble::add_column(IDBILL = unique(d$IDBILL), .before = 1)

#> # A tibble: 3 x 5
#> IDBILL ABC123_r ABC124_r BCH134_r GDF345_r
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 111 1 1 0 0
#> 2 112 0 0 1 0
#> 3 113 1 0 0 1

Step-by-step:

1) Spread the variable accross columns, but key and value are identical in this case, because of duplicated values: tidyr::spread(key = IDPRODUCT, value = IDPRODUCT) would throw this error: Error: Duplicate identifiers for rows (1, 2), (4, 5)

d %>% 
tidyr::spread(key = IDPRODUCT, value = IDPRODUCT)
#> # A tibble: 3 x 5
#> IDBILL ABC123 ABC124 BCH134 GDF345
#> <dbl> <chr> <chr> <chr> <chr>
#> 1 111 ABC123 ABC124 <NA> <NA>
#> 2 112 <NA> <NA> BCH134 <NA>
#> 3 113 ABC123 <NA> <NA> GDF345

2) Recode NA into 0, and all other values (else) into 1 - but only for character vectors (else, column IDBILL would be recoded as well). Note that rec_if(), just like e.g. select_if(), drops all other columns where .predicate does not apply.

d %>% 
tidyr::spread(key = IDPRODUCT, value = IDPRODUCT) %>%
sjmisc::rec_if(is.character, rec = "NA=0;else=1", append = F)
#> # A tibble: 3 x 4
#> ABC123_r ABC124_r BCH134_r GDF345_r
#> <dbl> <dbl> <dbl> <dbl>
#> 1 1 1 0 0
#> 2 0 0 1 0
#> 3 1 0 0 1

3) ... so we need to add back the ID column:

d %>% 
tidyr::spread(key = IDPRODUCT, value = IDPRODUCT) %>%
sjmisc::rec_if(is.character, rec = "NA=0;else=1", append = F) %>%
tibble::add_column(IDBILL = unique(d$IDBILL), .before = 1)
#> # A tibble: 3 x 5
#> IDBILL ABC123_r ABC124_r BCH134_r GDF345_r
#> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 111 1 1 0 0
#> 2 112 0 0 1 0
#> 3 113 1 0 0 1

Need to change data.table columns' value from Yes ,No to 1,0

You could do this with the set functionality in data.table:

1: Create a vector of columnnames in which you want to change the Yes to 1 and the No to 0 (like @Frank said in the comments)

cols <- grep("^HasProduct", names(DT), value = TRUE)

2: Change the values with the following for(...) set(...) implementation (as rightfully pointed out by @Arun in the comments, you can also use as.integer instead of just +):

for (col in cols) set(DT, j = col, value = +(DT[[col]] == "Yes"))

this results in:

> DT
x HasProduct1 HasProduct2 HasProduct3 HasProduct4 HasProduct5 HasProduct6 HasProduct7 HasProduct8 HasProduct9 HasProduct10
1: 23 0 1 0 1 0 0 1 0 0 0
2: 74 1 0 1 1 0 1 1 1 1 1
3: 35 1 1 0 0 0 1 1 1 0 1
4: 7 1 1 1 1 0 1 1 0 0 1
5: 92 0 1 1 1 1 1 0 1 1 0
---
9996: 56 0 0 1 0 1 0 0 0 1 0
9997: 59 1 0 1 1 0 1 1 1 1 0
9998: 85 0 1 0 1 1 1 1 1 1 1
9999: 93 1 0 0 0 0 0 0 0 1 1
10000: 29 0 1 1 0 0 1 0 1 1 1

Timings:

   user  system elapsed 
0.007 0.000 0.007

Used data:

set.seed(654)
product <- c("HasProduct1","HasProduct2","HasProduct3","HasProduct4","HasProduct5","HasProduct6","HasProduct7","HasProduct8","HasProduct9","HasProduct10")
DT <- as.data.table(data.frame(x=sample(1:100),sapply(product,function(x){x <-sample(c("Yes","No"),10000,replace = T)})))

Convert row values to column values with 1 for presence and 0 for absence in R dataframe

You can use dcast from data.table here:

dcast(dt, ID ~ CODE, fun.aggregate = function(x) 1L, fill = 0L)
ID F17 T65 Z49
1: 1245 0 0 1
2: 5567 1 0 1
3: 6623 0 1 1

data

library(data.table)
dt <- fread("ID Date CODE
5567 2012-09-02 F17
5567 2012-10-15 F17
5567 2012-11-11 Z49
1245 2010-09-02 Z49
6623 2012-10-15 T65
6623 2012-11-11 Z49")

Converting rows into columns based on the values in the rows, in R

You may try

library(tidyverse)

df %>%
rowwise %>%
mutate(reservation_main = str_split(reservation,'_' ,simplify = T)[1],
reservation_no = paste0('_',str_split(reservation,'_' ,simplify = T)[2])) %>%
select(id, response_id, reservation_main, reservation_no) %>%
pivot_wider(names_from = reservation_no, values_from = response_id)


id reservation_main `_1` `_2` `_3`
<dbl> <chr> <dbl> <dbl> <dbl>
1 31100 A 1 1 0
2 31100 B 1 1 0
3 31100 C 1 0 0

How to translate values in a column to yes and no values for a multiple regression in R

I would create a new column - see two options below.

(NB in lm() you don't have to specify SB_xlsx13$ each time you add a covariate if you list it as the data = argument once! This will make your output easier to read.)

Tidyverse approach: mutate and case_when:

library(dplyr)
SB_xlsx13 <- SB_xlsx13 %>%
mutate(dnr_d3 = case_when(dnrday <= 3 ~ "yes",
dnrday > 3 ~ "no",
TRUE ~ NA_character_))

MLR_3 <- lm(hospdead ~ dzclass + age + sex + num.co + sps + dnr_d3,
data = SB_xlsx13)

Base R approach:

SB_xlsx13$dnr_d3[SB_xlsx13$dnrday <= 3] <- "yes"
SB_xlsx13$dnr_d3[SB_xlsx13$dnrday > 3] <- "no"
MLR_4 <- lm(hospdead ~ dzclass + age + sex + num.co + sps + dnr_d3,
data = SB_xlsx13)


Related Topics



Leave a reply



Submit