Reshape Multi Id Repeated Variable Readings from Long to Wide

reshape2: dcast when there are multiple values for one cell, but keep these values

This can be done with dcast (here from data.table) though you need a row identifier.

library(data.table)
dcast(dt, HLA_Status + rowid(HLA_Status, variable) ~ variable)
#    HLA_Status HLA_Status_1  CCL24   SPP1
# 1:         PC            1  5.698  2.698
# 2:         PC            2 89.457  9.457
# 3:         PC            3 78.230  8.230
# 4:         PP            1  9.645 23.120
# 5:         PP            2 56.320 36.320
# 6:         PP            3  7.268 17.268

data

dt <- fread("    HLA_Status    variable      value
PP CCL24 9.645
PP CCL24 56.32
PP CCL24 7.268
PC CCL24 5.698
PC CCL24 89.457
PC CCL24 78.23
PP SPP1 23.12
PP SPP1 36.32
PP SPP1 17.268
PC SPP1 2.698
PC SPP1 9.457
PC SPP1 8.23")
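
To make the role of the row identifier more concrete, here is a minimal sketch on a cut-down copy of the table above. The names `mini`, `rep_id` and `wide` are made up for illustration, and `value.var` is spelled out so dcast does not have to guess the value column. rowid() numbers the repeats within each HLA_Status/variable combination, and that counter is what keeps the repeated readings on separate rows instead of collapsing them.

```r
library(data.table)

# Hypothetical five-row subset of the long table above
mini <- data.table(
  HLA_Status = c("PP", "PP", "PC", "PP", "PC"),
  variable   = c("CCL24", "CCL24", "CCL24", "SPP1", "SPP1"),
  value      = c(9.645, 56.32, 5.698, 23.12, 2.698)
)

# rowid() counts occurrences within each HLA_Status/variable combination
mini[, rep_id := rowid(HLA_Status, variable)]
print(mini)

# One output row per HLA_Status/rep_id pair; combinations with no reading become NA
wide <- dcast(mini, HLA_Status + rep_id ~ variable, value.var = "value")
print(wide)
```

Without the counter on the left-hand side of the formula, each HLA_Status/variable cell would hold several values and dcast would fall back to aggregating them.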

Reshaping from long to wide format in R, problem with variables re-naming

With pivot_wider(), you can supply a glue specification that uses the names_from columns (and the special .value) to create custom column names.

library(tidyr)
library(stringr)

df %>%
  pivot_wider(
    names_from = time,
    names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
    values_from = WSAS_01)

# # A tibble: 2 × 3
# ID WSAS_r1_01 WSAS_r2_01
# <int> <int> <int>
# 1 1 4 3
# 2 2 6 8

In the extended case where values_from covers multiple columns, this method also works:

df <- data.frame(
  ID = rep(1:2, each = 2),
  time = rep(1:2, 2),
  WSAS_01 = c(4, 3, 6, 8),
  WSAS_02 = c(1, 3, 5, 7)
)

df %>%
  pivot_wider(
    names_from = time,
    names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
    values_from = starts_with("WSAS"))

# # A tibble: 2 × 5
# ID WSAS_r1_01 WSAS_r2_01 WSAS_r1_02 WSAS_r2_02
# <int> <dbl> <dbl> <dbl> <dbl>
# 1 1 4 3 1 3
# 2 2 6 8 5 7
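
The glue expression is easier to follow when the regular expression is tried on a bare string. The sketch below uses base R's sub() with perl = TRUE as a stand-in for str_replace(), and the variables `value_name`, `time_values` and `new_names` are made up for illustration. The lookahead `(?=_)` matches the empty position just before the first underscore, so the replacement is inserted there rather than overwriting anything.

```r
# Column name as pivot_wider() would supply it through .value
value_name <- "WSAS_01"
time_values <- 1:2

# Insert "_r<time>" at the zero-width position in front of the first underscore
new_names <- vapply(
  time_values,
  function(tm) sub("(?=_)", paste0("_r", tm), value_name, perl = TRUE),
  character(1)
)
print(new_names)
```

Only the first underscore is affected, which is why the numeric suffix of the original column name stays at the end.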

How to convert a long data frame to a wide data frame with duplicates/triplicates?

do.call(cbind, lapply(split(df, df$y), function(a)
  setNames(object = data.frame(a$x,
                               row.names = paste0(as.character(a$z), 1:NROW(a))),
           nm = a$y[1])))
# 0 1 2
#a1 1 5 3
#a2 7 11 9
#b3 4 2 6
#b4 10 8 12
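
The one-liner above packs several steps together, so here is an unrolled sketch of the same idea. The original df is not shown in the answer; the data frame below is an assumption, picked only so that the result has the same shape as the printed output, and the names `pieces`, `cols` and `wide` are illustrative.

```r
# Hypothetical input: x holds the values, y the target column, z the row label
df <- data.frame(
  x = c(1, 7, 5, 11, 3, 9, 4, 10, 2, 8, 6, 12),
  y = c(0, 0, 1, 1, 2, 2, 0, 0, 1, 1, 2, 2),
  z = rep(c("a", "b"), each = 6)
)

# Step 1: one data frame per distinct value of y
pieces <- split(df, df$y)

# Step 2: turn each piece into a single column named after its y value,
# with row names built from z plus the position within the piece
cols <- lapply(pieces, function(a) {
  out <- data.frame(a$x, row.names = paste0(a$z, seq_len(nrow(a))))
  names(out) <- as.character(a$y[1])
  out
})

# Step 3: bind the single-column pieces side by side
wide <- do.call(cbind, cols)
print(wide)
```

This relies on every y group having the same number of rows in the same order; cbind() does not match rows by name.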

Reshape data frame in R with id, time and one column with several data variables

Thanks for such a clear question! Rare for a new user. I'd recommend reshape2 over reshape.

# Making your data match your example
GDP <- subset(GDP, (s_adj == "SWDA") & (unit == "MIO_EUR") & (time > "1989Q4"),
              select = c("geo", "time", "indic_na", "value"))

library(reshape2)
GDP_wide <- dcast(GDP, geo + time ~ indic_na, value.var = "value")

> head(GDP_wide)
geo time B11 B111 B112 ...
1 AT 1990 Q1 -64.3 -1407.1 1337.6
2 AT 1990 Q2 -37.2 -1432.0 1450.3
3 AT 1990 Q3 -39.4 -1457.4 1544.2
4 AT 1990 Q4 -78.7 -1546.7 1592.7
5 AT 1991 Q1 -140.2 -1771.9 1583.0
6 AT 1991 Q2 -183.7 -1938.5 1568.3

From long to wide form without id.var?

I'm pretty sure this has been answered before. Anyway, unstack is convenient in this particular case with equal group size:

unstack(dat1, form = value ~ id)
# A B
# 1 1 5
# 2 2 6
# 3 3 7
# 4 4 8
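
Since dat1 itself is not shown, here is a small self-contained sketch with an assumed dat1 that reproduces the output above, followed by what happens when the groups are not the same size. The names `wide`, `uneven` and `ragged` are illustrative only.

```r
# Assumed input: two groups of four readings each
dat1 <- data.frame(
  id    = rep(c("A", "B"), each = 4),
  value = 1:8
)

# Equal group sizes: unstack() returns a data frame with one column per id
wide <- unstack(dat1, form = value ~ id)
print(wide)

# Unequal group sizes: the result stays a plain list, not a data frame
uneven <- dat1[-8, ]
ragged <- unstack(uneven, form = value ~ id)
str(ragged)
```

So the approach is convenient for balanced data, but a ragged result needs padding or a different reshaping tool before it can be used as a table.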

Collapse duplicated rows with different values in different columns using R

This is for the edited data, per the revised requirements. Because "b" sorts before "s" alphabetically, bigger_year appears before smaller_year here; in the real data, the years will sort correctly. If you still want strings like these in descending order, use sort(Year, decreasing = TRUE) instead of sort(Year) (desc() returns a numeric ranking rather than the strings themselves, so it is not suitable inside sort()).

df <- data.frame(ID = c('1', '1', '2', '2', '3', '3'),
                 Year = c('smaller year.1', 'bigger year.1', 'bigger year.2', 'smaller year.2', 'same year.3', 'same year.3'),
                 V1 = c('a', 'b', 'c', 'd', 'e', 'f'),
                 V2 = c('g', 'h', 'i', 'j', 'k', 'l'),
                 Vn = c('n1', 'n2', 'n3', 'n4', 'n5', 'n6'))

library(tidyverse)

df %>% group_by(ID) %>% mutate(Year = sort(Year)) %>%
  mutate(rid = row_number()) %>%
  pivot_wider(id_cols = ID, names_from = rid, values_from = c(Year:Vn), names_sep = '')

#> # A tibble: 3 x 9
#> # Groups: ID [3]
#> ID Year1 Year2 V11 V12 V21 V22 Vn1 Vn2
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 bigger year.1 smaller year.1 a b g h n1 n2
#> 2 2 bigger year.2 smaller year.2 c d i j n3 n4
#> 3 3 same year.3 same year.3 e f k l n5 n6

Created on 2021-06-19 by the reprex package (v2.0.0)



library(tidyverse)

df %>% group_by(ID) %>% mutate(rid = row_number()) %>%
pivot_wider(id_cols = ID, names_from = rid, values_from = c(Year:Variable_n), names_sep = '')

# A tibble: 3 x 9
# Groups: ID [3]
ID Year1 Year2 Variable_a1 Variable_a2 Variable_b1 Variable_b2 Variable_n1 Variable_n2
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 smaller year.1 bigger year.1 va11 va12 vb11 vb12 vn11 vn12
2 2 bigger year.2 smaller year.2 va21 va22 vb21 vb22 vn21 vn22
3 3 same year.3 same year.3 va31 va32 vb31 vb32 vn31 vn32

Do you mean this?


df %>% group_by(ID) %>% arrange(desc(Year)) %>% mutate(rid = row_number()) %>%
pivot_wider(id_cols = ID, names_from = rid, values_from = c(Year:Variable_n), names_sep = '')

# A tibble: 3 x 9
# Groups: ID [3]
ID Year1 Year2 Variable_a1 Variable_a2 Variable_b1 Variable_b2 Variable_n1 Variable_n2
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 2 smaller year.2 bigger year.2 va22 va21 vb22 vb21 vn22 vn21
2 1 smaller year.1 bigger year.1 va11 va12 vb11 vb12 vn11 vn12
3 3 same year.3 same year.3 va31 va32 vb31 vb32 vn31 vn32

