reshape2: dcast when there are multiple values for one cell, but keep these values
This can be done with dcast (here from data.table), though you need a row identifier.
library(data.table)
dcast(dt, HLA_Status + rowid(HLA_Status, variable) ~ variable)
# HLA_Status HLA_Status_1 CCL24 SPP1
#1: PC 1 5.698 2.698
#2: PC 2 89.457 9.457
#3: PC 3 78.230 8.230
#4: PP 1 9.645 23.120
#5: PP 2 56.320 36.320
#6: PP 3 7.268 17.268
data
dt <- fread(" HLA_Status variable value
PP CCL24 9.645
PP CCL24 56.32
PP CCL24 7.268
PC CCL24 5.698
PC CCL24 89.457
PC CCL24 78.23
PP SPP1 23.12
PP SPP1 36.32
PP SPP1 17.268
PC SPP1 2.698
PC SPP1 9.457
PC SPP1 8.23")
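As a rough sketch of what the row identifier does, the same reshape can be written with the identifier stored in an explicit column first. The column name rid is made up for this illustration, and the table is rebuilt by hand from the data above rather than read with fread. Without such an identifier, each HLA_Status/variable pair holds three values, and dcast would fall back to aggregating them (by default, counting them) instead of keeping them.

```r
library(data.table)

# Same values as the data above, built by hand
dt <- data.table(
  HLA_Status = rep(c("PP", "PC"), each = 3, times = 2),
  variable   = rep(c("CCL24", "SPP1"), each = 6),
  value      = c(9.645, 56.32, 7.268, 5.698, 89.457, 78.23,
                 23.12, 36.32, 17.268, 2.698, 9.457, 8.23)
)

# rid (hypothetical name) numbers the repeats within each
# HLA_Status/variable group: 1, 2, 3
dt[, rid := rowid(HLA_Status, variable)]

# Each HLA_Status + rid combination is now unique per variable,
# so every cell of the wide table receives exactly one value
wide <- dcast(dt, HLA_Status + rid ~ variable, value.var = "value")
print(wide)
```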
Reshaping from long to wide format in R, problem with variables re-naming
With pivot_wider(), you can supply a glue specification that uses the names_from columns (and the special .value) to create custom column names.
library(tidyr)
library(stringr)
df %>%
pivot_wider(
names_from = time,
names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
values_from = WSAS_01)
# # A tibble: 2 × 3
# ID WSAS_r1_01 WSAS_r2_01
# <int> <int> <int>
# 1 1 4 3
# 2 2 6 8
In the extended case where values_from selects multiple columns, this method also works:
df <- data.frame(
ID = rep(1:2, each = 2),
time = rep(1:2, 2),
WSAS_01 = c(4, 3, 6, 8),
WSAS_02 = c(1, 3, 5, 7)
)
df %>%
pivot_wider(
names_from = time,
names_glue = "{str_replace(.value, '(?=_)', str_c('_r', time))}",
values_from = starts_with("WSAS"))
# # A tibble: 2 × 5
# ID WSAS_r1_01 WSAS_r2_01 WSAS_r1_02 WSAS_r2_02
# <int> <dbl> <dbl> <dbl> <dbl>
# 1 1 4 3 1 3
# 2 2 6 8 5 7
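The glue expression above relies on a zero-width lookahead, which may be easier to follow in isolation. A minimal sketch, assuming a single value-column name and a single time value (the variable names value_name, time and new_name are only for illustration):

```r
library(stringr)

# '(?=_)' matches the empty position just before the first underscore,
# so str_replace() inserts the replacement there without removing anything
value_name <- "WSAS_01"
time <- 1

new_name <- str_replace(value_name, "(?=_)", str_c("_r", time))
print(new_name)  # "WSAS_r1_01", as in the column names above
```

Inside names_glue, .value plays the role of value_name and time comes from the names_from column.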
How to convert a long data frame to a wide data frame with duplicates/triplicates?
do.call(cbind, lapply(split(df, df$y), function(a)
setNames(object = data.frame(a$x,
row.names = paste0(as.character(a$z), 1:NROW(a))),
nm = a$y[1])))
# 0 1 2
#a1 1 5 3
#a2 7 11 9
#b3 4 2 6
#b4 10 8 12
Reshape data frame in R with id, time and one column with several data variables
Thanks for such a clear question! Rare for a new user. I'd recommend reshape2 over reshape.
GDP <- subset(GDP, (s_adj == "SWDA") & (unit == "MIO_EUR") & (time > "1989Q4"),
select = c("geo", "time", "indic_na", "value"))
# Making your data match your example
library(reshape2)
GDP_wide <- dcast(GDP, geo + time ~ indic_na, value.var = "value")
> head(GDP_wide)
geo time B11 B111 B112 ...
1 AT 1990 Q1 -64.3 -1407.1 1337.6
2 AT 1990 Q2 -37.2 -1432.0 1450.3
3 AT 1990 Q3 -39.4 -1457.4 1544.2
4 AT 1990 Q4 -78.7 -1546.7 1592.7
5 AT 1991 Q1 -140.2 -1771.9 1583.0
6 AT 1991 Q2 -183.7 -1938.5 1568.3
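For readers without the original GDP data, here is a small self-contained sketch of the same dcast call. The long-format data frame gdp_long is an assumption, rebuilt from the first two rows of the output above; the real data has more countries, quarters and indicators.

```r
library(reshape2)

# Hypothetical long-format input, rebuilt from the first two output rows
gdp_long <- data.frame(
  geo      = "AT",
  time     = rep(c("1990 Q1", "1990 Q2"), each = 3),
  indic_na = rep(c("B11", "B111", "B112"), times = 2),
  value    = c(-64.3, -1407.1, 1337.6, -37.2, -1432.0, 1450.3)
)

# geo + time identify a row; each indic_na level becomes a column
gdp_wide <- dcast(gdp_long, geo + time ~ indic_na, value.var = "value")
print(gdp_wide)
```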
From long to wide form without id.var?
I'm pretty sure this has been answered before. Anyway, unstack is convenient in this particular case with equal group size:
unstack(dat1, form = value ~ id)
# A B
# 1 1 5
# 2 2 6
# 3 3 7
# 4 4 8
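The input is not shown in the answer, so the following sketch assumes a dat1 that is consistent with the output above: two groups, A and B, with four values each.

```r
# Assumed input: two equally sized groups in long format
dat1 <- data.frame(
  id    = rep(c("A", "B"), each = 4),
  value = 1:8
)

# One column per id level; rows are matched by position within each group
wide <- unstack(dat1, form = value ~ id)
print(wide)
```

When the groups differ in size, unstack returns a list rather than a data frame, which is why the equal group size matters here.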
Collapse duplicated rows with different values in different columns using R
For the edited data and the revised requirements: because b comes before s in the alphabet, bigger_year is shown before smaller_year. In the real data, however, the years will sort correctly. If you still want to sort strings like these in reverse order, use sort(Year, decreasing = TRUE) instead of sort(Year).
df <- data.frame(ID = c('1','1','2', '2', '3','3'),
Year = c('smaller year.1', 'bigger year.1', 'bigger year.2', 'smaller year.2', 'same year.3', 'same year.3'),
V1 = c('a', 'b','c','d','e','f'),
V2 = c('g', 'h', 'i', 'j', 'k', 'l'),
Vn = c('n1', 'n2','n3','n4','n5','n6'))
library(tidyverse)
df %>% group_by(ID) %>% mutate(Year = sort(Year)) %>%
mutate(rid = row_number()) %>%
pivot_wider(id_cols = ID, names_from = rid, values_from = c(Year:Vn), names_sep = '')
#> # A tibble: 3 x 9
#> # Groups: ID [3]
#> ID Year1 Year2 V11 V12 V21 V22 Vn1 Vn2
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 bigger year.1 smaller year.1 a b g h n1 n2
#> 2 2 bigger year.2 smaller year.2 c d i j n3 n4
#> 3 3 same year.3 same year.3 e f k l n5 n6
Created on 2021-06-19 by the reprex package (v2.0.0)
library(tidyverse)
df %>% group_by(ID) %>% mutate(rid = row_number()) %>%
pivot_wider(id_cols = ID, names_from = rid, values_from = c(Year:Variable_n), names_sep = '')
# A tibble: 3 x 9
# Groups: ID [3]
ID Year1 Year2 Variable_a1 Variable_a2 Variable_b1 Variable_b2 Variable_n1 Variable_n2
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 1 smaller year.1 bigger year.1 va11 va12 vb11 vb12 vn11 vn12
2 2 bigger year.2 smaller year.2 va21 va22 vb21 vb22 vn21 vn22
3 3 same year.3 same year.3 va31 va32 vb31 vb32 vn31 vn32
Do you mean this?
df %>% group_by(ID) %>% arrange(desc(Year)) %>% mutate(rid = row_number()) %>%
pivot_wider(id_cols = ID, names_from = rid, values_from = c(Year:Variable_n), names_sep = '')
# A tibble: 3 x 9
# Groups: ID [3]
ID Year1 Year2 Variable_a1 Variable_a2 Variable_b1 Variable_b2 Variable_n1 Variable_n2
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 2 smaller year.2 bigger year.2 va22 va21 vb22 vb21 vn22 vn21
2 1 smaller year.1 bigger year.1 va11 va12 vb11 vb12 vn11 vn12
3 3 same year.3 same year.3 va31 va32 vb31 vb32 vn31 vn32