Reshaping wide to long with multiple values columns
reshape does this with the appropriate arguments.
varying lists the columns which exist in the wide format, but are split into multiple rows in the long format. v.names gives the long-format equivalents. Between the two, a mapping is created.
From ?reshape:
Also, guessing is not attempted if v.names is given explicitly. Notice that the order of variables in varying is like x.1,y.1,x.2,y.2.
Given these varying and v.names arguments, reshape is smart enough to see that I've specified that the index is before the dot here (i.e., order 1.x, 1.y, 2.x, 2.y). Note that the original data has the columns in this order, so we can specify varying=2:5 for this example data, but that is not safe in general.
Given the values of times and v.names, reshape splits the varying columns on a . character (the default sep argument) to create the columns in the output.
times specifies values that are to be used in the created var column, and v.names are pasted onto these values to get column names in the wide format for mapping to the result.
Finally, idvar is specified to be the sbj column, which identifies individual records in the wide format (thanks @thelatemail).
reshape(dw, direction='long',
        varying=c('f1.avg', 'f1.sd', 'f2.avg', 'f2.sd'),
        timevar='var',
        times=c('f1', 'f2'),
        v.names=c('avg', 'sd'),
        idvar='sbj')
## sbj blabla var avg sd
## A.f1 A bA f1 10 6
## B.f1 B bB f1 12 5
## C.f1 C bC f1 20 7
## D.f1 D bD f1 22 8
## A.f2 A bA f2 50 10
## B.f2 B bB f2 70 11
## C.f2 C bC f2 20 8
## D.f2 D bD f2 22 9
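The input dw is not shown, so here is a minimal sketch that rebuilds it by hand to make the call reproducible. The values are read back from the printed result; the column order (blabla last, so that the varying columns sit at positions 2:5) is an assumption based on the varying=2:5 remark.

```r
# dw is reconstructed from the printed result; the column order is assumed.
dw <- data.frame(
  sbj    = c("A", "B", "C", "D"),
  f1.avg = c(10, 12, 20, 22),
  f1.sd  = c(6, 5, 7, 8),
  f2.avg = c(50, 70, 20, 22),
  f2.sd  = c(10, 11, 8, 9),
  blabla = c("bA", "bB", "bC", "bD")
)

long <- reshape(dw, direction = "long",
                varying = c("f1.avg", "f1.sd", "f2.avg", "f2.sd"),
                timevar = "var",
                times   = c("f1", "f2"),
                v.names = c("avg", "sd"),
                idvar   = "sbj")

# One row per (sbj, var) pair; avg and sd are now ordinary columns.
print(long)
```

Naming the varying columns explicitly, rather than using positions, keeps the call working if the columns of dw are ever reordered.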
Data frame from wide to long with multiple variables and ids R
Answer already exists here: https://stackoverflow.com/a/12466668/2371031
e.g.,
set.seed(123)
# Example wide data: three measures, each recorded at two time points
wide_df <- data.frame(participant_id = LETTERS[1:12],
                      judgment_1 = round(rnorm(12) * 100),
                      correct_1 = round(rnorm(12) * 100),
                      text_id_1 = sample(1:12, 12, replace = FALSE),
                      judgment_2 = round(rnorm(12) * 100),
                      correct_2 = round(rnorm(12) * 100),
                      text_id_2 = sample(13:24, 12, replace = FALSE))
# Each list element of varying holds the column positions of one measure
dl <- reshape(data = wide_df,
              idvar = "participant_id",
              varying = list(judgment = c(2, 5), correct = c(3, 6), text_id = c(4, 7)),
              direction = "long",
              v.names = c("judgment", "correct", "text_id"),
              sep = "_")
Result:
participant_id time judgment correct text_id
A.1 A 1 -56 40 4
B.1 B 1 -23 11 10
C.1 C 1 156 -56 1
D.1 D 1 7 179 12
E.1 E 1 13 50 7
F.1 F 1 172 -197 11
G.1 G 1 46 70 9
H.1 H 1 -127 -47 2
I.1 I 1 -69 -107 8
J.1 J 1 -45 -22 3
K.1 K 1 122 -103 5
L.1 L 1 36 -73 6
A.2 A 2 43 -127 17
B.2 B 2 -30 217 14
C.2 C 2 90 121 22
D.2 D 2 88 -112 15
E.2 E 2 82 -40 13
F.2 F 2 69 -47 19
G.2 G 2 55 78 24
H.2 H 2 -6 -8 20
I.2 I 2 -31 25 21
J.2 J 2 -38 -3 16
K.2 K 2 -69 -4 23
L.2 L 2 -21 137 18
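Positional indices such as c(2,5) break silently if the columns are reordered. As a sketch of one alternative, the varying list can be built from the column names instead. The data frame wide_small and the helper names measures and varying_by_name are hypothetical; the values are the first three rows of the result above.

```r
# Hypothetical miniature of wide_df (first three participants only).
wide_small <- data.frame(
  participant_id = c("A", "B", "C"),
  judgment_1 = c(-56, -23, 156),
  correct_1  = c(40, 11, -56),
  text_id_1  = c(4, 10, 1),
  judgment_2 = c(43, -30, 90),
  correct_2  = c(-127, 217, 121),
  text_id_2  = c(17, 14, 22)
)

measures <- c("judgment", "correct", "text_id")

# For each measure, collect its columns by name, e.g. "judgment_1", "judgment_2".
varying_by_name <- lapply(measures, function(m) {
  grep(paste0("^", m, "_[0-9]+$"), names(wide_small), value = TRUE)
})

dl_small <- reshape(wide_small, direction = "long",
                    idvar   = "participant_id",
                    varying = varying_by_name,
                    v.names = measures)
print(dl_small)
```

Because varying is a list and v.names is given, no name guessing is involved, so the extra underscore in text_id_1 causes no trouble.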
Reshaping from long to wide with multiple columns
pivot_wider may be easier:
library(dplyr)
library(stringr)
library(tidyr)
df %>%
  mutate(time = str_c('t', time)) %>%
  pivot_wider(names_from = time, values_from = c(age, height))
Output:
# A tibble: 2 × 5
PIN age_t1 age_t2 height_t1 height_t2
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1001 84 86 58 58
2 1002 22 24 60 62
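The input df is not shown. As a sketch, it can be rebuilt from the printed output (the layout below is an assumption), and the t prefix can be added through pivot_wider's names_glue argument instead of a separate mutate step:

```r
library(tidyr)

# df is reconstructed from the printed output above (assumed layout).
df <- data.frame(
  PIN    = c(1001, 1001, 1002, 1002),
  time   = c(1, 2, 1, 2),
  age    = c(84, 86, 22, 24),
  height = c(58, 58, 60, 62)
)

# names_glue builds each new column name from the value column and the time value.
wide <- pivot_wider(df,
                    names_from  = time,
                    values_from = c(age, height),
                    names_glue  = "{.value}_t{time}")
print(wide)
```

This should give the same column names as the str_c version while leaving the time column itself untouched.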
With reshape from base R, it may need a sequence column:
out <- reshape(transform(df, rn = ave(seq_along(PIN), PIN, FUN = seq_along)),
               idvar = "PIN", direction = "wide", timevar = "time", sep = "_")
out[!startsWith(names(out), 'rn_')]
PIN age_1 height_1 age_2 height_2
1 1001 84 58 86 58
3 1002 22 60 24 62
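If time already numbers the observations within each PIN, the helper rn column may not be needed. A minimal sketch under that assumption, with df rebuilt from the printed output:

```r
# df is reconstructed from the printed output above (assumed layout).
df <- data.frame(
  PIN    = c(1001, 1001, 1002, 1002),
  time   = c(1, 2, 1, 2),
  age    = c(84, 86, 22, 24),
  height = c(58, 58, 60, 62)
)

# time serves directly as timevar; every other non-id column is spread.
out <- reshape(df, idvar = "PIN", timevar = "time",
               direction = "wide", sep = "_")
print(out)
```

When the data has no usable time column, generating one with ave() as in the answer above is still the way to go.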
Wide to Long format with multiple variables?
With gather from tidyr:
library(dplyr)
library(tidyr)
df %>%
  gather(Correct, CorrectValue, Correct1:Correct3) %>%
  gather(Percent, PercentValue, Percent1:Percent3) %>%
  mutate_at(vars(Correct, Percent), ~sub("[[:alpha:]]+", "", .))
Result:
Subject Day Correct CorrectValue Percent PercentValue
1 1 1 1 1 1 50
2 2 1 1 1 1 75
3 3 1 1 0 1 70
4 4 1 1 0 1 80
5 5 1 1 1 1 90
6 1 2 1 0 1 30
7 2 2 1 0 1 45
8 3 2 1 1 1 50
9 4 2 1 1 1 60
10 5 2 1 1 1 80
11 1 1 2 0 1 50
12 2 1 2 0 1 75
13 3 1 2 1 1 70
14 4 1 2 1 1 80
15 5 1 2 1 1 90
16 1 2 2 1 1 30
17 2 2 2 0 1 45
18 3 2 2 1 1 50
19 4 2 2 0 1 60
20 5 2 2 1 1 80
21 1 1 3 1 1 50
22 2 1 3 0 1 75
23 3 1 3 1 1 70
24 4 1 3 0 1 80
25 5 1 3 1 1 90
...
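Note that the two chained gather calls pair every Correct column with every Percent column, as the result above shows. If the goal is instead one row per Subject, Day and trial number, a sketch with pivot_longer and the special ".value" name may be closer. The data below is hypothetical (made-up values in the same shape as the question), and the column name Trial is my own choice.

```r
library(tidyr)

# Hypothetical data in the same shape as the question; values are made up.
df <- data.frame(
  Subject  = c(1, 2),
  Day      = c(1, 1),
  Correct1 = c(1, 1), Correct2 = c(0, 0), Correct3 = c(1, 0),
  Percent1 = c(50, 75), Percent2 = c(55, 70), Percent3 = c(60, 65)
)

# ".value" keeps the letter part (Correct / Percent) as output columns,
# while the digit part goes into the new Trial column.
long <- pivot_longer(df,
                     cols          = -c(Subject, Day),
                     names_to      = c(".value", "Trial"),
                     names_pattern = "([A-Za-z]+)([0-9]+)")
print(long)
```

Here each input row yields three output rows (one per trial) rather than nine.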
R- How to reshape Long to Wide with multiple variables/columns
Some variables are better combined: several columns can be passed to names_from together.
df %>%
  pivot_wider(id_cols = c(UserID, Full.Name, DOB, EncounterID),
              names_from = c(QuestionID, QName, labelnospaces),
              values_from = responses)
UserID Full.Name DOB EncounterID `505_Intro_Were you given any info?` `506_Care_By using this service..`
<int> <chr> <chr> <int> <chr> <chr>
1 1 John Smith 1-1-90 13 yes yes
2 2 Jane Doe 2-2-80 14 no no
`507_Out_How satisfied are you?`
<chr>
1 vsat
2 unsat
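For a runnable version, here is a sketch with the long-format input rebuilt from the printed output. The layout of df is an assumption, and the truncated label "By using this service.." is kept exactly as it appears above.

```r
library(tidyr)

# Long-format input rebuilt from the printed output (assumed layout).
df <- data.frame(
  UserID        = rep(1:2, each = 3),
  Full.Name     = rep(c("John Smith", "Jane Doe"), each = 3),
  DOB           = rep(c("1-1-90", "2-2-80"), each = 3),
  EncounterID   = rep(13:14, each = 3),
  QuestionID    = rep(505:507, times = 2),
  QName         = rep(c("Intro", "Care", "Out"), times = 2),
  labelnospaces = rep(c("Were you given any info?",
                        "By using this service..",
                        "How satisfied are you?"), times = 2),
  responses     = c("yes", "yes", "vsat", "no", "no", "unsat")
)

# The three names_from columns are joined with "_" (the default names_sep).
wide <- pivot_wider(df,
                    id_cols     = c(UserID, Full.Name, DOB, EncounterID),
                    names_from  = c(QuestionID, QName, labelnospaces),
                    values_from = responses)
print(names(wide))
```

Passing names_sep explicitly is an option if a different separator between the three parts is preferred.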