Reshaping Wide to Long With Multiple Values Columns

reshape from base R does this when given the appropriate arguments.

varying lists the columns that exist in the wide format but are split into multiple rows in the long format. v.names gives their long-format equivalents. Between the two, a mapping is created.

From ?reshape:

Also, guessing is not attempted if v.names is given explicitly. Notice that the order of variables in varying is like x.1,y.1,x.2,y.2.

Given these varying and v.names arguments, reshape does not have to parse the column names: it assigns the varying columns to v.names by position, so they must be ordered as avg, sd for f1 and then avg, sd for f2 (i.e., order 1.x, 1.y, 2.x, 2.y). Note that the original data has the columns in this order, so we can specify varying=2:5 for this example data, but that is not safe in general.

In the long direction, the . character (the default sep argument) is only used to split the varying column names when v.names is not given and reshape has to guess the variable names and times.

times specifies the values that are to be used in the created var column, one value per group of v.names columns. The length of varying must therefore equal the length of v.names multiplied by the length of times.

Finally, idvar is specified to be the sbj column, which identifies individual records in the wide format (thanks @thelatemail).

reshape(dw, direction='long', 
        varying=c('f1.avg', 'f1.sd', 'f2.avg', 'f2.sd'), 
        timevar='var',
        times=c('f1', 'f2'),
        v.names=c('avg', 'sd'),
        idvar='sbj')

##      sbj blabla var avg sd
## A.f1   A     bA  f1  10  6
## B.f1   B     bB  f1  12  5
## C.f1   C     bC  f1  20  7
## D.f1   D     bD  f1  22  8
## A.f2   A     bA  f2  50 10
## B.f2   B     bB  f2  70 11
## C.f2   C     bC  f2  20  8
## D.f2   D     bD  f2  22  9
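
The call above can be tried end to end with a small reconstruction of the input. The dw below is an assumption: it is rebuilt from the values in the printed output, not taken from the original question, and the name long is only for illustration.

```r
# Wide data rebuilt from the output shown above (assumed, not the original dw)
dw <- data.frame(
  sbj    = c("A", "B", "C", "D"),
  f1.avg = c(10, 12, 20, 22),
  f1.sd  = c(6, 5, 7, 8),
  f2.avg = c(50, 70, 20, 22),
  f2.sd  = c(10, 11, 8, 9),
  blabla = c("bA", "bB", "bC", "bD")
)

# varying is ordered avg, sd for f1, then avg, sd for f2,
# matching v.names = c("avg", "sd") and times = c("f1", "f2")
long <- reshape(dw, direction = "long",
                varying = c("f1.avg", "f1.sd", "f2.avg", "f2.sd"),
                timevar = "var",
                times   = c("f1", "f2"),
                v.names = c("avg", "sd"),
                idvar   = "sbj")
long
```

Columns that are not listed in varying (here sbj and blabla) are simply repeated for each value of times.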

Data frame from wide to long with multiple variables and ids R

Answer already exists here: https://stackoverflow.com/a/12466668/2371031

e.g.,

set.seed(123)
# Example wide data: three measures, each recorded twice per participant
wide_df <- data.frame(
  participant_id = LETTERS[1:12],
  judgment_1 = round(rnorm(12) * 100),
  correct_1  = round(rnorm(12) * 100),
  text_id_1  = sample(1:12, 12, replace = FALSE),
  judgment_2 = round(rnorm(12) * 100),
  correct_2  = round(rnorm(12) * 100),
  text_id_2  = sample(13:24, 12, replace = FALSE)
)

dl <- reshape(data = wide_df, 
              idvar = "participant_id", 
              varying = list(judgment=c(2,5),correct=c(3,6),text_id=c(4,7)), 
              direction="long", 
              v.names = c("judgment","correct","text_id"),
              sep="_")

Result:

    participant_id time judgment correct text_id
A.1              A    1      -56      40       4
B.1              B    1      -23      11      10
C.1              C    1      156     -56       1
D.1              D    1        7     179      12
E.1              E    1       13      50       7
F.1              F    1      172    -197      11
G.1              G    1       46      70       9
H.1              H    1     -127     -47       2
I.1              I    1      -69    -107       8
J.1              J    1      -45     -22       3
K.1              K    1      122    -103       5
L.1              L    1       36     -73       6
A.2              A    2       43    -127      17
B.2              B    2      -30     217      14
C.2              C    2       90     121      22
D.2              D    2       88    -112      15
E.2              E    2       82     -40      13
F.2              F    2       69     -47      19
G.2              G    2       55      78      24
H.2              H    2       -6      -8      20
I.2              I    2      -31      25      21
J.2              J    2      -38      -3      16
K.2              K    2      -69      -4      23
L.2              L    2      -21     137      18

Reshaping from long to wide with multiple columns

pivot_wider from tidyr may be easier:

library(dplyr)
library(stringr)
library(tidyr)
df %>% 
   mutate(time = str_c('t', time)) %>%
   pivot_wider(names_from = time, values_from = c(age, height))

Output:

# A tibble: 2 × 5
    PIN age_t1 age_t2 height_t1 height_t2
  <dbl>  <dbl>  <dbl>     <dbl>     <dbl>
1  1001     84     86        58        58
2  1002     22     24        60        62

With reshape from base R, it may need a sequence column

# rn is a within-PIN sequence column; it is dropped again after reshaping
out <- reshape(transform(df, rn = ave(seq_along(PIN), PIN, FUN = seq_along)),
               idvar = "PIN", direction = "wide", timevar = "time", sep = "_")
out[!startsWith(names(out), 'rn_')]

##    PIN age_1 height_1 age_2 height_2
## 1 1001    84       58    86       58
## 3 1002    22       60    24       62
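
For a reproducible version, here is a minimal sketch with the input rebuilt from the printed output. This df is an assumption, since the original data is not shown. It also assumes that time already identifies each row within a PIN, in which case the helper sequence column can be left out.

```r
# Long data rebuilt from the output above (assumed, not the original df)
df <- data.frame(
  PIN    = c(1001, 1001, 1002, 1002),
  time   = c(1, 2, 1, 2),
  age    = c(84, 86, 22, 24),
  height = c(58, 58, 60, 62)
)

# One row per PIN; each remaining column is spread over the values of time
wide <- reshape(df, idvar = "PIN", timevar = "time",
                direction = "wide", sep = "_")
wide
```

If time were missing or repeated within a PIN, a sequence column such as rn would still be needed as the timevar.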

Wide to Long format with multiple variables?

With gather from tidyr (now superseded by pivot_longer):

library(dplyr)
library(tidyr)

df %>%
  gather(Correct, CorrectValue, Correct1:Correct3) %>%
  gather(Percent, PercentValue, Percent1:Percent3) %>%
  mutate_at(vars(Correct, Percent), ~sub("[[:alpha:]]+", "", .))

Result:

   Subject Day Correct CorrectValue Percent PercentValue
1        1   1       1            1       1           50
2        2   1       1            1       1           75
3        3   1       1            0       1           70
4        4   1       1            0       1           80
5        5   1       1            1       1           90
6        1   2       1            0       1           30
7        2   2       1            0       1           45
8        3   2       1            1       1           50
9        4   2       1            1       1           60
10       5   2       1            1       1           80
11       1   1       2            0       1           50
12       2   1       2            0       1           75
13       3   1       2            1       1           70
14       4   1       2            1       1           80
15       5   1       2            1       1           90
16       1   2       2            1       1           30
17       2   2       2            0       1           45
18       3   2       2            1       1           50
19       4   2       2            0       1           60
20       5   2       2            1       1           80
21       1   1       3            1       1           50
22       2   1       3            0       1           75
23       3   1       3            1       1           70
24       4   1       3            0       1           80
25       5   1       3            1       1           90
...
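
Note that two successive gather calls pair every Correct column with every Percent column, which is why Correct 2 appears next to Percent 1 in the result. If the goal is instead one row per Subject, Day and index, with the matching Correct and Percent values side by side, pivot_longer with the ".value" sentinel is one option. The sketch below uses made-up data in the same shape as the question; the values and the column name index are assumptions for illustration.

```r
library(tidyr)

# Hypothetical data in the same shape as the question (values are made up)
df <- data.frame(
  Subject  = c(1, 2),
  Day      = c(1, 1),
  Correct1 = c(1, 0), Correct2 = c(0, 1), Correct3 = c(1, 1),
  Percent1 = c(50, 75), Percent2 = c(40, 65), Percent3 = c(30, 55)
)

# ".value" turns the first capture group (Correct / Percent) into output
# column names; the second group (the digits) goes into the index column
long <- pivot_longer(
  df,
  cols = Correct1:Percent3,
  names_to = c(".value", "index"),
  names_pattern = "([A-Za-z]+)(\\d+)"
)
long
```

This gives three rows per input row rather than nine, because Correct1 is only ever paired with Percent1, Correct2 with Percent2, and so on.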

R- How to reshape Long to Wide with multiple variables/columns

Some variables are better combined in names_from:

df %>%
  pivot_wider(id_cols = c(UserID, Full.Name, DOB, EncounterID),
              names_from = c(QuestionID, QName, labelnospaces),
              values_from = responses)

  UserID Full.Name  DOB    EncounterID `505_Intro_Were you given any info?` `506_Care_By using this service..`
   <int> <chr>      <chr>        <int> <chr>                                <chr>                             
1      1 John Smith 1-1-90          13 yes                                  yes                               
2      2 Jane Doe   2-2-80          14 no                                   no                                
  `507_Out_How satisfied are you?`
  <chr>                           
1 vsat                            
2 unsat
