Pivot_Longer into Multiple Columns

pivot_longer into multiple columns

Here is a solution that follows a similar method to the one @Fnguyen used, but with the newer pivot_longer and pivot_wider functions:

library(dplyr)
library(tidyr)

longer<-pivot_longer(dat, cols=-1, names_pattern = "(.*)(..)$", names_to = c("limit", "name")) %>% 
     mutate(limit=ifelse(limit=="", "value", limit))

answer <-pivot_wider(longer, id_cols = c(group, name), names_from = limit, values_from = value, names_repair = "check_unique")

Most of the selecting, separating, mutating and renaming takes place within the pivot function calls.

Update:
The regular expression "(.*)(..)$" means:

( ) ( ) Look for two parts (capture groups),

(.*) the first part should have zero or more characters

(..)$ the second part should have exactly 2 characters, anchored by "$" to the end of the string
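
To see what each capture group picks up, here is a minimal base R sketch. The column names are hypothetical (the original data is not shown), and sub() is used only to display the groups; it is not what pivot_longer calls internally.

```r
# Hypothetical column names: a two-character suffix, optionally preceded by a prefix.
cols <- c("x1", "lowerx1", "upperx1")

# Replace the whole match with the first or the second capture group.
limit <- sub("(.*)(..)$", "\\1", cols)
name  <- sub("(.*)(..)$", "\\2", cols)

print(data.frame(cols, limit, name))
```

For a column with no prefix the first group is empty, which is why the answer above recodes "" to "value" with mutate().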

Is there way to pivot_longer to multiple values columns in R?

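We don't need multiple calls if we pass names_to a vector of two elements: the special ".value" sentinel, which tells pivot_longer to use that part of each column name as the name of an output column (here mean and se), and "group", which names the new column holding the suffix of the column names. Here names_sep is "\\." so that the names are split at the literal dot.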

library(tidyr)
pivot_longer(df, cols  = -ids, names_to = c(".value", "group"), 
    names_sep = "\\.")

-output

# A tibble: 4 × 4
  ids      group   mean    se
  <chr>    <chr>  <int> <int>
1 protein1 group1   982     3
2 protein1 group2   657     7
3 protein2 group1   663     9
4 protein2 group2   215     1

NOTE: the values differ from those in the question because sample() was used to create the input data without set.seed() being specified.
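
A self-contained sketch of the same call follows. The data frame is hypothetical: its values are made up, and the column names are assumed to follow the statistic.group scheme implied by the output above.

```r
library(tidyr)

# Hypothetical input: one row per protein, columns named <statistic>.<group>.
df <- data.frame(
  ids         = c("protein1", "protein2"),
  mean.group1 = c(10L, 30L),
  se.group1   = c(1L, 3L),
  mean.group2 = c(20L, 40L),
  se.group2   = c(2L, 4L)
)

# ".value" turns the part before the dot into output columns (mean, se);
# the part after the dot goes into the new "group" column.
long <- pivot_longer(df, cols = -ids, names_to = c(".value", "group"),
                     names_sep = "\\.")
print(long)
```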

Pivot_longer for multiple columns of repeated measurements data

This probably adds nothing new to the already posted solutions; the only difference is the regex used for the names_pattern argument.

Notice that some of your column names contain one _ whereas others contain two. \\w+ matches one or more word characters (letters, digits and the underscore). By following it with \\d+, as in time3 in time3_age, we tell pivot_longer to store the part of each column name up to and including the number (time1, time2, time3) in the time column. The rest of each column name is then used as the name of the variable being measured, like age, systolicBP and med_hypt.
Note that with \\w+\\d+ in the first group, everything after the following underscore is captured as the variable name, whether it is med_hypt with an underscore or systolicBP without one. If we used only \\w+, which is greedy and also matches underscores, the first group could extend to time3_med, and the resulting column would be hypt instead of med_hypt.
Finally, since names_to has two elements, I have to supply either names_pattern or names_sep to specify how each column name is split into those two parts.

library(dplyr)
library(tidyr)

wide_data %>%
  pivot_longer(!c(id, sex), names_to = c("time", ".value"), 
               names_pattern = "(\\w+\\d+)_(\\w+)")

# A tibble: 30 x 6
      id sex   time    age systolicBP med_hypt
   <dbl> <fct> <chr> <dbl>      <dbl>    <dbl>
 1 12002 women time1  71.2       102         0
 2 12002 women time2  74.2        NA         0
 3 12002 women time3  78          NA         0
 4 17001 men   time1  67.9       152         0
 5 17001 men   time2  69.2       146         0
 6 17001 men   time3  74.2       160.        0
 7 17002 women time1  66.5        NA         0
 8 17002 women time2  67.8        NA         0
 9 17002 women time3  72.8        NA         0
10 42001 men   time1  57.7       170         0
# ... with 20 more rows
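
The difference between the two patterns can be checked outside of pivot_longer. This is a small base R sketch on three of the column names mentioned above; sub() with perl = TRUE is an assumption chosen for illustration, not necessarily the engine tidyr uses.

```r
cols <- c("time1_age", "time3_systolicBP", "time3_med_hypt")

# Pattern from the answer: the first group must end in digits.
time_ok  <- sub("(\\w+\\d+)_(\\w+)", "\\1", cols, perl = TRUE)
value_ok <- sub("(\\w+\\d+)_(\\w+)", "\\2", cols, perl = TRUE)

# Without \\d+, the greedy \\w+ also swallows "_med".
time_greedy  <- sub("(\\w+)_(\\w+)", "\\1", cols, perl = TRUE)
value_greedy <- sub("(\\w+)_(\\w+)", "\\2", cols, perl = TRUE)

print(data.frame(cols, time_ok, value_ok, time_greedy, value_greedy))
```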

Pivoting multiple sets of columns using pivot_longer in R

The parentheses around each pattern indicate that we are capturing it as a group. In the code below, the first group captures one or more lower-case letters ([a-z]+). It is followed by a _ (outside the parentheses, so it is dropped), and the second group captures one or more digits (\\d+). The groups are matched in order with the elements of names_to: .value means the captured text becomes the name of an output column, so we get the columns 'x' and 'y' holding the values, and the second element creates a new column, 'time', that holds the digit suffix of the column names.

library(tidyr)
pivot_longer(data, cols = -aid, names_to = c(".value", "time"), 
    names_pattern = "^([a-z]+)_(\\d+)")

-output

# A tibble: 20 × 4
     aid time        x       y
   <int> <chr>   <dbl>   <dbl>
 1     1 1     -0.823   0.954 
 2     1 2      0.937   2.30  
 3     2 1      0.644   0.513 
 4     2 2     -0.281   0.0256
 5     3 1     -1.11    0.0575
 6     3 2     -0.248  -0.512 
 7     4 1     -1.04    0.578 
 8     4 2     -0.414   0.609 
 9     5 1      1.29    1.60  
10     5 2     -1.78    0.759 
11     1 1     -0.578   0.0430
12     1 2     -1.00    0.868 
13     2 1      0.0900 -2.10  
14     2 2     -0.795  -0.434 
15     3 1      0.143  -1.13  
16     3 2      0.420   0.145 
17     4 1     -0.252   0.236 
18     4 2      1.56   -0.0472
19     5 1     -0.256  -1.21  
20     5 2      0.624   1.02

In the OP's code, there are two capture groups ((.) and (.)) but only one element in names_to, so it fails; in addition, the pattern does not account for the _ between the 'x' or 'y' and the digit. Also, names_pattern is interpreted as a regular expression, so some characters act as metacharacters, i.e. . matches any character rather than a literal dot.
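
As a runnable sketch of the working call, here is a small hypothetical data frame with the same naming scheme (a lower-case letter, an underscore, a digit). The values are made up for illustration.

```r
library(tidyr)

# Hypothetical data: columns named <letter>_<digit>.
data <- data.frame(
  aid = 1:2,
  x_1 = c(0.1, 0.2),
  y_1 = c(1.1, 1.2),
  x_2 = c(0.3, 0.4),
  y_2 = c(1.3, 1.4)
)

# Two capture groups, two elements in names_to.
long <- pivot_longer(data, cols = -aid, names_to = c(".value", "time"),
                     names_pattern = "^([a-z]+)_(\\d+)")
print(long)
```

Note that time comes back as a character column, as in the output above; it can be converted afterwards if a number is needed.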

Using pivot_longer with multiple column classes

We could use names_pattern after rearranging the substrings in the column names:

library(dplyr)
library(tidyr)
library(stringr)
df_wide %>%
  # rename the columns by rearranging the digits at the end 
  # "_(\\d+)(_.*)" - captures the digits (\\d+) after the _
  # and the rest of the characters (_.*) 
  # replace with the backreference (\\2, \\1) of captured groups rearranged   
  rename_with(~ str_replace(., "_(\\d+)(_.*)", "\\2_\\1"), -resp_id) %>%
  pivot_longer(cols = -resp_id, names_to = c( ".value", "question_number"), 
        names_pattern = "(.*)_(\\d+$)")

-output

# A tibble: 6 × 4
  resp_id question_number question_info              question_answer
    <dbl> <chr>           <chr>                                <dbl>
1       1 1               "What is your eye color?"                1
2       1 2               "What is your hair color?"               2
3       2 1               "Are you over 6 ft tall?"                1
4       2 2               ""                                      NA
5       3 1               "What is your hair color?"               0
6       3 2               "Are you under 40?"                      1
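
The renaming step can be inspected on its own. In this sketch the column names are an assumption inferred from the output above, and base sub() stands in for stringr::str_replace, which takes the same pattern and replacement here.

```r
# Hypothetical column names with the question number in the middle.
old <- c("question_1_info", "question_1_answer",
         "question_2_info", "question_2_answer")

# Move the digits to the end: "_1_info" becomes "_info_1".
new <- sub("_(\\d+)(_.*)", "\\2_\\1", old)
print(new)

# What names_pattern then extracts from the renamed columns.
value_part  <- sub("(.*)_(\\d+$)", "\\1", new)
number_part <- sub("(.*)_(\\d+$)", "\\2", new)
print(data.frame(new, value_part, number_part))
```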

pivot_longer into several pairs of columns

With tidyverse, we can pivot on the two sets of columns that start with belief and norm. We then use a regex to split each column name into two groups at the first underscore (some column names have multiple underscores, so the first group is matched lazily with .*?). Essentially, we are saying to put belief or norm (the first group in the column name) into their own columns (i.e., .value), and to put the second group (i.e., the animal names) into one column named animal.

library(tidyverse)

df_raw %>%
  pivot_longer(cols = c(starts_with("belief"), starts_with("norm")),
               names_to = c('.value', 'animal'),
               names_pattern = '(.*?)_(.*)') %>% 
  rename(belief_rating = belief, norm_rating = norm)

Output

  id      age gender animal    belief_rating norm_rating
  <chr> <dbl>  <dbl> <chr>             <dbl>       <dbl>
1 b2x8     41      2 dog                   1          10
2 b2x8     41      2 bull_frog             4           4
3 b2x8     41      2 fish                  3           2
4 m89w     19      1 dog                   3           3
5 m89w     19      1 bull_frog             6           1
6 m89w     19      1 fish                  2           2
7 32x8     38      3 dog                   1           8
8 32x8     38      3 bull_frog             5           9
9 32x8     38      3 fish                  2           1
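
To illustrate why the lazy quantifier matters, here is a base R sketch comparing it with a greedy first group. The column names are assumed from the output above, and sub() with perl = TRUE is used only to display the groups.

```r
cols <- c("belief_dog", "belief_bull_frog", "norm_fish")

# Lazy first group: split at the first underscore.
lazy_first  <- sub("(.*?)_(.*)", "\\1", cols, perl = TRUE)
lazy_second <- sub("(.*?)_(.*)", "\\2", cols, perl = TRUE)

# Greedy first group: split at the last underscore instead.
greedy_first  <- sub("(.*)_(.*)", "\\1", cols, perl = TRUE)
greedy_second <- sub("(.*)_(.*)", "\\2", cols, perl = TRUE)

print(data.frame(cols, lazy_first, lazy_second, greedy_first, greedy_second))
```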

pivot_longer multiple variables of different kinds

In this case one has to use names_to combined with names_pattern:

library(dplyr)
library(tidyr)
> head(x,3)
   case        X1990 flag.1990     X2000 flag.2000
1     1 0.2772497942         a 0.1751129         c
2     2 0.0005183129         b 0.4407503         d
3     3 0.5106083730         a 0.9071830         c
> x %>% 
    pivot_longer(cols = -case, 
                 names_to = c(".value", "year"), 
                 names_pattern = "([^\\.]*)\\.*(\\d{4})")
# A tibble: 20 x 4
    case year         X flag
   <int> <chr>    <dbl> <chr>
 1     1 1990  0.277    a    
 2     1 2000  0.175    c    
 3     2 1990  0.000518 b    
 4     2 2000  0.441    d    
 5     3 1990  0.511    a    
 6     3 2000  0.907    c    
 7     4 1990  0.0140   b    
 8     4 2000  0.851    d    
 9     5 1990  0.0647   a    
10     5 2000  0.734    c    
11     6 1990  0.955    b    
12     6 2000  0.574    d    
13     7 1990  0.0865   a    
14     7 2000  0.482    c    
15     8 1990  0.290    b    
16     8 2000  0.331    d    
17     9 1990  0.881    a    
18     9 2000  0.158    c    
19    10 1990  0.123    b    
20    10 2000  0.480    d

Pivot data into two different columns simultaneously using pivot_longer() in R?

Edit

Turns out, you can do it in one pivot_longer:

df %>% 
  pivot_longer(-id,
               names_to = c("variable", ".value"),
               names_pattern = "(.*)\\.(.*)") %>% 
  rename(activation = act, fixation = fix)

This gives the same result as the original two-step answer below.

Original answer: I don't know how to do it in one go, but you could use

library(tidyr)
library(dplyr)

df %>% 
  pivot_longer(-id,
               names_to = c("variable", "class"),
               names_pattern = "(.*)\\.(.*)") %>% 
  pivot_wider(names_from = "class") %>% 
  rename(activation = act, fixation = fix)

This returns

# A tibble: 4 x 4
     id variable activation fixation
  <dbl> <chr>         <dbl>    <dbl>
1     1 v1              0.4        1
2     1 v2              0.5        0
3     2 v1              0.8        0
4     2 v2              0.7        1
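
A sketch to check that the two approaches agree. The input is reconstructed from the result shown above, and the column names (v1.act, v1.fix, ...) are an assumption based on the pattern used.

```r
library(dplyr)
library(tidyr)

# Input reconstructed from the result above; names assumed to be <variable>.<measure>.
df <- data.frame(
  id     = c(1, 2),
  v1.act = c(0.4, 0.8),
  v1.fix = c(1, 0),
  v2.act = c(0.5, 0.7),
  v2.fix = c(0, 1)
)

# One call, using ".value" for the part after the dot.
one_step <- df %>%
  pivot_longer(-id, names_to = c("variable", ".value"),
               names_pattern = "(.*)\\.(.*)") %>%
  rename(activation = act, fixation = fix)

# Two calls: fully long first, then spread the measure back out.
two_step <- df %>%
  pivot_longer(-id, names_to = c("variable", "class"),
               names_pattern = "(.*)\\.(.*)") %>%
  pivot_wider(names_from = "class") %>%
  rename(activation = act, fixation = fix)

print(all.equal(as.data.frame(one_step), as.data.frame(two_step)))
```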

Pivot_longer to maintain two columns and make the rest long

If you want columns A and B to remain as they are when pivoting to long format, exclude them from cols:

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(cols = -c(A,B), names_to = 'Number', values_to = 'Value') %>% 
  type.convert(as.is = TRUE) %>% 
  mutate(Variable = case_when(Number %in% c(1,2) ~ 'WW', 
                              Number %in% c(34,39) ~ 'MM', TRUE ~ 'EE')) %>%
  select(One = A, two = B, Number, Variable, Value)

# A tibble: 18 x 5
#   One   two   Number Variable Value
#   <chr> <chr>  <int> <chr>    <dbl>
# 1 A     AA         1 WW         1.9
# 2 A     AA         2 WW         1.9
# 3 A     AA        34 MM         3.9
# 4 A     AA        39 MM         2.9
# 5 A     AA       158 EE         2.9
# 6 A     AA       190 EE        22.1
# 7 B     BB         1 WW         6.8
# 8 B     BB         2 WW         6.8
# 9 B     BB        34 MM         0.3
#10 B     BB        39 MM         2.3
#11 B     BB       158 EE         3  
#12 B     BB       190 EE         7.4
#13 C     CC         1 WW         4.7
#14 C     CC         2 WW         4.7
#15 C     CC        34 MM         2.7
#16 C     CC        39 MM         2.9
#17 C     CC       158 EE        45  
#18 C     CC       190 EE        56
