Tidyverse Pivot_Longer Several Sets of Columns, But Avoid Intermediate Mutate_Wider Steps

tidyverse pivot_longer several sets of columns, but avoid intermediate mutate_wider steps

You can use :

tidyr::pivot_longer(dat, cols = -id, 
                   names_to = c('.value', 'brand'), 
                   names_sep = "_")


#  id    brand  Q1r1  Q1r2
#  <chr> <chr> <dbl> <dbl>
#1 A     pepsi     1     1
#2 A     cola      0     0
#3 B     pepsi     0     1
#4 B     cola      0     1
#5 C     pepsi     1     1
#6 C     cola      1     1

Pivot_wider with multiple (parallel) columns

library(dplyr)
library(tidyr)
pivot_longer(
  df, -c(NAME, ISIN),
  names_pattern = "(.*)([0-9])$", names_to = c("segm", ".value")
) %>%
  rename(region = "0", sales= "1")
# # A tibble: 4 x 5
#   NAME      ISIN         segm   region              sales
#   <chr>     <chr>        <chr>  <chr>               <dbl>
# 1 APPLE     US0378331005 WC1960 United States   109197000
# 2 APPLE     US0378331005 WC1961 Other Foreign   125010000
# 3 MICROSOFT US5949181045 WC1960 United States    83953000
# 4 MICROSOFT US5949181045 WC1961 Other countries  84135000

(You can add %>% mutate(segm = gsub(".*(.)$", "\\1", segm)) to clean up segm, if you need/want.)

Stack dataframe columns with two distinct suffix into two columns, preferably using tidyverse

We don't need both pivot_longer and pivot_wider. it can be done within pivot_longer itself by specifying the names_to and the names_sep argument

library(dplyr)
library(tidyr)
test.tbl %>% 
     pivot_longer(cols = everything(), names_to = c('grp', '.value'),
            names_sep = "_") %>%
     select(-grp)

-output

# A tibble: 40 x 2
#      diff    avg
#     <dbl>  <dbl>
# 1 -0.626   1.51 
# 2  0.919   1.36 
# 3 -0.165   0.398
# 4  2.40    0.476
# 5  0.184   0.390
# 6  0.782  -0.103
# 7 -0.253  -0.612
# 8 -0.0392 -0.710
# 9 -0.836  -0.621
#10  0.0746  0.388
# … with 30 more rows

complex(ish) pivot_longer in R

As deschen recommended, I played around with the regex a bit and this pivot longer call does work as expected. (and is a bit cleaner than having to manually create the pivot_longer_spec and also works if there are an unequal amount of mem_was_there_x and new_number_yn_y (it will just insert missings, where applicable)

df1 %>% 
  pivot_longer(
    cols = c(matches("^mem_was_there"), matches("^new_number_yn")),
    names_to = c('.value', 'nr'),
    names_pattern = "([A-Za-z_]+_)([0-9]*)")

How to use Pivot_longer to reshape from wide-type data to long-type data with multiple variables

You can try :

tidyr::pivot_longer(df, cols = -ID_IE, 
                    names_to = c('.value', 'grade'), 
                    names_pattern = '(.*)(\\d+)')

# A tibble: 8 x 4
#  ID_IE grade BLS_tchrG ELS_tchrG
#  <dbl> <chr>     <dbl>     <dbl>
#1  2135 2             1         1
#2  2135 7             1         1
#3  2101 2             0         0
#4  2101 7             2         0
#5  2103 2             0         0
#6  2103 7             3         0
#7  2111 2             1         1
#8  2111 7             4         1

data

Tried on this data :

df <- data.frame(ID_IE = c(2135, 2101, 2103, 2111), BLS_tchrG2 = c(1, 0, 0, 1), 
                 BLS_tchrG7 = 1:4,
                 ELS_tchrG2 = c(1, 0, 0, 1), ELS_tchrG7 = c(1, 0, 0, 1))

Long to wide format using variable names

We can usepivor_longer %>% pivot_wider. separateis not needed if we set the appropriate parameters to pivor_longer.

library(tidyr)

dataset %>%
        pivot_longer(cols = matches('time\\d+$'), names_to = c('sport', 'time'), names_pattern = '(.*)\\.(.*)') %>%
        pivot_wider(names_from = sport, values_from = value)

# A tibble: 15 × 5
      id time  basketball volleyball vollyeball
   <dbl> <chr>      <dbl>      <dbl>      <dbl>
 1     1 time1          2          2         NA
 2     1 time2          3          3         NA
 3     1 time3          1         NA          1
 4     2 time1          5          3         NA
 5     2 time2          4          4         NA
 6     2 time3          8         NA          8
 7     3 time1          4          4         NA
 8     3 time2          5          3         NA
 9     3 time3          4         NA         12
10     4 time1          3          0         NA
11     4 time2          3          1         NA
12     4 time3          3         NA          2
13     5 time1          3          1         NA
14     5 time2          2          3         NA
15     5 time3          1         NA          3

Achieving the equivalent of rbind using tidyr

We can use pivot_longer where we specify the names_sep as _ and the names_to with ".value" and a grouping name

library(dplyr)
library(tidyr)
df %>% 
    pivot_longer(cols = starts_with('var'), 
       names_to = c('grp', '.value'), names_sep="_")
#   group   day grp    mean    sd
#   <chr> <int> <chr> <dbl> <dbl>
# 1 a         1 var1  -0.63  5.49
# 2 a         1 var2  -0.62  5.82
# 3 a         2 var1   0.18  5.74
# 4 a         2 var2  -2.21  5.59
# 5 a         3 var1  -0.84  5.58
# 6 a         3 var2   1.12  5.92
# 7 b         1 var1   1.6   4.69
# 8 b         1 var2  -0.04  5.78
# 9 b         2 var1   0.33  6.51
#10 b         2 var2  -0.02  5.07
#11 b         3 var1  -0.82  5.39
#12 b         3 var2   0.94  3.01

we could remove the 'grp' column later

df %>% 
    pivot_longer(cols = starts_with('var'), 
       names_to = c('grp', '.value'), names_sep="_") %>%
    select(-grp)

Tidyverse Pivot_Longer Several Sets of Columns, But Avoid Intermediate Mutate_Wider Steps