How to Use pivot_longer to Reshape from Wide-Type Data to Long-Type Data With Multiple Variables

You can try:

tidyr::pivot_longer(df,
                    cols = -ID_IE,
                    names_to = c('.value', 'grade'),
                    names_pattern = '(.*)(\\d+)')

# A tibble: 8 x 4
# ID_IE grade BLS_tchrG ELS_tchrG
# <dbl> <chr> <dbl> <dbl>
#1 2135 2 1 1
#2 2135 7 1 1
#3 2101 2 0 0
#4 2101 7 2 0
#5 2103 2 0 0
#6 2103 7 3 0
#7 2111 2 1 1
#8 2111 7 4 1

data

Tested on this data:

df <- data.frame(ID_IE = c(2135, 2101, 2103, 2111),
                 BLS_tchrG2 = c(1, 0, 0, 1),
                 BLS_tchrG7 = 1:4,
                 ELS_tchrG2 = c(1, 0, 0, 1),
                 ELS_tchrG7 = c(1, 0, 0, 1))
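
One caveat with the pattern above: the first group (.*) is greedy, so (\\d+) only ever receives the final digit. That works for grades 2 and 7, but a column such as BLS_tchrG10 would be split into BLS_tchrG1 and 0. A minimal sketch of a stricter pattern, assuming the grade is always the trailing run of digits (the `wide` data frame and its G10 columns are made up for illustration):

```r
library(tidyr)

# Hypothetical data: same naming scheme as above, plus a two-digit grade.
wide <- data.frame(
  ID_IE = c(2135, 2101),
  BLS_tchrG2 = c(1, 0),
  BLS_tchrG10 = c(3, 4),
  ELS_tchrG2 = c(1, 0),
  ELS_tchrG10 = c(5, 6)
)

# Lazy first group plus an end anchor: the second group now takes
# the whole trailing run of digits instead of only the last one.
long_grade <- pivot_longer(
  wide,
  cols = -ID_IE,
  names_to = c(".value", "grade"),
  names_pattern = "(.*?)(\\d+)$"
)

long_grade
```

The grade column is still character; convert it with as.integer() afterwards if a numeric grade is needed.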

How to pivot from wide to long format based on multiple column name separators?

After experimenting with the output of the earlier solution (pivot_wider followed by pivot_longer), I found a way to make it work with pivot_longer alone. Your original approach was very close.

dat %>%
  pivot_longer(
    cols = !c(part, type, sp),
    names_to = c("var", "site", "side", "misc"),
    names_sep = "_",
    values_to = "value"
  )

part type sp var site side misc value
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1 P1 pre slow var1 site1 L NA 1
2 P1 pre slow var1 site1 R NA 1
3 P1 pre slow var1 site1 ALL NA 1
4 P1 pre slow var1 site1 ALL M 1
5 P1 pre slow var2 site2 L NA 1
6 P1 pre slow var2 site2 R NA 1
7 P1 pre slow var2 site2 ALL NA 1
8 P1 pre slow var2 site2 ALL M 1
9 P1 pre slow var1 site1 L NA 1
10 P1 pre slow var1 site1 R NA 1
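
The original `dat` is not shown, so here is a small self-contained sketch of the same idea. The column names are assumptions inferred from the output above (three or four pieces separated by underscores), and the values are placeholders:

```r
library(tidyr)

# Hypothetical stand-in for `dat`; names inferred from the printed output.
dat_small <- data.frame(
  part = "P1",
  type = "pre",
  sp = "slow",
  var1_site1_L = 1,
  var1_site1_R = 1,
  var1_site1_ALL = 1,
  var1_site1_ALL_M = 1
)

# Names with only three pieces get NA in `misc`; tidyr emits a warning
# about the missing pieces, which is expected here.
long_sep <- pivot_longer(
  dat_small,
  cols = !c(part, type, sp),
  names_to = c("var", "site", "side", "misc"),
  names_sep = "_",
  values_to = "value"
)

long_sep
```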

Correctly getting data from wide to long using pivot_longer

You can use a regex to do this:

tidyr::pivot_longer(df,
                    cols = -Player,
                    names_to = c('name', '.value'),
                    names_pattern = '(.*)\\.(.*)')

# Player name `2020` not2020
# <chr> <chr> <dbl> <dbl>
#1 Tom Brady completetion.rank 1 NA
#2 Tom Brady completion.rank NA 0.375
#3 Tom Brady ypc.rank 1 0.375
#4 Tom Brady ypc.td 1 0.312
#5 Tom Brady ypc.int 0.25 0.375
#6 Tom Brady ypc.sack 0 0.625

Because the first (.*) is greedy, everything up to the last . is captured in the name column, and the part after it (.value) supplies the names of the new columns.
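
The original data frame is not shown, so the following is only a sketch of the split at the last dot. The column names in `stats_wide` are assumptions modeled on the output above, using two of its rows:

```r
library(tidyr)

# Hypothetical wide data: <statistic>.<period>, where the statistic
# itself contains a dot.
stats_wide <- data.frame(
  Player = "Tom Brady",
  ypc.rank.2020 = 1,
  ypc.rank.not2020 = 0.375,
  ypc.td.2020 = 1,
  ypc.td.not2020 = 0.312
)

# Greedy first group: "ypc.rank.2020" splits into "ypc.rank" and "2020".
long_stats <- pivot_longer(
  stats_wide,
  cols = -Player,
  names_to = c("name", ".value"),
  names_pattern = "(.*)\\.(.*)"
)

long_stats
```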

Reshape wide data to long with multiple variables in R (dplyr)

You could use the names_pattern argument of pivot_longer.

tidyr::pivot_longer(df,
                    cols = -id,
                    names_to = c('wave', '.value'),
                    names_pattern = 'c(\\d+)(.*)')

# id wave sports smoker drinker
# <int> <chr> <int> <int> <int>
# 1 1 1 1 1 1
# 2 1 2 1 1 5
# 3 1 3 1 4 2
# 4 2 1 1 5 4
# 5 2 2 1 1 1
# 6 2 3 1 3 4
# 7 3 1 1 1 2
# 8 3 2 0 1 3
# 9 3 3 0 5 2
#10 4 1 0 1 4
# … with 20 more rows

pivot_longer with names_pattern

You can provide the names_pattern regex as:

tidyr::pivot_longer(df,
                    cols = -Year,
                    names_to = c('Nutrient', '.value'),
                    names_pattern = '(.*)_(\\w+)')

# Year Nutrient Production Import
# <dbl> <chr> <dbl> <dbl>
#1 1961 Total_Energy_kcal 5 6
#2 1961 Total_Ca_g 3 3
#3 1962 Total_Energy_kcal 8 1
#4 1962 Total_Ca_g 4 8

This puts everything up to the last underscore in the Nutrient column and uses the remainder (Production or Import) as the names of the value columns.

data

cbind creates a matrix; use data.frame to create a data frame instead.

df <- data.frame(Year, Total_Energy_kcal_Production, Total_Energy_kcal_Import,
                 Total_Ca_g_Production, Total_Ca_g_Import)
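
For a fully reproducible run, the input vectors can be filled in as below. The values are not given in the original post; they are read back from the printed output above, so treat them as a reconstruction:

```r
library(tidyr)

# Values reconstructed from the printed result (an assumption, not the
# asker's original code).
Year <- c(1961, 1962)
Total_Energy_kcal_Production <- c(5, 8)
Total_Energy_kcal_Import <- c(6, 1)
Total_Ca_g_Production <- c(3, 4)
Total_Ca_g_Import <- c(3, 8)

# cbind() on plain vectors returns a matrix, not a data frame.
is.matrix(cbind(Year, Total_Ca_g_Import))

nutrient_wide <- data.frame(Year, Total_Energy_kcal_Production,
                            Total_Energy_kcal_Import,
                            Total_Ca_g_Production, Total_Ca_g_Import)

long_nutrient <- pivot_longer(
  nutrient_wide,
  cols = -Year,
  names_to = c("Nutrient", ".value"),
  names_pattern = "(.*)_(\\w+)"
)

long_nutrient
```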

Get long format in a special way

We can use pivot_longer from tidyr:

tidyr::pivot_longer(Data,
                    cols = -ID,
                    names_to = '.value',
                    names_pattern = '([A-Za-z]+)')

# ID N E Class
# <int> <dbl> <dbl> <dbl>
# 1 1 0 0 1
# 2 1 5 6 2
# 3 2 -1 -2 1
# 4 2 6 6 2
# 5 3 2 0 1
# 6 3 6 5 2
# 7 4 0 0 1
# 8 4 4 6 2
# 9 5 -2 1 1
#10 5 5 6 2
#11 6 -1 0 1
#12 6 6 5 2

.value has a special meaning in pivot_longer: it tells the function that the matched part of each original column name becomes the name of a column in the long output. How that part is derived is defined by the names_pattern argument. Here the pattern ([A-Za-z]+) extracts the leading run of letters from each name, so N1 and N2 both become N and are stacked into a single column. The same happens for the E1, E2 and Class1, Class2 pairs.
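
The input `Data` is not shown in the answer. A sketch that reproduces the output above, assuming the wide columns were named N1, E1, Class1, N2, E2, Class2 (as the explanation suggests) and reading the values back from the printed result:

```r
library(tidyr)

# Reconstructed input (an assumption based on the printed output).
Data <- data.frame(
  ID = 1:6,
  N1 = c(0, -1, 2, 0, -2, -1),
  E1 = c(0, -2, 0, 0, 1, 0),
  Class1 = rep(1, 6),
  N2 = c(5, 6, 6, 4, 5, 6),
  E2 = c(6, 6, 5, 6, 6, 5),
  Class2 = rep(2, 6)
)

# Only the letters of each name are kept, so N1/N2 collapse into N,
# E1/E2 into E, and Class1/Class2 into Class: two rows per ID.
long_value <- pivot_longer(
  Data,
  cols = -ID,
  names_to = ".value",
  names_pattern = "([A-Za-z]+)"
)

long_value
```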

Create a long format dataframe based on multiple variables

You can try pivot_longer as follows:

tidyr::pivot_longer(my_data,
                    cols = starts_with('VAR'),
                    names_to = '.value',
                    names_pattern = '(VAR\\d+)')

# Main VAR1 VAR2
# <chr> <chr> <dbl>
#1 A "B" 1
#2 A "C" 1
#3 A "D" 1
#4 B "A" 1
#5 B "D" 2
#6 B "" NA
#7 C "D" 2
#8 C "A" 1
#9 C "" NA

How to turn data from long to wide format so that duplicate rows get added to the end to make new columns in R?

library(tidyverse)

df <- read.table(
  text = "ID NAME STATUS OKR_T OKR_N NR
1 Jack 1 34 OK1 0
1 Jack 1 433 OK2 0
1 Jack 1 12 OK3 1
2 Bill 2 34 OK1 1
3 Steve 1 433 OK2 1
3 Steve 1 34 OK1 0
3 Steve 1 45 OK4 0",
  header = TRUE
)

df %>%
  group_by(ID) %>%
  # number the rows within each ID; this index becomes the column suffix
  mutate(rid = row_number()) %>%
  pivot_wider(
    id_cols = c(ID, NAME, STATUS),
    names_from = rid,
    values_from = c(OKR_T, OKR_N, NR)
  )
#> # A tibble: 3 x 12
#> # Groups: ID [3]
#> ID NAME STATUS OKR_T_1 OKR_T_2 OKR_T_3 OKR_N_1 OKR_N_2 OKR_N_3 NR_1 NR_2
#> <int> <chr> <int> <int> <int> <int> <chr> <chr> <chr> <int> <int>
#> 1 1 Jack 1 34 433 12 OK1 OK2 OK3 0 0
#> 2 2 Bill 2 34 NA NA OK1 <NA> <NA> 1 NA
#> 3 3 Steve 1 433 34 45 OK2 OK1 OK4 1 0
#> # ... with 1 more variable: NR_3 <int>

Created on 2021-09-08 by the reprex package (v2.0.1)