How to use pivot_longer to reshape from wide-type data to long-type data with multiple variables
You can try:
tidyr::pivot_longer(df, cols = -ID_IE,
                    names_to = c('.value', 'grade'),
                    names_pattern = '(.*)(\\d+)')
# A tibble: 8 x 4
# ID_IE grade BLS_tchrG ELS_tchrG
# <dbl> <chr> <dbl> <dbl>
#1 2135 2 1 1
#2 2135 7 1 1
#3 2101 2 0 0
#4 2101 7 2 0
#5 2103 2 0 0
#6 2103 7 3 0
#7 2111 2 1 1
#8 2111 7 4 1
data
Tested on this data:
df <- data.frame(ID_IE = c(2135, 2101, 2103, 2111),
                 BLS_tchrG2 = c(1, 0, 0, 1),
                 BLS_tchrG7 = 1:4,
                 ELS_tchrG2 = c(1, 0, 0, 1),
                 ELS_tchrG7 = c(1, 0, 0, 1))
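Note that '(.*)(\\d+)' only splits correctly here because every grade is a single digit: the greedy (.*) would take all but the last digit of a longer number. As a hedged sketch, assuming a two-digit suffix could occur (the columns BLS_tchrG10 and ELS_tchrG10 are hypothetical and not in the original data), a pattern of non-digits followed by digits keeps the number whole, and names_transform (available in tidyr 1.1.0 and later) converts grade to integer:

```r
library(tidyr)

# Hypothetical data: same layout as above, but with a two-digit grade suffix
df2 <- data.frame(
  ID_IE = c(2135, 2101),
  BLS_tchrG2 = c(1, 0),
  BLS_tchrG10 = c(3, 4),
  ELS_tchrG2 = c(1, 0),
  ELS_tchrG10 = c(0, 1)
)

long <- pivot_longer(
  df2,
  cols = -ID_IE,
  names_to = c(".value", "grade"),
  # \\D+ matches the non-digit stem, \\d+ the whole trailing number
  names_pattern = "(\\D+)(\\d+)",
  # convert grade from character to integer
  names_transform = list(grade = as.integer)
)
long
```

This assumes the stem of each column name contains no digits; if it does, the pattern would need to be adapted.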
How to pivot from wide to long format based on multiple column name separators?
I experimented with the results from my earlier solution, applying pivot_wider and then pivot_longer, and found how to make it work with pivot_longer. Your original approach was very close.
library(tidyr)

dat %>%
  pivot_longer(
    cols = !c(part, type, sp),
    names_to = c("var", "site", "side", "misc"),
    names_sep = "_",
    values_to = "value"
  )
part type sp var site side misc value
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1 P1 pre slow var1 site1 L NA 1
2 P1 pre slow var1 site1 R NA 1
3 P1 pre slow var1 site1 ALL NA 1
4 P1 pre slow var1 site1 ALL M 1
5 P1 pre slow var2 site2 L NA 1
6 P1 pre slow var2 site2 R NA 1
7 P1 pre slow var2 site2 ALL NA 1
8 P1 pre slow var2 site2 ALL M 1
9 P1 pre slow var1 site1 L NA 1
10 P1 pre slow var1 site1 R NA 1
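The NA entries in misc appear because most column names split into only three pieces on "_", while names_to asks for four. A minimal sketch of that behaviour, using hypothetical column names (dat_demo is not the original data):

```r
library(tidyr)

# Hypothetical data: one name has three pieces, the other has four
dat_demo <- data.frame(
  part = "P1",
  var1_site1_L = 1,
  var1_site1_ALL_M = 2
)

# tidyr warns about the missing piece; suppressWarnings() hides that here
long <- suppressWarnings(
  pivot_longer(
    dat_demo,
    cols = !part,
    names_to = c("var", "site", "side", "misc"),
    names_sep = "_",
    values_to = "value"
  )
)
long
```

The three-piece name gets NA in misc, while the four-piece name fills all four columns.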
Correctly getting data from wide to long using pivot_longer
You can use a regex to do this:
tidyr::pivot_longer(df,
                    cols = -Player,
                    names_to = c('name', '.value'),
                    names_pattern = '(.*)\\.(.*)')
# Player name `2020` not2020
# <chr> <chr> <dbl> <dbl>
#1 Tom Brady completetion.rank 1 NA
#2 Tom Brady completion.rank NA 0.375
#3 Tom Brady ypc.rank 1 0.375
#4 Tom Brady ypc.td 1 0.312
#5 Tom Brady ypc.int 0.25 0.375
#6 Tom Brady ypc.sack 0 0.625
Basically, everything up to the last . is captured in the name column, and the rest is used to create the new columns.
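Since the question's data is not shown here, the following is a small self-contained sketch with made-up values (df_demo and its numbers are hypothetical) that shows the split at the last dot:

```r
library(tidyr)

# Hypothetical data; column names end in ".2020" or ".not2020"
df_demo <- data.frame(
  Player = c("A", "B"),
  ypc.rank.2020 = c(1, 2),
  ypc.rank.not2020 = c(0.5, 0.25),
  ypc.td.2020 = c(3, 4),
  ypc.td.not2020 = c(0.75, 1)
)

long <- pivot_longer(
  df_demo,
  cols = -Player,
  names_to = c("name", ".value"),
  # greedy (.*) runs up to the last dot; the second group is the suffix
  names_pattern = "(.*)\\.(.*)"
)
long
```

Each distinct suffix (2020, not2020) becomes its own value column, and the part before the last dot identifies the row.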
Reshape wide data to long with multiple variables in R (dplyr)
You could use the names_pattern argument in pivot_longer.
tidyr::pivot_longer(df,
                    cols = -id,
                    names_to = c('wave', '.value'),
                    names_pattern = 'c(\\d+)(.*)')
# id wave sports smoker drinker
# <int> <chr> <int> <int> <int>
# 1 1 1 1 1 1
# 2 1 2 1 1 5
# 3 1 3 1 4 2
# 4 2 1 1 5 4
# 5 2 2 1 1 1
# 6 2 3 1 3 4
# 7 3 1 1 1 2
# 8 3 2 0 1 3
# 9 3 3 0 5 2
#10 4 1 0 1 4
# … with 20 more rows
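To check how a pattern splits the column names before reshaping, one option is to build the pivoting spec first with build_longer_spec and inspect it. This is a sketch on hypothetical data (df_demo is made up; only the c<wave><variable> naming scheme is taken from the answer above):

```r
library(tidyr)

# Hypothetical data with the c<wave><variable> naming scheme
df_demo <- data.frame(
  id = 1:2,
  c1sports = c(1, 0), c1smoker = c(1, 5),
  c2sports = c(1, 1), c2smoker = c(4, 1)
)

# The spec lists each original column and the pieces extracted from its name
spec <- build_longer_spec(
  df_demo,
  cols = -id,
  names_to = c("wave", ".value"),
  names_pattern = "c(\\d+)(.*)"
)
spec

# Apply the inspected spec to do the actual reshaping
long <- pivot_longer_spec(df_demo, spec)
long
```

If the .value or wave columns of the spec look wrong, the regex can be adjusted before any data is reshaped.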
pivot_longer with names_pattern
You can provide a names_pattern regex as:
tidyr::pivot_longer(df,
                    cols = -Year,
                    names_to = c('Nutrient', '.value'),
                    names_pattern = '(.*)_(\\w+)')
# Year Nutrient Production Import
# <dbl> <chr> <dbl> <dbl>
#1 1961 Total_Energy_kcal 5 6
#2 1961 Total_Ca_g 3 3
#3 1962 Total_Energy_kcal 8 1
#4 1962 Total_Ca_g 4 8
This will put everything up to the last underscore in the Nutrient column, and the remaining part is used as the new column name.
data
cbind will create a matrix; use data.frame to create the data.
df <- data.frame(Year, Total_Energy_kcal_Production, Total_Energy_kcal_Import,
                 Total_Ca_g_Production, Total_Ca_g_Import)
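The difference between cbind and data.frame can be checked directly in base R. The values below are illustrative only:

```r
# Illustrative vectors (values are made up)
Year <- c(1961, 1962)
Total_Ca_g_Production <- c(3, 4)

m <- cbind(Year, Total_Ca_g_Production)       # binds vectors into a matrix
d <- data.frame(Year, Total_Ca_g_Production)  # builds a data frame

is.matrix(m)      # TRUE
is.data.frame(m)  # FALSE
is.data.frame(d)  # TRUE
```

pivot_longer expects a data frame, which is why the data is built with data.frame here.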
Get long format in a special way
We can use pivot_longer from tidyr:
tidyr::pivot_longer(Data,
                    cols = -ID,
                    names_to = '.value',
                    names_pattern = '([A-Za-z]+)')
# ID N E Class
# <int> <dbl> <dbl> <dbl>
# 1 1 0 0 1
# 2 1 5 6 2
# 3 2 -1 -2 1
# 4 2 6 6 2
# 5 3 2 0 1
# 6 3 6 5 2
# 7 4 0 0 1
# 8 4 4 6 2
# 9 5 -2 1 1
#10 5 5 6 2
#11 6 -1 0 1
#12 6 6 5 2
.value has a special meaning in pivot_longer: it indicates that the new columns in the long format take their names from parts of the original column names. How those names are derived is defined by the names_pattern argument. Here, names_pattern extracts the leading run of letters ([A-Za-z]+) from each name and uses it as the new name. So N1 and N2 both become N and are combined into one column. The same happens with the E1, E2 pair and the Class1, Class2 pair.
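The name mapping described above can be illustrated in base R, without tidyr. This only demonstrates what the regex matches; tidyr's internal implementation may differ:

```r
# Original wide column names from the example
old_names <- c("N1", "N2", "E1", "E2", "Class1", "Class2")

# Extract the first run of letters from each name
new_names <- regmatches(old_names, regexpr("[A-Za-z]+", old_names))
new_names          # "N" "N" "E" "E" "Class" "Class"

# Names that share a stem end up in the same long-format column
unique(new_names)  # "N" "E" "Class"
```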
Create a long format dataframe based on multiple variables
You can try pivot_longer as:
tidyr::pivot_longer(my_data,
                    cols = starts_with('VAR'),
                    names_to = '.value',
                    names_pattern = '(VAR\\d+)')
# Main VAR1 VAR2
# <chr> <chr> <dbl>
#1 A "B" 1
#2 A "C" 1
#3 A "D" 1
#4 B "A" 1
#5 B "D" 2
#6 B "" NA
#7 C "D" 2
#8 C "A" 1
#9 C "" NA
How to turn data from long to wide format so that duplicate rows get added to the end to make new columns in R?
library(tidyverse)
df <- read.table(
  text = "ID NAME STATUS OKR_T OKR_N NR
1 Jack 1 34 OK1 0
1 Jack 1 433 OK2 0
1 Jack 1 12 OK3 1
2 Bill 2 34 OK1 1
3 Steve 1 433 OK2 1
3 Steve 1 34 OK1 0
3 Steve 1 45 OK4 0",
  header = TRUE
)
df %>%
  group_by(ID) %>%
  # number the rows within each ID; this becomes the column suffix
  mutate(rid = row_number()) %>%
  pivot_wider(
    id_cols = c(ID, NAME, STATUS),
    names_from = rid,
    values_from = c(OKR_T, OKR_N, NR)
  )
#> # A tibble: 3 x 12
#> # Groups: ID [3]
#> ID NAME STATUS OKR_T_1 OKR_T_2 OKR_T_3 OKR_N_1 OKR_N_2 OKR_N_3 NR_1 NR_2
#> <int> <chr> <int> <int> <int> <int> <chr> <chr> <chr> <int> <int>
#> 1 1 Jack 1 34 433 12 OK1 OK2 OK3 0 0
#> 2 2 Bill 2 34 NA NA OK1 <NA> <NA> 1 NA
#> 3 3 Steve 1 433 34 45 OK2 OK1 OK4 1 0
#> # ... with 1 more variable: NR_3 <int>
Created on 2021-09-08 by the reprex package (v2.0.1)
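If the goal is to keep the columns of each duplicate row together (OKR_T_1, NR_1, then OKR_T_2, NR_2, and so on), one option, assuming tidyr 1.2.0 or later, is the names_vary argument of pivot_wider. The sketch below uses a reduced, hypothetical version of the data and also drops the grouping with ungroup():

```r
library(dplyr)
library(tidyr)

# Reduced, hypothetical version of the data above
df_demo <- data.frame(
  ID = c(1, 1, 2),
  NAME = c("Jack", "Jack", "Bill"),
  OKR_T = c(34, 433, 34),
  NR = c(0, 0, 1)
)

wide <- df_demo %>%
  group_by(ID) %>%
  mutate(rid = row_number()) %>%  # row counter within each ID
  ungroup() %>%                   # result is no longer grouped
  pivot_wider(
    id_cols = c(ID, NAME),
    names_from = rid,
    values_from = c(OKR_T, NR),
    names_vary = "slowest"        # order columns by rid first, then by variable
  )
wide
```

With the default names_vary = "fastest", the columns are instead grouped by variable, as in the output above.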