tidyverse pivot_longer several sets of columns, but avoid intermediate mutate_wider steps
You can use:
tidyr::pivot_longer(dat, cols = -id,
names_to = c('.value', 'brand'),
names_sep = "_")
# id brand Q1r1 Q1r2
# <chr> <chr> <dbl> <dbl>
#1 A pepsi 1 1
#2 A cola 0 0
#3 B pepsi 0 1
#4 B cola 0 1
#5 C pepsi 1 1
#6 C cola 1 1
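To try this end to end, here is a minimal sketch. The input `dat` is not shown in the answer, so the data frame below is an assumption reconstructed from the printed output (column names of the form `Q1r1_pepsi`). The `.value` entry in `names_to` tells `pivot_longer` that the part of each column name before the separator becomes an output column name, while the part after it goes into `brand`.

```r
library(tidyr)

# Hypothetical input, reconstructed from the output shown above
dat <- data.frame(
  id = c("A", "B", "C"),
  Q1r1_pepsi = c(1, 0, 1),
  Q1r1_cola  = c(0, 0, 1),
  Q1r2_pepsi = c(1, 1, 1),
  Q1r2_cola  = c(0, 1, 1)
)

# "Q1r1_pepsi" is split at "_": "Q1r1" names the value column, "pepsi" fills brand
long <- pivot_longer(dat, cols = -id,
                     names_to = c(".value", "brand"),
                     names_sep = "_")
long
```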
Pivot_wider with multiple (parallel) columns
library(dplyr)
library(tidyr)
pivot_longer(
df, -c(NAME, ISIN),
names_pattern = "(.*)([0-9])$", names_to = c("segm", ".value")
) %>%
rename(region = "0", sales= "1")
# # A tibble: 4 x 5
# NAME ISIN segm region sales
# <chr> <chr> <chr> <chr> <dbl>
# 1 APPLE US0378331005 WC1960 United States 109197000
# 2 APPLE US0378331005 WC1961 Other Foreign 125010000
# 3 MICROSOFT US5949181045 WC1960 United States 83953000
# 4 MICROSOFT US5949181045 WC1961 Other countries 84135000
(You can add %>% mutate(segm = gsub(".*(.)$", "\\1", segm)) to clean up segm, if you need/want.)
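For a runnable version, the sketch below assumes the input columns are named `WC19600`, `WC19601`, `WC19610` and `WC19611`. These names are inferred from the regex and the printed output, not taken from the original question. The last digit of each name ends up as `.value`, so the "0" columns (regions, character) and the "1" columns (sales, numeric) land in separate output columns and keep their own types.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: column names are inferred from the pattern and output above
df <- data.frame(
  NAME = c("APPLE", "MICROSOFT"),
  ISIN = c("US0378331005", "US5949181045"),
  WC19600 = c("United States", "United States"),
  WC19601 = c(109197000, 83953000),
  WC19610 = c("Other Foreign", "Other countries"),
  WC19611 = c(125010000, 84135000)
)

res <- pivot_longer(
  df, -c(NAME, ISIN),
  names_pattern = "(.*)([0-9])$", names_to = c("segm", ".value")
) %>%
  rename(region = "0", sales = "1") %>%
  # optional clean-up: keep only the last character of segm
  mutate(segm = gsub(".*(.)$", "\\1", segm))
res
```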
Stack dataframe columns with two distinct suffix into two columns, preferably using tidyverse
We don't need both pivot_longer and pivot_wider. It can be done within pivot_longer itself by specifying the names_to and names_sep arguments.
library(dplyr)
library(tidyr)
test.tbl %>%
pivot_longer(cols = everything(), names_to = c('grp', '.value'),
names_sep = "_") %>%
select(-grp)
Output:
# A tibble: 40 x 2
# diff avg
# <dbl> <dbl>
# 1 -0.626 1.51
# 2 0.919 1.36
# 3 -0.165 0.398
# 4 2.40 0.476
# 5 0.184 0.390
# 6 0.782 -0.103
# 7 -0.253 -0.612
# 8 -0.0392 -0.710
# 9 -0.836 -0.621
#10 0.0746 0.388
# … with 30 more rows
complex(ish) pivot_longer in R
As deschen recommended, I played around with the regex a bit, and this pivot_longer call works as expected. It is a bit cleaner than manually creating the pivot_longer_spec, and it also works when there are unequal numbers of mem_was_there_x and new_number_yn_y columns (it just inserts missing values where applicable).
df1 %>%
pivot_longer(
cols = c(matches("^mem_was_there"), matches("^new_number_yn")),
names_to = c('.value', 'nr'),
names_pattern = "([A-Za-z_]+_)([0-9]*)")
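The missing-value behaviour can be checked with a small sketch. The data frame below is made up for illustration (the original `df1` is not shown), with two `mem_was_there_*` columns but only one `new_number_yn_*` column. It also uses a slightly different pattern, `"([A-Za-z_]+)_([0-9]+)"`, which is a variation on the one above: it keeps the trailing underscore out of the resulting column names.

```r
library(tidyr)

# Hypothetical data: two mem_was_there_* columns, only one new_number_yn_* column
df1 <- data.frame(
  id = 1:2,
  mem_was_there_1 = c("yes", "no"),
  mem_was_there_2 = c("no", "no"),
  new_number_yn_1 = c("y", "n")
)

res <- pivot_longer(
  df1,
  cols = c(matches("^mem_was_there"), matches("^new_number_yn")),
  names_to = c(".value", "nr"),
  # variation on the pattern above: the separator "_" sits outside the groups
  names_pattern = "([A-Za-z_]+)_([0-9]+)"
)

# new_number_yn is NA for nr == "2", because new_number_yn_2 does not exist
res
```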
How to use Pivot_longer to reshape from wide-type data to long-type data with multiple variables
You can try:
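# The lazy (.*?) plus the $ anchor let (\\d+) capture the whole trailing number,
# so a multi-digit grade such as 10 is not split into "1" and "0"
tidyr::pivot_longer(df, cols = -ID_IE,
names_to = c('.value', 'grade'),
names_pattern = '(.*?)(\\d+)$')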
# A tibble: 8 x 4
# ID_IE grade BLS_tchrG ELS_tchrG
# <dbl> <chr> <dbl> <dbl>
#1 2135 2 1 1
#2 2135 7 1 1
#3 2101 2 0 0
#4 2101 7 2 0
#5 2103 2 0 0
#6 2103 7 3 0
#7 2111 2 1 1
#8 2111 7 4 1
Data
Tested on this data:
df <- data.frame(ID_IE = c(2135, 2101, 2103, 2111), BLS_tchrG2 = c(1, 0, 0, 1),
BLS_tchrG7 = 1:4,
ELS_tchrG2 = c(1, 0, 0, 1), ELS_tchrG7 = c(1, 0, 0, 1))
Long to wide format using variable names
We can use pivot_longer %>% pivot_wider. A separate() step is not needed if we pass the appropriate arguments to pivot_longer.
library(tidyr)
dataset %>%
pivot_longer(cols = matches('time\\d+$'), names_to = c('sport', 'time'), names_pattern = '(.*)\\.(.*)') %>%
pivot_wider(names_from = sport, values_from = value)
# A tibble: 15 × 5
id time basketball volleyball vollyeball
<dbl> <chr> <dbl> <dbl> <dbl>
1 1 time1 2 2 NA
2 1 time2 3 3 NA
3 1 time3 1 NA 1
4 2 time1 5 3 NA
5 2 time2 4 4 NA
6 2 time3 8 NA 8
7 3 time1 4 4 NA
8 3 time2 5 3 NA
9 3 time3 4 NA 12
10 4 time1 3 0 NA
11 4 time2 3 1 NA
12 4 time3 3 NA 2
13 5 time1 3 1 NA
14 5 time2 2 3 NA
15 5 time3 1 NA 3
Achieving the equivalent of rbind using tidyr
We can use pivot_longer, specifying names_sep as "_" and names_to as a grouping name followed by ".value".
library(dplyr)
library(tidyr)
df %>%
pivot_longer(cols = starts_with('var'),
names_to = c('grp', '.value'), names_sep="_")
# group day grp mean sd
# <chr> <int> <chr> <dbl> <dbl>
# 1 a 1 var1 -0.63 5.49
# 2 a 1 var2 -0.62 5.82
# 3 a 2 var1 0.18 5.74
# 4 a 2 var2 -2.21 5.59
# 5 a 3 var1 -0.84 5.58
# 6 a 3 var2 1.12 5.92
# 7 b 1 var1 1.6 4.69
# 8 b 1 var2 -0.04 5.78
# 9 b 2 var1 0.33 6.51
#10 b 2 var2 -0.02 5.07
#11 b 3 var1 -0.82 5.39
#12 b 3 var2 0.94 3.01
We could remove the 'grp' column afterwards:
df %>%
pivot_longer(cols = starts_with('var'),
names_to = c('grp', '.value'), names_sep="_") %>%
select(-grp)