tidyr::pivot_wider() reorder column names grouping by `names_from`
As far as I know, pivot_wider() cannot do this itself, so the columns have to be reordered afterwards.
Here is a long-winded attempt, but it does the job:
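library(tidyverse)
# Gear values in order of first appearance; these become the column suffixes
suffixes <- unique(mtcars$gear)
pivoted <- mtcars %>%
  tidyr::pivot_wider(names_from = gear, values_from = c(vs, am, carb))
# Collect the pivoted column names one suffix at a time (anchored at the end of the name)
names_to_order <- map(suffixes, ~ names(pivoted)[grep(paste0("_", .x, "$"), names(pivoted))]) %>% unlist()
# Everything else is an identifier column and stays in front
names_id <- setdiff(names(pivoted), names_to_order)
pivoted %>%
  select(all_of(names_id), all_of(names_to_order))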
#> # A tibble: 32 x 16
#> mpg cyl disp hp drat wt qsec vs_4 am_4 carb_4 vs_3 am_3
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1 21 6 160 110 3.9 2.62 16.5 0 1 4 NA NA
#> 2 21 6 160 110 3.9 2.88 17.0 0 1 4 NA NA
#> 3 22.8 4 108 93 3.85 2.32 18.6 1 1 1 NA NA
#> 4 21.4 6 258 110 3.08 3.22 19.4 NA NA NA 1 0
#> 5 18.7 8 360 175 3.15 3.44 17.0 NA NA NA 0 0
#> 6 18.1 6 225 105 2.76 3.46 20.2 NA NA NA 1 0
#> 7 14.3 8 360 245 3.21 3.57 15.8 NA NA NA 0 0
#> 8 24.4 4 147. 62 3.69 3.19 20 1 0 2 NA NA
#> 9 22.8 4 141. 95 3.92 3.15 22.9 1 0 2 NA NA
#> 10 19.2 6 168. 123 3.92 3.44 18.3 1 0 4 NA NA
#> # ... with 22 more rows, and 4 more variables: carb_3 <dbl>, vs_5 <dbl>,
#> # am_5 <dbl>, carb_5 <dbl>
Created on 2020-02-25 by the reprex package (v0.3.0)
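If a newer tidyr is available, the manual reordering may not be needed. The following is a minimal sketch, assuming tidyr 1.2.0 or later (where pivot_wider() gained the names_vary argument) and reusing the mtcars example from above. With names_vary = "slowest", all columns belonging to one gear value are kept next to each other.

```r
library(tidyr)

# Assumes tidyr >= 1.2.0; names_vary is not available in older versions
wide <- pivot_wider(
  mtcars,
  names_from = gear,
  values_from = c(vs, am, carb),
  names_vary = "slowest"  # names_from values vary slowest, so columns group by gear
)

# Inspect the resulting column order
names(wide)
```

On older tidyr versions the select()-based approach above remains the fallback.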
R: Changing column names in pivot_wider() -- suffix to prefix
You can use the names_glue argument for this:
ex_wide <- ex_long %>%
  pivot_wider(names_from = Phase, values_from = c(3:6), names_glue = "{Phase}_{.value}")
The glue string puts the Phase value first and .value (the name of each values_from column) second, joined by a _ separator.
Result
library(dplyr)
library(tidyr)
ex_wide <- ex_long %>%
  pivot_wider(names_from = Phase, values_from = c(3:6), names_glue = "{Phase}_{.value}")
ex_wide
#> # A tibble: 3 x 13
#> ID P1_A P2_A P3_A P1_B P2_B P3_B P1_C P2_C P3_C P1_D P2_D P3_D
#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#> 1 A950 23.5 25.2 21.9 21.9 21.1 20.3 25.2 21.9 20.3 M M M
#> 2 A970 21.9 21.1 20.3 19.5 18.7 17.9 17.6 15.1 12.7 F F F
#> 3 A996 19.5 18.7 17.9 17.1 16.3 15.5 10.3 7.8 5.4 N N N
Data
ex_long <- structure(list(ID = c("A950", "A950", "A950", "A970", "A970",
"A970", "A996", "A996", "A996"), Phase = c("P1", "P2", "P3",
"P1", "P2", "P3", "P1", "P2", "P3"), A = c(23.5, 25.2, 21.9,
21.9, 21.1, 20.3, 19.5, 18.7, 17.9), B = c(21.9, 21.1, 20.3,
19.5, 18.7, 17.9, 17.1, 16.3, 15.5), C = c(25.2, 21.9, 20.3,
17.6, 15.1, 12.7, 10.3, 7.8, 5.4), D = c("M", "M", "M", "F",
"F", "F", "N", "N", "N")), class = "data.frame", row.names = c(NA,
-9L))
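As a smaller illustration of the same idea, here is a sketch on a made-up two-phase data frame (the `long` object and its values are hypothetical, not from the question). It names the value columns explicitly instead of by position, which tends to be less fragile than c(3:6) if columns are later added or moved.

```r
library(tidyr)

# Hypothetical toy data: two IDs, two phases, two measured columns
long <- data.frame(
  ID    = c("x", "x", "y", "y"),
  Phase = c("P1", "P2", "P1", "P2"),
  A     = 1:4,
  B     = 5:8
)

wide <- pivot_wider(
  long,
  names_from  = Phase,
  values_from = c(A, B),             # explicit names rather than positions
  names_glue  = "{Phase}_{.value}"   # prefix = Phase value, suffix = value column name
)

names(wide)
```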
How can I edit the iteration order in pivot_wider in R
We can use str_sort
library(dplyr)
library(stringr)
df_wide %>%
  select(LIWC_name, OCEAN, str_sort(names(.)[-(1:2)], numeric = TRUE))
Or another option is select-helpers
df_wide %>%
  select(LIWC_name, OCEAN, starts_with('version'), starts_with('sample'))
Output:
# A tibble: 2 x 8
# LIWC_name OCEAN version_group1 version_group2 version_group3 sample_group1 sample_group2 sample_group3
# <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 rc_WC O -0.02 -0.13 NA -0.12 0.001 -0.09
#2 rc_WC E 0.34 0.12 NA 0.04 0.08 0.33
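The two options do not necessarily produce the same order, which the following sketch tries to show. The `toy` tibble is a hypothetical stand-in for df_wide (its columns and values are made up): str_sort() sorts the names themselves, with numeric = TRUE placing group2 before group10, while starts_with() only groups by prefix and keeps the existing order inside each group.

```r
library(dplyr)
library(stringr)

# Hypothetical stand-in for df_wide, with deliberately shuffled columns
toy <- tibble(
  LIWC_name       = "rc_WC",
  OCEAN           = "O",
  sample_group2   = 1,
  version_group10 = 2,
  sample_group1   = 3,
  version_group2  = 4
)

# Option 1: natural sort of the non-id column names
by_sort <- toy %>%
  select(LIWC_name, OCEAN, all_of(str_sort(names(toy)[-(1:2)], numeric = TRUE)))

# Option 2: group by prefix; order within each prefix is left as it was
by_prefix <- toy %>%
  select(LIWC_name, OCEAN, starts_with("version"), starts_with("sample"))

names(by_sort)
names(by_prefix)
```

Which one fits depends on whether the prefix order or the sorted order matters more.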
pivot_wider when there's no names column (or when names column should be created)
We could create a new row-index column within each id and then use pivot_wider().
library(dplyr)
df %>%
  group_by(id) %>%
  mutate(row = row_number()) %>%
  select(-val) %>%
  tidyr::pivot_wider(names_from = row, values_from = date, names_prefix = 'event')
# id event1 event2 event3
# <int> <date> <date> <date>
#1 1 2009-01-03 2009-01-04 NA
#2 2 2009-01-06 2009-01-07 NA
#3 3 2009-01-10 2009-01-11 2009-01-12
Using data.table:
library(data.table)
dcast(setDT(df), id~rowid(id), value.var = 'date')
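Since the question's df is not shown here, the following sketch runs the same row-index idea on a made-up data frame (`toy` and its dates are hypothetical). Ids with fewer events than the maximum are padded with NA.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: id 1 has two events, id 2 has three
toy <- data.frame(
  id   = c(1L, 1L, 2L, 2L, 2L),
  date = as.Date(c("2009-01-03", "2009-01-04",
                   "2009-01-06", "2009-01-07", "2009-01-08"))
)

wide <- toy %>%
  group_by(id) %>%
  mutate(row = row_number()) %>%   # 1, 2, ... within each id; becomes the names column
  ungroup() %>%
  pivot_wider(names_from = row, values_from = date, names_prefix = "event")

wide
```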
Order of variable names pivot_wider
Edit: As of tidyr 1.1.0 the order of the variable names can be controlled with the names_glue argument:
us_rent_income %>%
  pivot_wider(
    names_from = NAME,
    values_from = c(estimate, moe),
    names_glue = "{NAME}_{.value}"
  )
Old answer:
The documentation for pivot_wider() states "If values_from contains multiple values, the value will be added to the front of the output column" so there doesn't seem to be any way to control this as part of the reshape. Instead, it has to be done afterwards.
Assuming there are no other variable names in the dataset that contain _ (if so, the separator can be changed to something unique using the names_sep argument), one approach would be:
library(tidyr)
df <- us_rent_income %>%
  pivot_wider(names_from = NAME,
              values_from = c(estimate, moe)) %>%
  setNames(nm = sub("(.*)_(.*)", "\\2_\\1", names(.)))
head(names(df))
[1] "GEOID" "variable" "Alabama_estimate" "Alaska_estimate" "Arizona_estimate" "Arkansas_estimate"
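To see what the regular expression in the old answer does on its own, here is a small base R sketch on a hand-written vector of names (the vector is illustrative, not the full us_rent_income output). Names without an underscore are left untouched, and because the first group is greedy, the split happens at the last underscore in a name.

```r
# Illustrative column names as pivot_wider() would produce them by default
old <- c("GEOID", "variable", "estimate_Alabama", "moe_New Mexico")

# Swap the part before the last "_" with the part after it
new <- sub("(.*)_(.*)", "\\2_\\1", old)

new
```

This is why the answer requires that no other variable names contain the separator: any such name would be swapped as well.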
In tidyr::pivot_wider, `values_fn = sum(.,na.rm=TRUE)` failed
values_fn expects a function rather than a call such as sum(., na.rm = TRUE), so wrap the call in a function:
library(tidyverse)
test_data <- data.frame(
category=c('A','A','A','B','B','B'),
sub_category=c('a','b','b','a','b','b'),
amount=1:6
)
test_data %>%
  pivot_wider(names_from = 'category',
              values_from = 'amount',
              values_fn = function(x) sum(x, na.rm = TRUE))
#> # A tibble: 2 x 3
#> sub_category A B
#> <chr> <int> <int>
#> 1 a 1 4
#> 2 b 5 11
The shorthand syntax for anonymous functions, \(x), available from R 4.1.0, works too:
test_data %>%
  pivot_wider(names_from = 'category',
              values_from = 'amount',
              values_fn = \(x) sum(x, na.rm = TRUE))
#> # A tibble: 2 x 3
#> sub_category A B
#> <chr> <int> <int>
#> 1 a 1 4
#> 2 b 5 11
Created on 2022-03-25 by the reprex package (v2.0.1)
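The test data above contains no missing values, so na.rm has no visible effect there. The sketch below uses a made-up data frame with an NA (`toy_na` is hypothetical) to show why the wrapper matters: the wrapped function drops the NA before summing.

```r
library(tidyr)

# Hypothetical data with one missing amount in category A
toy_na <- data.frame(
  category     = c("A", "A", "B", "B"),
  sub_category = c("a", "a", "a", "a"),
  amount       = c(1, NA, 3, 4)
)

wide <- pivot_wider(
  toy_na,
  names_from  = category,
  values_from = amount,
  values_fn   = function(x) sum(x, na.rm = TRUE)  # a function, not a call
)

wide
```

Passing plain `sum` would also be a valid function, but it would return NA for category A here.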
Selectively Applying pivot_wider() Function
How about this solution?
The columns whose names end with "_" need to be renamed first, and the Number column needs some clean-up afterwards. I'm sure this could be accomplished in a single line.
#rename columns that end with _
torename<-grep("(Emo._)$", names(df))
names(df)[torename] <- paste0(names(df)[torename], "Emo")
answer<- pivot_longer(df, cols= starts_with("Emo"), names_to=c( "Number", ".value"),
names_sep = "_", names_repair="unique")
#clean-up the Number column
answer$Number <- gsub("Emo", "", answer$Number)
answer
# A tibble: 8 × 7
PID Stage Keyword Number Emo Intense Desc
<chr> <chr> <chr> <chr> <chr> <int> <chr>
1 A-001 Beginning Bus 1 Fear 5 E
2 A-001 Beginning Bus 2 Content 1 A
3 A-002 End Ceiling 1 Sadness 6 F
4 A-002 End Ceiling 2 Depressed 2 B
5 A-003 Middle Chainsaw 1 Happy 7 G
6 A-003 Middle Chainsaw 2 Lost 3 C
7 A-004 Middle Floor 1 Anger 8 H
8 A-004 Middle Floor 2 Sad 4 D
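One way the clean-up step might be folded into the reshape is names_pattern, which extracts the number and the value name with a regular expression in one go. This is only a sketch on a made-up data frame: `toy` and its column names (Emo1_Emo, Emo1_Intense, ...) are assumptions about the shape of the data after the renaming step, not the question's actual input.

```r
library(tidyr)

# Hypothetical wide input, assumed to look like the data after renaming
toy <- data.frame(
  PID          = c("A-001", "A-002"),
  Emo1_Emo     = c("Fear", "Sadness"),
  Emo1_Intense = c(5L, 6L),
  Emo2_Emo     = c("Content", "Depressed"),
  Emo2_Intense = c(1L, 2L)
)

long <- pivot_longer(
  toy,
  cols          = starts_with("Emo"),
  names_to      = c("Number", ".value"),
  names_pattern = "Emo(\\d+)_(.*)"  # group 1 -> Number, group 2 -> output column name
)

long
```

Number is still a character column here; as.integer() can convert it if needed.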
How best to use R to reshape dataframe from long to wide and combine values
library(tidyverse)
df %>%
  group_by(ID, Date) %>%
  summarize(Procedure = paste0(Procedure, collapse = ", ")) %>%
  mutate(col = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = col, values_from = c(Date, Procedure))
This currently requires some reordering afterwards, which could be done like in this answer: https://stackoverflow.com/a/60400134/6851825
# A tibble: 4 x 7
ID Date_1 Date_2 Date_3 Procedure_1 Procedure_2 Procedure_3
<chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 A66 2/2/01 NA NA Sedation, Excision NA NA
2 D55 1/1/01 NA NA Sedation, Excision, Biopsy NA NA
3 G88 5/5/01 6/6/01 7/7/01 Sedation, Biopsy Sedation, Excision Sedation, Re-excision
4 T44 3/3/01 4/4/01 NA Sedation, Biopsy Sedation, Excision NA
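Because the question's df is not reproduced here, the following sketch runs the same pipeline on a made-up data frame (`toy` and its values are hypothetical) so the intermediate logic can be checked: procedures are collapsed per ID and Date, then each ID's dates are numbered and spread into columns. The `.groups = "drop_last"` argument is an addition that states the default grouping behaviour explicitly.

```r
library(dplyr)
library(tidyr)

# Hypothetical long data: ID "A" has one date, ID "B" has two
toy <- data.frame(
  ID        = c("A", "A", "B", "B", "B"),
  Date      = c("1/1/01", "1/1/01", "2/2/01", "2/2/01", "3/3/01"),
  Procedure = c("Sedation", "Excision", "Sedation", "Biopsy", "Excision")
)

wide <- toy %>%
  group_by(ID, Date) %>%
  summarize(Procedure = paste0(Procedure, collapse = ", "),
            .groups = "drop_last") %>%   # still grouped by ID afterwards
  mutate(col = row_number()) %>%         # 1, 2, ... per ID, one per date
  ungroup() %>%
  pivot_wider(names_from = col, values_from = c(Date, Procedure))

wide
```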