tidyr::pivot_wider() reorder column names grouping by `names_from`

As far as I know, at the time of writing this can't be accomplished within `pivot_wider()` itself and must be done afterwards.

Here is a long-winded attempt, but it does the job:

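library(tidyverse)

# Suffixes, in the order the column groups should appear
suffixes <- unique(mtcars$gear)

pivoted <- mtcars %>%
  tidyr::pivot_wider(names_from = gear, values_from = c(vs, am, carb))

# Collect the pivoted columns suffix by suffix ("$" anchors the match at the end of the name)
names_to_order <- map(suffixes, ~ names(pivoted)[grep(paste0("_", .x, "$"), names(pivoted))]) %>%
  unlist()
names_id <- setdiff(names(pivoted), names_to_order)

# all_of() avoids the ambiguity warning for external character vectors
pivoted %>%
  select(all_of(c(names_id, names_to_order)))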
#> # A tibble: 32 x 16
#>      mpg   cyl  disp    hp  drat    wt  qsec  vs_4  am_4 carb_4  vs_3  am_3
#>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl> <dbl> <dbl>
#>  1  21       6  160    110  3.9   2.62  16.5     0     1      4    NA    NA
#>  2  21       6  160    110  3.9   2.88  17.0     0     1      4    NA    NA
#>  3  22.8     4  108     93  3.85  2.32  18.6     1     1      1    NA    NA
#>  4  21.4     6  258    110  3.08  3.22  19.4    NA    NA     NA     1     0
#>  5  18.7     8  360    175  3.15  3.44  17.0    NA    NA     NA     0     0
#>  6  18.1     6  225    105  2.76  3.46  20.2    NA    NA     NA     1     0
#>  7  14.3     8  360    245  3.21  3.57  15.8    NA    NA     NA     0     0
#>  8  24.4     4  147.    62  3.69  3.19  20       1     0      2    NA    NA
#>  9  22.8     4  141.    95  3.92  3.15  22.9     1     0      2    NA    NA
#> 10  19.2     6  168.   123  3.92  3.44  18.3     1     0      4    NA    NA
#> # ... with 22 more rows, and 4 more variables: carb_3 <dbl>, vs_5 <dbl>,
#> #   am_5 <dbl>, carb_5 <dbl>

Created on 2020-02-25 by the reprex package (v0.3.0)
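
For readers on tidyr 1.2.0 or later, `pivot_wider()` also has a `names_vary` argument that may make the manual reordering unnecessary. The following is a minimal sketch on the same `mtcars` example, assuming such a version is installed; `pivoted_grouped` is just an illustrative name.

```r
library(tidyr)

# names_vary = "slowest" keeps all value columns for one `gear` value together
# (assumes tidyr >= 1.2.0)
pivoted_grouped <- pivot_wider(
  mtcars,
  names_from = gear,
  values_from = c(vs, am, carb),
  names_vary = "slowest"
)

names(pivoted_grouped)
```

With the default, `names_vary = "fastest"`, the columns are instead grouped by value column (`vs_*`, then `am_*`, then `carb_*`).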

R: Changing column names in pivot_wider() -- suffix to prefix

You can use the `names_glue` argument for this:

ex_wide <- ex_long %>%
  pivot_wider(names_from = Phase, values_from = c(3:6), names_glue = "{Phase}_{.value}")

The glue template puts each `Phase` value first and `.value` (the name of the column the values came from) second, joined by `_`.

Result

library(dplyr)
library(tidyr)

ex_wide <- ex_long %>%
  pivot_wider(names_from = Phase, values_from = c(3:6), names_glue = "{Phase}_{.value}")

ex_wide
#> # A tibble: 3 x 13
#>   ID     P1_A  P2_A  P3_A  P1_B  P2_B  P3_B  P1_C  P2_C  P3_C P1_D  P2_D  P3_D
#>   <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <chr> <chr>
#> 1 A950   23.5  25.2  21.9  21.9  21.1  20.3  25.2  21.9  20.3 M     M     M
#> 2 A970   21.9  21.1  20.3  19.5  18.7  17.9  17.6  15.1  12.7 F     F     F
#> 3 A996   19.5  18.7  17.9  17.1  16.3  15.5  10.3   7.8   5.4 N     N     N

Data

ex_long <- structure(list(ID = c("A950", "A950", "A950", "A970", "A970", 
"A970", "A996", "A996", "A996"), Phase = c("P1", "P2", "P3",
"P1", "P2", "P3", "P1", "P2", "P3"), A = c(23.5, 25.2, 21.9,
21.9, 21.1, 20.3, 19.5, 18.7, 17.9), B = c(21.9, 21.1, 20.3,
19.5, 18.7, 17.9, 17.1, 16.3, 15.5), C = c(25.2, 21.9, 20.3,
17.6, 15.1, 12.7, 10.3, 7.8, 5.4), D = c("M", "M", "M", "F",
"F", "F", "N", "N", "N")), class = "data.frame", row.names = c(NA,
-9L))
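
To see the effect of the template in isolation, here is a small sketch comparing the default names with the glued ones. The `scores` data frame and its columns are made up for illustration and are not from the question.

```r
library(tidyr)

# Small made-up dataset (not from the question)
scores <- data.frame(
  id    = c(1, 1, 2, 2),
  phase = c("P1", "P2", "P1", "P2"),
  a     = 1:4,
  b     = 5:8
)

# Default naming: value column first, then the names_from level
default_names <- names(pivot_wider(scores, names_from = phase, values_from = c(a, b)))

# names_glue swaps the two parts
glued_names <- names(pivot_wider(
  scores,
  names_from = phase,
  values_from = c(a, b),
  names_glue = "{phase}_{.value}"
))

default_names
glued_names
```

Only the names change; the column order (all `a` columns, then all `b` columns) stays the same.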

How can I edit the iteration order in pivot_wider in R

We can use `str_sort()`:

library(dplyr)
library(stringr)
df_wide %>%
  select(LIWC_name, OCEAN, str_sort(names(.)[-(1:2)], numeric = TRUE))

Another option is to use select helpers:

df_wide %>%
  select(LIWC_name, OCEAN, starts_with('version'), starts_with('sample'))

Output:

# A tibble: 2 x 8
#  LIWC_name OCEAN version_group1 version_group2 version_group3 sample_group1 sample_group2 sample_group3
#  <chr>     <chr>          <dbl>          <dbl>          <dbl>         <dbl>         <dbl>         <dbl>
#1 rc_WC     O              -0.02          -0.13             NA         -0.12         0.001         -0.09
#2 rc_WC     E               0.34           0.12             NA          0.04          0.08          0.33
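
As a rough illustration of what `numeric = TRUE` changes, the sketch below sorts a few made-up column names (they are assumptions, chosen so that one group number has two digits).

```r
library(stringr)

# Made-up column names with a two-digit group number
cols <- c("version_group10", "version_group2", "sample_group1", "version_group1")

# Natural sort: embedded numbers are compared as numbers, so group2 comes before group10
natural <- str_sort(cols, numeric = TRUE)
natural
```

Note that the prefixes are still sorted alphabetically, so `sample_*` lands before `version_*`. If a specific prefix order is wanted, the select-helper approach gives more control.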

pivot_wider when there's no names column (or when names column should be created)

We could create a row-number column within each `id` and then use `pivot_wider()`.

library(dplyr)
df %>%
  group_by(id) %>%
  mutate(row = row_number()) %>%
  select(-val) %>%
  tidyr::pivot_wider(names_from = row, values_from = date, names_prefix = 'event')

#     id event1     event2     event3
#  <int> <date>     <date>     <date>
#1     1 2009-01-03 2009-01-04 NA
#2     2 2009-01-06 2009-01-07 NA
#3     3 2009-01-10 2009-01-11 2009-01-12

Using data.table:

library(data.table)
dcast(setDT(df), id~rowid(id), value.var = 'date')
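
Since the question's data is not shown here, the following self-contained sketch rebuilds an input of the same shape so the dplyr/tidyr version can be run end to end. The `val` values are invented, and the dates are taken from the output above.

```r
library(dplyr)
library(tidyr)

# Made-up input shaped like the one in the question: several dated events per id
df <- data.frame(
  id   = c(1L, 1L, 2L, 2L, 3L, 3L, 3L),
  date = as.Date(c("2009-01-03", "2009-01-04", "2009-01-06", "2009-01-07",
                   "2009-01-10", "2009-01-11", "2009-01-12")),
  val  = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)
)

wide <- df %>%
  group_by(id) %>%
  mutate(row = row_number()) %>%   # 1, 2, ... within each id
  ungroup() %>%
  select(-val) %>%                 # val would otherwise act as an id column
  pivot_wider(names_from = row, values_from = date, names_prefix = "event")

wide
```

Dropping `val` matters: any column not named in `names_from` or `values_from` is treated as an identifier, which would keep the rows from collapsing to one per `id`.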

Order of variable names pivot_wider

Edit: As of tidyr 1.1.0 the order of the variable names can be controlled with the names_glue argument:

us_rent_income %>%
  pivot_wider(
    names_from = NAME,
    values_from = c(estimate, moe),
    names_glue = "{NAME}_{.value}"
  )

Old answer:

The documentation for pivot_wider() states "If values_from contains multiple values, the value will be added to the front of the output column" so there doesn't seem to be any way to control this as part of the reshape. Instead, it has to be done afterwards.

Assuming there are no other variable names in the dataset that contain _ (if so, the separator can be changed to something unique using the names_sep argument), one approach would be:

library(tidyr)

df <- us_rent_income %>%
  pivot_wider(names_from = NAME,
              values_from = c(estimate, moe)) %>%
  setNames(nm = sub("(.*)_(.*)", "\\2_\\1", names(.)))

head(names(df))

[1] "GEOID" "variable" "Alabama_estimate" "Alaska_estimate" "Arizona_estimate" "Arkansas_estimate"
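
To make the regex swap and its underscore caveat concrete, here is a small base R sketch on a made-up vector of names (the last entry is a hypothetical name containing an extra `_`).

```r
# Base R only; the names below are made up for illustration
old_names <- c("GEOID", "estimate_Alabama", "moe_New Hampshire", "estimate_a_b")

# Swap the part before the last "_" with the part after it
new_names <- sub("(.*)_(.*)", "\\2_\\1", old_names)
new_names
```

Names without `_` are left alone, but because the first `(.*)` is greedy, a name with more than one `_` is split at the last one (`"estimate_a_b"` becomes `"b_estimate_a"`), which is why a unique `names_sep` is safer in that case.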

In tidyr::pivot_wider, `values_fn = sum(.,na.rm=TRUE)` failed

`values_fn` expects a function rather than a call, so wrap the call in an anonymous function:

library(tidyverse)
test_data <- data.frame(
  category = c('A','A','A','B','B','B'),
  sub_category = c('a','b','b','a','b','b'),
  amount = 1:6
)

test_data %>%
  pivot_wider(names_from = 'category',
              values_from = 'amount',
              values_fn = function(x) sum(x, na.rm = TRUE))

#> # A tibble: 2 x 3
#>   sub_category     A     B
#>   <chr>        <int> <int>
#> 1 a                1     4
#> 2 b                5    11

The lambda shorthand `\(x)`, available since R 4.1.0, works too:

test_data %>%
  pivot_wider(names_from = 'category',
              values_from = 'amount',
              values_fn = \(x) sum(x, na.rm = TRUE))

#> # A tibble: 2 x 3
#>   sub_category     A     B
#>   <chr>        <int> <int>
#> 1 a                1     4
#> 2 b                5    11

Created on 2022-03-25 by the reprex package (v2.0.1)
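
The example data above has no missing values, so `na.rm = TRUE` never comes into play. The sketch below uses a made-up variant with `NA`s and a named helper (`sum_na_rm` is a hypothetical name) to show the function being passed as an object rather than called.

```r
library(tidyr)

# Made-up data with missing amounts
test_na <- data.frame(
  category     = c("A", "A", "A", "B", "B", "B"),
  sub_category = c("a", "b", "b", "a", "b", "b"),
  amount       = c(1, NA, 3, 4, 5, NA)
)

# A named helper; note that it is passed below without parentheses
sum_na_rm <- function(x) sum(x, na.rm = TRUE)

wide <- pivot_wider(
  test_na,
  names_from = category,
  values_from = amount,
  values_fn = sum_na_rm
)

wide
```

Writing `values_fn = sum(., na.rm = TRUE)` instead would ask R to evaluate the call immediately, before `pivot_wider()` has any values to pass in.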

Selectively Applying pivot_wider() Function

How about this solution?

The columns whose names end with "_" need to be renamed first, and the Number column needs some cleanup afterwards. I'm sure this could be accomplished in a single line.

library(tidyr)

# Rename columns that end with "_" by appending "Emo"
torename <- grep("(Emo._)$", names(df))
names(df)[torename] <- paste0(names(df)[torename], "Emo")

answer <- pivot_longer(df, cols = starts_with("Emo"),
                       names_to = c("Number", ".value"),
                       names_sep = "_", names_repair = "unique")

# Clean up the Number column
answer$Number <- gsub("Emo", "", answer$Number)

answer
# A tibble: 8 × 7
  PID   Stage     Keyword  Number Emo       Intense Desc
  <chr> <chr>     <chr>    <chr>  <chr>       <int> <chr>
1 A-001 Beginning Bus      1      Fear            5 E
2 A-001 Beginning Bus      2      Content         1 A
3 A-002 End       Ceiling  1      Sadness         6 F
4 A-002 End       Ceiling  2      Depressed       2 B
5 A-003 Middle    Chainsaw 1      Happy           7 G
6 A-003 Middle    Chainsaw 2      Lost            3 C
7 A-004 Middle    Floor    1      Anger           8 H
8 A-004 Middle    Floor    2      Sad             4 D
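
One way to skip the separate cleanup of the Number column might be `names_pattern`, which extracts the number and the field name with a regex in the same call. This is only a sketch on made-up data (`df_emo` is hypothetical), and it assumes the columns have already been renamed to the form `Emo<number>_<field>`.

```r
library(tidyr)

# Made-up input; assumes the columns are already named Emo<number>_<field>
df_emo <- data.frame(
  PID          = c("A-001", "A-002"),
  Emo1_Emo     = c("Fear", "Sadness"),
  Emo1_Intense = c(5L, 6L),
  Emo2_Emo     = c("Content", "Depressed"),
  Emo2_Intense = c(1L, 2L)
)

long <- pivot_longer(
  df_emo,
  cols = starts_with("Emo"),
  names_to = c("Number", ".value"),
  names_pattern = "Emo(\\d+)_(.*)"   # first group -> Number, second -> output column
)

long
```

The first capture group becomes the `Number` column directly, so no `gsub()` step is needed afterwards.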

How best to use R to reshape dataframe from long to wide and combine values

library(tidyverse)
df %>%
  group_by(ID, Date) %>%
  # Combine all procedures on the same date; keep the grouping by ID for the next step
  summarize(Procedure = paste0(Procedure, collapse = ", "), .groups = "drop_last") %>%
  mutate(col = row_number()) %>%
  ungroup() %>%
  pivot_wider(names_from = col, values_from = c(Date, Procedure))

This currently requires some reordering afterwards, which could be done like in this answer: https://stackoverflow.com/a/60400134/6851825

# A tibble: 4 x 7
  ID    Date_1 Date_2 Date_3 Procedure_1                Procedure_2        Procedure_3
  <chr> <chr>  <chr>  <chr>  <chr>                      <chr>              <chr>
1 A66   2/2/01 NA     NA     Sedation, Excision         NA                 NA
2 D55   1/1/01 NA     NA     Sedation, Excision, Biopsy NA                 NA
3 G88   5/5/01 6/6/01 7/7/01 Sedation, Biopsy           Sedation, Excision Sedation, Re-excision
4 T44   3/3/01 4/4/01 NA     Sedation, Biopsy           Sedation, Excision NA
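
If tidyr 1.2.0 or later is available, the reordering could probably be folded into the pivot itself with `names_vary = "slowest"`, which interleaves each `Date_n` with its `Procedure_n`. The sketch below runs the whole pipeline on a made-up subset of long data (`procedures` is a hypothetical input reconstructed from two of the IDs in the output above).

```r
library(dplyr)
library(tidyr)

# Made-up long data in the shape implied by the output above
procedures <- data.frame(
  ID        = c("A66", "A66", "T44", "T44", "T44", "T44"),
  Date      = c("2/2/01", "2/2/01", "3/3/01", "3/3/01", "4/4/01", "4/4/01"),
  Procedure = c("Sedation", "Excision", "Sedation", "Biopsy", "Sedation", "Excision")
)

wide <- procedures %>%
  group_by(ID, Date) %>%
  summarize(Procedure = paste0(Procedure, collapse = ", "), .groups = "drop_last") %>%
  mutate(col = row_number()) %>%   # visit number within each ID
  ungroup() %>%
  pivot_wider(
    names_from = col,
    values_from = c(Date, Procedure),
    names_vary = "slowest"         # assumes tidyr >= 1.2.0
  )

names(wide)
```

Because `Date` is a character column here, `summarize()` orders the visits as text; real dates would be better parsed with `as.Date()` first.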

