Pivot Wider Produces Nested Object

Why does pivot_wider produce nested objects/lists and NULLs?

Well, unchop(everything()) did a great job to me to resolve this problem. I will not delete my question as it may help some other users in future.

test_df %>%
pivot_wider(names_from = pair, values_from = delta) %>%
unchop(everything())

unchop() makes df longer by expanding list-columns so that each
element of the list-column gets its own row in the output.

Source: https://tidyr.tidyverse.org/reference/chop.html

How can I unlist the result of pivot wider in R

We can use unnest from tidyr

library(dplyr)
library(tidyr)
data_bad %>%
unnest

Show duplicate value on a separate row in pivot wider

You can use -

library(dplyr)
library(tidyr)

df %>%
pivot_wider(names_from=id,values_from=value, values_fn = list) %>%
unnest(cols = c(CP1, CP2))

# identifier label CP1 CP2
# <chr> <chr> <int> <int>
#1 e1 Monaco 0 1
#2 e1 became 0 0
#3 e2 the 1 1
#4 e2 first 0 0
#5 e1 the 10 NA
#6 e1 the 1 NA

You were close with your attempt as well, you had to include id in group_by -

df %>%
group_by(identifier,label, id) %>%
mutate(rn=row_number()) %>%
pivot_wider(names_from=id,values_from=value)

pivot wider while renaming

I see two steps that might be handled before transforming the table into a wide format.

  1. The year column has to be created by the number inside the name column
  2. In the name column the numbers has to be removed.

Code

df  %>% 
mutate(year = 2020 - replace_na(as.numeric(str_extract(name, "\\d")), 0),
name = str_remove(name, "\\d")) %>%
pivot_wider(id_cols = year,
names_from = name,
values_from = value) %>%
mutate(ID = row_number())

# A tibble: 5 x 4
year PROD_N PROD_N_NATIONAL ID
<dbl> <dbl> <dbl> <int>
1 2020 NA 0.789 1
2 2019 0.5 0.2 2
3 2018 0.4 0.3 3
4 2017 NA 0.221 4
5 2016 0.2 0.39 5

Data

df <- structure(list(id = 1:10, name = structure(1:10, .Label = c("PROD_N", 
"PROD_N_NATIONAL", "PROD_N1", "PROD_N1_NATIONAL", "PROD_N2",
"PROD_N2_NATIONAL", "PROD_N3", "PROD_N3_NATIONAL", "PROD_N4",
"PROD_N4_NATIONAL"), class = "factor"), value = c(NA, 0.789,
0.5, 0.2, 0.4, 0.3, NA, 0.221, 0.2, 0.39)), class = "data.frame", row.names = c(NA,
-10L))

Plot

Plotting two y variables in ggplot depends on what kind of plot you need.

If you need two different geom_x you can do something like this:

ggplot(df_wide) +
geom_line(aes(y = PROD_N, x = year)) +
geom_point(aes(y = PROD_N_NATIONAL, x = year))

If you want the same geom_x but color the names differently, than you might need the long format:

df  %>% 
mutate(year = 2020 - replace_na(as.numeric(str_extract(name, "\\d")), 0),
name = str_remove(name, "\\d")) %>%
ggplot() +
geom_line(aes(y = value, x = year, color = name))

Pivot wider with merging rows in R

Two-step:

tidyr::pivot_longer(dat, -c(id, group)) %>%
tidyr::pivot_wider(id, names_from=c("group", "name"),
names_glue="{name}_{group}", values_from="value")
# # A tibble: 4 x 9
# id A_2 B_2 C_2 D_2 A_3 B_3 C_3 D_3
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 id1 0 81 0.01 81 115 66 0.05 181
# 2 id2 0 58 0.01 58 0 110 0.01 110
# 3 id3 95 81 0.03 176 NA NA NA NA
# 4 id4 40 126 0.02 166 NA NA NA NA

though I suspect there's a way to do it with names_glue.

Or a one-step:

pivot_wider(dat, names_from = "group", values_from = A:D, values_fill = 0)
# # A tibble: 4 x 9
# id A_2 A_3 B_2 B_3 C_2 C_3 D_2 D_3
# <chr> <int> <int> <int> <int> <dbl> <dbl> <int> <int>
# 1 id1 0 115 81 66 0.01 0.05 81 181
# 2 id2 0 0 58 110 0.01 0.01 58 110
# 3 id3 95 0 81 0 0.03 0 176 0
# 4 id4 40 0 126 0 0.02 0 166 0

Neither are immediately filling out missing letter/number combinations.

Pivot all columns wider (except ID columns) using pivot_wider() in R/tidyverse

As we know the first 3 columns, should be fixed, use - on those column names in values_from

library(dplyr)
library(tidyr)
data_long %>%
pivot_wider(names_from = rater, values_from = -names(.)[1:3])

Or if we already create an object

id_cols <- c("ID", "time")
data_long %>%
pivot_wider(names_from = rater, values_from = -all_of(id_cols))

pivot wider while keeping the order

Here is one solution:

library(tidyverse)

order_id <- c(2,2,2,3,3,3)
product_name <- c("banana","Garlic Powder","carrots","milk","banana","Coconut Butter")

df <- data.frame(order_id,product_name) %>% group_by(order_id) %>%
mutate(id = paste0("item", row_number())) %>%
pivot_wider(id_cols = order_id, values_from = product_name, names_from = id)

How can I use purrr to pivot a nested dataframe?

We can use purrr::map() within dplyr::mutate():

library(tidyverse)

df <- tibble(
id_01 = c(rep("01", 3), rep("02", 3)),
a = (c("a", "a", "b", "c", "c", "d")),
b = letters[7:12],
id_02 = rep(c(1, 2, 1), 2)
)

nested_df <- df %>%
nest(data = -id_01) %>%
mutate(data = map(data, ~ .x %>%
pivot_wider(
id_cols = a,
names_from = id_02,
values_from = b
)))

nested_df
#> # A tibble: 2 x 2
#> id_01 data
#> <chr> <list>
#> 1 01 <tibble [2 x 3]>
#> 2 02 <tibble [2 x 3]>

nested_df %>%
unnest(data)
#> # A tibble: 4 x 4
#> id_01 a `1` `2`
#> <chr> <chr> <chr> <chr>
#> 1 01 a g h
#> 2 01 b i <NA>
#> 3 02 c j k
#> 4 02 d l <NA>

Created on 2021-03-26 by the reprex package (v1.0.0)

Is pivot_longer() and pivot_wider() transitive?

The short answer is no, there's nothing within pivot_* to address this. Might be a good idea to add something in the future to pivot_longer to generate unique rowIDs from within.

As you said, the workaround is to add such row IDs yourself, using tibble's rowid_to_column:

iris %>% rowid_to_column() %>% 
pivot_longer(-c(Species,rowid), names_to = "Measure", values_to = "Value") %>%
pivot_wider(names_from = "Measure", values_from = "Value")

this will bring you back to iris.



Related Topics



Leave a reply



Submit