Unnest a List Column Directly into Several Columns

With tidyr 1.0.0 you can use unnest_wider():

library(tidyr)
df1 <- tibble(
  gr = c('a', 'b', 'c'),
  values = list(1:2, 3:4, 5:6)
)

unnest_wider(df1, values)
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> # A tibble: 3 x 3
#> gr ...1 ...2
#> <chr> <int> <int>
#> 1 a 1 2
#> 2 b 3 4
#> 3 c 5 6

Created on 2019-09-14 by the reprex package (v0.3.0)

The output is verbose here because the elements that were unnested horizontally (the vector elements) were not named, and unnest_wider doesn't want to guess silently.

We can name them beforehand to avoid the messages:

df1 %>%
  dplyr::mutate(values = purrr::map(values, setNames, c("V1","V2"))) %>%
  unnest_wider(values)
#> # A tibble: 3 x 3
#> gr V1 V2
#> <chr> <int> <int>
#> 1 a 1 2
#> 2 b 3 4
#> 3 c 5 6

Or just wrap the call in suppressMessages() or purrr::quietly().
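
Another way to get tidy names without renaming each element is the names_sep argument of unnest_wider(). The sketch below reuses df1 from above; the exact column names it produces depend on the tidyr version, and to my understanding more recent tidyr releases raise an error for unnamed elements unless names_sep is supplied, so treat this as an illustration rather than the only option.

```r
library(tidyr)

df1 <- tibble(
  gr = c("a", "b", "c"),
  values = list(1:2, 3:4, 5:6)
)

# names_sep tells tidyr to build the new column names from the outer
# column name ("values") plus a generated suffix for each element.
wide <- unnest_wider(df1, values, names_sep = "_")
wide
```

With named elements (as in the setNames() example) names_sep is optional.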

Unlist/unnest list column into several columns

It is easier to do this with rowSums, i.e. divide 'product1' by the rowSums of the columns whose names start with the keyword 'product'. Unlike rowwise with c_across, this is vectorized and should be fast as well.

library(dplyr)
dat %>%
mutate(sum_product = product1/rowSums(select(., starts_with('product'))))

NOTE: The code in the question mixes base R (apply) with the tidyverse across(), which doesn't seem to be the optimal way.


If we need to do this for all the 'product' columns, create a sum column first with mutate and then use across on the columns that start with 'product' to divide each column by 'Sum_col':

dat %>%
  mutate(Sum_col = rowSums(select(., starts_with('product'))),
         across(starts_with('product'),
                ~ ./Sum_col, .names = '{.col}_sum_product')) %>%
  select(-Sum_col)

-output

#ysRespNum  product1 product2 product3 product1_sum_product product2_sum_product product3_sum_product
#1 1 23.766555 13.46907 24.32327 0.3860783 0.2187998 0.3951219
#2 2 30.071773 15.98740 11.39922 0.5233660 0.2782431 0.1983909
#3 3 18.224328 11.03880 20.67063 0.3649701 0.2210688 0.4139610
#4 4 30.140839 19.78984 19.62087 0.4333597 0.2845348 0.2821054
#5 5 8.915628 30.75021 24.29150 0.1393996 0.4807925 0.3798079
#6 6 23.791981 11.14885 21.72450 0.4198684 0.1967490 0.3833826

Or using base R

nm1 <- startsWith(names(dat), 'product')
# one new column per 'product' column; seq_along(nm1) would count every column of dat
dat[paste0('sum_product', seq_len(sum(nm1)))] <- dat[nm1]/rowSums(dat[nm1])
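
For a quick check of the row-share idea, here is a minimal base R sketch on made-up data. The data frame dat, its id column and the product values are hypothetical, since the original data is not shown in full.

```r
# Hypothetical data: one id column and three 'product' columns
dat <- data.frame(
  id = 1:3,
  product1 = c(2, 1, 5),
  product2 = c(2, 3, 0),
  product3 = c(4, 4, 5)
)

is_product <- startsWith(names(dat), "product")

# Dividing a data frame by a vector of length nrow(dat) divides
# every column element-wise, so each row is scaled by its own sum.
shares <- dat[is_product] / rowSums(dat[is_product])
names(shares) <- paste0(names(shares), "_sum_product")

dat <- cbind(dat, shares)
dat
```

Each row of the new columns should sum to 1, which is an easy sanity check for this kind of calculation.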

How to unnest a single column from a nested list without unnesting all

You can use map() to grab out that column and then unnest that.

nested %>% 
mutate(qsec = map(extra, "qsec")) %>%
unnest_longer(qsec)

Note, though, that this might create problems depending on what you do next, because the unnested values can now duplicate what is still stored in the nested data. It might be safer to unnest everything and nest again.

# A tibble: 32 x 6
mpg cyl disp hp extra qsec
<dbl> <dbl> <dbl> <dbl> <list> <dbl>
1 21 6 160 110 <tibble [2 x 7]> 16.5
2 21 6 160 110 <tibble [2 x 7]> 17.0
3 22.8 4 108 93 <tibble [1 x 7]> 18.6
4 21.4 6 258 110 <tibble [1 x 7]> 19.4
5 18.7 8 360 175 <tibble [1 x 7]> 17.0
6 18.1 6 225 105 <tibble [1 x 7]> 20.2
7 14.3 8 360 245 <tibble [1 x 7]> 15.8
8 24.4 4 147. 62 <tibble [1 x 7]> 20
9 22.8 4 141. 95 <tibble [1 x 7]> 22.9
10 19.2 6 168. 123 <tibble [1 x 7]> 18.3
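
A self-contained sketch of the same idea follows. How nested was built is not shown in the answer, so the nest() call here is an assumption reconstructed from the printed output (mtcars grouped by mpg, cyl, disp and hp, with the other seven columns packed into extra).

```r
library(dplyr)
library(tidyr)
library(purrr)

# Assumed reconstruction of `nested`: everything except four key
# columns is packed into the list column `extra`.
nested <- as_tibble(mtcars) %>%
  nest(extra = -c(mpg, cyl, disp, hp))

with_qsec <- nested %>%
  mutate(qsec = map(extra, "qsec")) %>%  # pull one column out of each nested tibble
  unnest_longer(qsec)                    # one row per extracted value

with_qsec
```

The extra column still contains qsec, so the same values now live in two places, which is the duplication the caution above refers to.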

Unnest one column list to many columns in tidyr

With dplyr and purrr

df %>% 
mutate(ctn = map(ctn, as_tibble)) %>%
unnest(ctn)  # name the column; a bare unnest() is deprecated since tidyr 1.0.0
# A tibble: 2 x 3
id a b
<int> <chr> <dbl>
1 1 x 1
2 2 y 2
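
If the list column holds named lists, unnest_wider() may be a shorter route. The df below is a hypothetical reconstruction that matches the printed output; the original data is not shown.

```r
library(tidyr)

# Hypothetical input: each element of `ctn` is a named list
df <- tibble(
  id = 1:2,
  ctn = list(list(a = "x", b = 1), list(a = "y", b = 2))
)

# Each name inside `ctn` becomes its own column
wide <- unnest_wider(df, ctn)
wide
```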

Unnest multiple columns

We may use where() to select the list columns:

library(dplyr)
library(tidyr)
library(purrr)
df %>%
mutate(mx = invoke(pmax, across(where(is.list), lengths))) %>%
mutate(across(where(is.list), ~ map2(.x, mx, ~ {
length(.x) <- .y
if(cur_column() == "left") .x <- .x[order(!is.na(.x))]
.x})), mx = NULL) %>%
unnest(where(is.list))
# A tibble: 8 × 4
go left node right
<chr> <chr> <chr> <chr>
1 go after it <NA> go after
2 here we go we go <NA>
3 he went bust he went bust
4 go get it go <NA> go get
5 go get it go it go <NA>
6 i 'm gon na go 'm gon na go
7 i 'm gon na go na go <NA>
8 she 's going berserk 's going berserk

Update

Based on the comments from the OP, the previous solution works:

df %>%
unnest(where(is.list))

If there are NULL elements, specify keep_empty = TRUE (in the OP's data, some of the elements were blank ("") instead of NULL, so the previous one should work as well):

df %>%
unnest(where(is.list), keep_empty = TRUE)
# A tibble: 8 × 4
go left node right
<chr> <chr> <chr> <chr>
1 go after it "" go "after"
2 here we go "we" go ""
3 he went bust "he" went "bust"
4 go get it go "" go "get"
5 go get it go "it" go ""
6 i 'm gonna go "'m" gonna "go"
7 i 'm gonna go "gonna" go ""
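
To see what keep_empty changes, here is a small sketch on made-up data where one row holds NULL in both list columns. The column names mirror the example above, but the values are hypothetical.

```r
library(tidyr)

# Hypothetical data: the second row has NULL in both list columns
df <- tibble(
  go = c("first", "second"),
  left = list(c("x", "y"), NULL),
  right = list(c("p", "q"), NULL)
)

# Default: rows whose list elements are NULL disappear
dropped <- unnest(df, c(left, right))

# keep_empty = TRUE keeps such rows and fills them with NA
kept <- unnest(df, c(left, right), keep_empty = TRUE)

nrow(dropped)
nrow(kept)
```

Columns unnested together must have matching lengths within each row, which is why both list columns use the same sizes here.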

How to unnest column-list?

You can do this by coercing the elements in the list column to data frames arranged as you like, which will unnest nicely:

library(tidyverse)

tibble(a = c('first', 'second'),
b = list(c('colA' = 1, 'colC' = 2), c('colA'= 3, 'colB'=2))) %>%
mutate(b = invoke_map(tibble, b)) %>%
unnest(b)  # name the column; a bare unnest() is deprecated since tidyr 1.0.0
#> # A tibble: 2 x 4
#> a colA colC colB
#> <chr> <dbl> <dbl> <dbl>
#> 1 first 1. 2. NA
#> 2 second 3. NA 2.

Doing the coercion is a little tricky, though, as you don't want to end up with a 2x1 data frame. There are various ways around this, but a direct route is purrr::invoke_map, which calls a function with purrr::invoke (like do.call) on each element in a list.
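
Since invoke_map() has since been deprecated in purrr, here are two alternative sketches for the same data. Neither is the answer's original method: the first skips the coercion and relies on unnest_wider(), the second does the coercion with base lapply() and as.list().

```r
library(tidyr)

df <- tibble(
  a = c("first", "second"),
  b = list(c(colA = 1, colC = 2), c(colA = 3, colB = 2))
)

# Route 1: named vectors can be spread into columns directly
wide <- unnest_wider(df, b)

# Route 2: turn each named vector into a one-row tibble, then unnest;
# as.list() is what prevents a 2 x 1 data frame
df2 <- df
df2$b <- lapply(df2$b, function(v) as_tibble(as.list(v)))
wide2 <- unnest(df2, b)

wide
wide2
```

Names that are missing from an element are filled with NA in both routes.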

R unnest multiple columns

We can use cross_df from purrr (deprecated since purrr 1.0.0 in favour of tidyr::expand_grid()):

purrr::cross_df(my_list)

# year period id
# <int> <dbl> <dbl>
#1 2018 1 17
#2 2019 1 17
#3 2020 1 17
#4 2018 1 35
#5 2019 1 35
#6 2020 1 35

Or in base R use expand.grid with do.call:

do.call(expand.grid, my_list)
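
A runnable version of the base R call, with my_list reconstructed from the printed output above. The exact contents of the original list are an assumption.

```r
# Assumed input, inferred from the output shown earlier
my_list <- list(year = 2018:2020, period = 1, id = c(17, 35))

# expand.grid() builds every combination; the first element varies fastest
grid <- do.call(expand.grid, my_list)
grid
```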

Unnesting a list of lists in a data frame column

Note: Ignore the original and Update 1; Update 2 is better with the current state of the tidyverse.


Original:

With purrr, which is nice for lists,

library(purrr)

df %>% dmap(unlist)
## # A tibble: 2 x 2
## x y
## <dbl> <dbl>
## 1 1 1
## 2 1 2

which is more or less equivalent to

as.data.frame(lapply(df, unlist))
## x y
## a 1 1
## b 1 2

Update 1:

dmap has been deprecated and moved to purrrlyr, the home of interesting but ill-fated functions that will now shout lots of deprecation warnings at you. You could translate the base R idiom to tidyverse:

df %>% map(unlist) %>% as_tibble()

which will work fine for this case, but not for more than one row (a problem all these approaches face). A more robust solution might be

library(tidyverse)

df %>% bind_rows(df) %>% # make larger sample data
mutate_if(is.list, simplify_all) %>% # flatten each list element internally
unnest() # expand
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 1 2
#> 3 1 1
#> 4 1 2

Update 2:

At some point since this was asked, tidyr::unnest() got updated such that it doesn't error anymore, so you can just do

df %>%
unnest(y) %>%
unnest(y)
#> # A tibble: 2 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 1 2

If you care about the names in the list, pull them out first and then unnest the names and the list at the same time:

df %>%
mutate(label = map(y, names)) %>%
unnest(c(y, label)) %>%
unnest(y)
#> # A tibble: 2 × 3
#> x y label
#> <dbl> <dbl> <chr>
#> 1 1 1 a
#> 2 1 2 b

I'll leave the previous answers for continuity, but this is simpler.
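
For anyone who wants to run Update 2 end to end, here is a sketch with a one-row df whose y column holds a named list. That shape is an assumption inferred from the outputs above, since the question's data is not reproduced here.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Assumed input: a single row whose `y` is a list of two named values
df <- tibble(x = 1, y = list(list(a = 1, b = 2)))

out <- df %>%
  mutate(label = map(y, names)) %>%  # keep the inner names before they are lost
  unnest(c(y, label)) %>%            # one row per inner element, y is still a list
  unnest(y)                          # flatten y to a plain numeric column

out
```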

How to unnest multiple list columns of a dataframe in one go with dplyr pipe

There's probably a cleaner way to do it, but if you want the cartesian product for the columns you can unnest them in sequence, if nothing else:

df %>%
  unnest(a) %>%  # .drop is deprecated since tidyr 1.0.0; other list columns are kept by default
  unnest(b)

# # A tibble: 7 x 3
# c a b
# <dbl> <chr> <chr>
# 1 11 a 1
# 2 11 a 2
# 3 11 a 3
# 4 11 b 1
# 5 11 b 2
# 6 11 b 3
# 7 22 c 3
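
A self-contained sketch of the sequential approach, assuming tidyr 1.0.0 or later. The df here is a hypothetical reconstruction that reproduces the seven rows shown above.

```r
library(tidyr)

# Hypothetical input: two list columns with different lengths per row
df <- tibble(
  c = c(11, 22),
  a = list(c("a", "b"), "c"),
  b = list(c("1", "2", "3"), "3")
)

# Unnesting one column at a time repeats the still-nested column,
# so the second unnest yields every a/b combination within each row.
step1 <- unnest(df, a)
out <- unnest(step1, b)
out
```

Unnesting both columns in a single call would instead require a and b to have compatible lengths within each row.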

