Unnest a list column directly into several columns
With tidyr 1.0.0 or later you can use unnest_wider():
library(tidyr)
df1 <- tibble(
gr = c('a', 'b', 'c'),
values = list(1:2, 3:4, 5:6)
)
unnest_wider(df1, values)
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> # A tibble: 3 x 3
#> gr ...1 ...2
#> <chr> <int> <int>
#> 1 a 1 2
#> 2 b 3 4
#> 3 c 5 6
Created on 2019-09-14 by the reprex package (v0.3.0)
The output is verbose here because the elements that were unnested horizontally (the vector elements) were not named, and unnest_wider() doesn't want to guess silently.
We can name them beforehand to avoid it:
df1 %>%
dplyr::mutate(values = purrr::map(values, setNames, c("V1","V2"))) %>%
unnest_wider(values)
#> # A tibble: 3 x 3
#> gr V1 V2
#> <chr> <int> <int>
#> 1 a 1 2
#> 2 b 3 4
#> 3 c 5 6
Or just use suppressMessages() or purrr::quietly().
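Another option worth trying is the names_sep argument of unnest_wider(), which builds the new column names from the outer column name and the position of each unnamed element. The sketch below assumes a reasonably recent tidyr; newer releases may refuse to unnest unnamed elements unless names_sep is supplied, so the exact behaviour depends on the installed version.

```r
library(tidyr)

df1 <- tibble(
  gr = c("a", "b", "c"),
  values = list(1:2, 3:4, 5:6)
)

# names_sep builds each new column name from "values" plus the element position
wide <- unnest_wider(df1, values, names_sep = "_")
wide
```

This avoids both the name-repair messages and the extra purrr::map() step.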
Unlist/unnest list column into several columns
It is easier to do this with rowSums(), i.e. divide 'product1' by the rowSums() of the columns whose names start with the keyword 'product'. Unlike rowwise() with c_across(), this is vectorized and should be fast as well:
library(dplyr)
dat %>%
mutate(sum_product = product1/rowSums(select(., starts_with('product'))))
NOTE: Mixing base R code (apply()) with the tidyverse option across() doesn't seem to be the optimal way.
If we need to do this for all the 'product' columns, create a sum column first with mutate(), and then use across() on the columns that start with 'product' to divide each column by 'Sum_col':
dat %>%
mutate(Sum_col = rowSums(select(., starts_with('product'))),
across(starts_with('product'),
~ ./Sum_col, .names = '{.col}_sum_product')) %>%
select(-Sum_col)
Output:
#ysRespNum product1 product2 product3 product1_sum_product product2_sum_product product3_sum_product
#1 1 23.766555 13.46907 24.32327 0.3860783 0.2187998 0.3951219
#2 2 30.071773 15.98740 11.39922 0.5233660 0.2782431 0.1983909
#3 3 18.224328 11.03880 20.67063 0.3649701 0.2210688 0.4139610
#4 4 30.140839 19.78984 19.62087 0.4333597 0.2845348 0.2821054
#5 5 8.915628 30.75021 24.29150 0.1393996 0.4807925 0.3798079
#6 6 23.791981 11.14885 21.72450 0.4198684 0.1967490 0.3833826
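For a version that can be run as is, here is a minimal sketch with a small made-up 'dat' (the original data is not shown, so the values below are purely illustrative). It uses across() inside rowSums() in place of select(., ...), which should behave the same way in dplyr 1.0.0 or later.

```r
library(dplyr)

# Made-up stand-in for 'dat'; the real data is not given in the text
dat <- tibble(
  ysRespNum = 1:3,
  product1 = c(10, 20, 30),
  product2 = c(30, 20, 10),
  product3 = c(60, 60, 60)
)

res <- dat %>%
  mutate(
    # row totals over all columns whose names start with "product"
    Sum_col = rowSums(across(starts_with("product"))),
    # each product column divided by its row total
    across(starts_with("product"), ~ .x / Sum_col, .names = "{.col}_sum_product")
  ) %>%
  select(-Sum_col)

res
```

Each row of the three new share columns should add up to 1, which is a quick sanity check on the result.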
Or using base R:
nm1 <- names(dat)[startsWith(names(dat), 'product')]  # names of the 'product' columns
dat[paste0('sum_product', seq_along(nm1))] <- dat[nm1]/rowSums(dat[nm1])
How to unnest a single column from a nested list without unnesting all
You can use map() to grab out that column and then unnest it:
nested %>%
mutate(qsec = map(extra, "qsec")) %>%
unnest_longer(qsec)
Note, though, that this might create problems depending on what you do next, because you will now potentially have values duplicated between the new column and the rest of the nested data. It might be safer to just unnest everything and nest again.
# A tibble: 32 x 6
mpg cyl disp hp extra qsec
<dbl> <dbl> <dbl> <dbl> <list> <dbl>
1 21 6 160 110 <tibble [2 x 7]> 16.5
2 21 6 160 110 <tibble [2 x 7]> 17.0
3 22.8 4 108 93 <tibble [1 x 7]> 18.6
4 21.4 6 258 110 <tibble [1 x 7]> 19.4
5 18.7 8 360 175 <tibble [1 x 7]> 17.0
6 18.1 6 225 105 <tibble [1 x 7]> 20.2
7 14.3 8 360 245 <tibble [1 x 7]> 15.8
8 24.4 4 147. 62 <tibble [1 x 7]> 20
9 22.8 4 141. 95 <tibble [1 x 7]> 22.9
10 19.2 6 168. 123 <tibble [1 x 7]> 18.3
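The 'nested' object is not defined in the answer. Judging from the printed columns, it could have been built from mtcars by nesting everything except mpg, cyl, disp and hp; that construction is an assumption, used here only to make the sketch runnable.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Assumed construction of 'nested': keep four columns outside, nest the rest
nested <- mtcars %>%
  as_tibble() %>%
  nest(extra = -c(mpg, cyl, disp, hp))

out <- nested %>%
  mutate(qsec = map(extra, "qsec")) %>%  # pull one column out of each nested tibble
  unnest_longer(qsec)                    # one row per qsec value

nrow(nested)  # fewer rows than mtcars, because some cars share mpg/cyl/disp/hp
nrow(out)     # back to one row per car
```

Rows that came from the same nested tibble now repeat that tibble in 'extra', which is the duplication the answer warns about.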
Unnest one column list to many columns in tidyr
With dplyr and purrr:
df %>%
mutate(ctn = map(ctn, as_tibble)) %>%
unnest(ctn)
# A tibble: 2 x 3
id a b
<int> <chr> <dbl>
1 1 x 1
2 2 y 2
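As a self-contained illustration, the sketch below assumes that each element of 'ctn' is a named list with fields a and b (the original 'df' is not shown). It also shows unnest_wider(), which should reach the same result without the explicit as_tibble() step.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Assumed input: one named list per row
df <- tibble(
  id = 1:2,
  ctn = list(list(a = "x", b = 1), list(a = "y", b = 2))
)

via_unnest <- df %>%
  mutate(ctn = map(ctn, as_tibble)) %>%  # each element becomes a one-row tibble
  unnest(ctn)

via_wider <- df %>%
  unnest_wider(ctn)                      # spreads the named fields into columns directly
```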
Unnest multiple columns
We may use where() to select the list columns, padding each row's elements to a common length before unnesting:
library(dplyr)
library(tidyr)
library(purrr)
df %>%
mutate(mx = do.call(pmax, across(where(is.list), lengths))) %>%
mutate(across(where(is.list), ~ map2(.x, mx, ~ {
length(.x) <- .y
if(cur_column() == "left") .x <- .x[order(!is.na(.x))]
.x})), mx = NULL) %>%
unnest(where(is.list))
# A tibble: 8 × 4
go left node right
<chr> <chr> <chr> <chr>
1 go after it <NA> go after
2 here we go we go <NA>
3 he went bust he went bust
4 go get it go <NA> go get
5 go get it go it go <NA>
6 i 'm gon na go 'm gon na go
7 i 'm gon na go na go <NA>
8 she 's going berserk 's going berserk
Update: based on the comments from the OP, the previous, simpler solution works:
df %>%
unnest(where(is.list))
If there are NULL elements, specify keep_empty = TRUE. (In the OP's data, some of the elements were blank ("") instead of NULL, so the previous one should work as well.)
df %>%
unnest(where(is.list), keep_empty = TRUE)
# A tibble: 8 × 4
go left node right
<chr> <chr> <chr> <chr>
1 go after it "" go "after"
2 here we go "we" go ""
3 he went bust "he" went "bust"
4 go get it go "" go "get"
5 go get it go "it" go ""
6 i 'm gonna go "'m" gonna "go"
7 i 'm gonna go "gonna" go ""
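To see what keep_empty changes, here is a minimal sketch on hypothetical data in which one row has NULL in both list columns. The column names and values are made up for illustration.

```r
library(tidyr)

# Hypothetical data: row 2 has NULL in both list columns
df <- tibble(
  id = 1:3,
  x = list(1:2, NULL, 3L),
  y = list(c("a", "b"), NULL, "c")
)

dropped <- unnest(df, c(x, y))                     # row 2 disappears
kept    <- unnest(df, c(x, y), keep_empty = TRUE)  # row 2 is kept, filled with NA
```

With blank strings ("") instead of NULL, the elements have length one, so no rows are dropped either way.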
How to unnest column-list?
You can do this by coercing the elements in the list column to data frames arranged as you like, which will unnest nicely:
library(tidyverse)
tibble(a = c('first', 'second'),
b = list(c('colA' = 1, 'colC' = 2), c('colA'= 3, 'colB'=2))) %>%
mutate(b = invoke_map(tibble, b)) %>%
unnest(b)
#> # A tibble: 2 x 4
#> a colA colC colB
#> <chr> <dbl> <dbl> <dbl>
#> 1 first 1. 2. NA
#> 2 second 3. NA 2.
Doing the coercion is a little tricky, though, as you don't want to end up with a 2x1 data frame. There are various ways around this, but a direct route is purrr::invoke_map(), which calls a function with purrr::invoke() (like do.call()) on each element in a list. Note that the invoke family is deprecated in recent purrr releases, so this code may emit deprecation warnings.
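One way to do the same coercion without the invoke family is sketched below. It swaps in tibble::as_tibble_row(), which is not what the original answer used, and also shows unnest_wider() as a one-step alternative.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(tibble)

df <- tibble(
  a = c("first", "second"),
  b = list(c(colA = 1, colC = 2), c(colA = 3, colB = 2))
)

# as_tibble_row() turns each named vector into a one-row tibble
res <- df %>%
  mutate(b = map(b, as_tibble_row)) %>%
  unnest(b)

# unnest_wider() spreads the named elements into columns without the coercion step
res_wider <- unnest_wider(df, b)
```

In both cases, names missing from a given row are filled with NA.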
R unnest multiple columns
We can use cross_df() from purrr:
purrr::cross_df(my_list)
# year period id
# <int> <dbl> <dbl>
#1 2018 1 17
#2 2019 1 17
#3 2020 1 17
#4 2018 1 35
#5 2019 1 35
#6 2020 1 35
Or in base R, use expand.grid() with do.call():
do.call(expand.grid, my_list)
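As far as I know, cross_df() is deprecated in recent purrr releases in favour of tidyr::expand_grid(). The sketch below uses a 'my_list' reconstructed from the printed output (an assumption, since the input is not shown) and compares expand_grid() with the base R route.

```r
library(tidyr)

# Assumed input, reconstructed from the printed output; not given in the text
my_list <- list(year = 2018:2020, period = 1, id = c(17, 35))

grid_tidy <- expand_grid(!!!my_list)        # first variable varies slowest
grid_base <- do.call(expand.grid, my_list)  # first variable varies fastest
```

Both contain the same combinations, but the row order differs, so sort explicitly if the order matters.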
Unnesting a list of lists in a data frame column
Note: Ignore the original and Update 1; Update 2 is better with the current state of the tidyverse.
Original:
With purrr, which is nice for lists,
library(purrr)
df %>% dmap(unlist)
## # A tibble: 2 x 2
## x y
## <dbl> <dbl>
## 1 1 1
## 2 1 2
which is more or less equivalent to
as.data.frame(lapply(df, unlist))
## x y
## a 1 1
## b 1 2
Update 1:
dmap() has been deprecated and moved to purrrlyr, the home of interesting but ill-fated functions that will now shout lots of deprecation warnings at you. You could translate the base R idiom to tidyverse:
df %>% map(unlist) %>% as_tibble()
which will work fine for this case, but not for more than one row (a problem all these approaches face). A more robust solution might be
library(tidyverse)
df %>% bind_rows(df) %>% # make larger sample data
mutate_if(is.list, simplify_all) %>% # flatten each list element internally
unnest() # expand
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 1 2
#> 3 1 1
#> 4 1 2
Update 2:
At some point since this was asked, tidyr::unnest() got updated such that it doesn't error anymore, so you can just do
df %>%
unnest(y) %>%
unnest(y)
#> # A tibble: 2 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 1 2
If you care about the names in the list, pull them out first and then unnest the names and the list at the same time:
df %>%
mutate(label = map(y, names)) %>%
unnest(c(y, label)) %>%
unnest(y)
#> # A tibble: 2 × 3
#> x y label
#> <dbl> <dbl> <chr>
#> 1 1 1 a
#> 2 1 2 b
I'll leave the previous answers for continuity, but this is simpler.
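The 'df' used in Update 2 is not shown. A runnable sketch, assuming y is a list column whose elements are one-element named lists, might look like this:

```r
library(dplyr)
library(tidyr)
library(purrr)

# Assumed shape of 'df': each element of y is a named list of length one
df <- tibble(x = c(1, 1), y = list(list(a = 1), list(b = 2)))

flat <- df %>%
  unnest(y) %>%  # first pass: list of lists -> list of numbers
  unnest(y)      # second pass: list of numbers -> numeric column

labelled <- df %>%
  mutate(label = map(y, names)) %>%  # keep the inner names before they are lost
  unnest(c(y, label)) %>%
  unnest(y)
```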
How to unnest multiple list columns of a dataframe in one go with dplyr pipe
There's probably a cleaner way to do it, but if you want the cartesian product for the columns you can unnest them in sequence, if nothing else:
df %>%
unnest(a) %>%
unnest(b)
# # A tibble: 7 x 3
# c a b
# <dbl> <chr> <chr>
# 1 11 a 1
# 2 11 a 2
# 3 11 a 3
# 4 11 b 1
# 5 11 b 2
# 6 11 b 3
# 7 22 c 3
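For reference, here is a self-contained sketch of the sequential approach. The data is hypothetical, chosen only to match the shape of the printed result.

```r
library(tidyr)

# Hypothetical data: two list columns with different lengths in the first row
df <- tibble(
  c = c(11, 22),
  a = list(c("a", "b"), "c"),
  b = list(c("1", "2", "3"), "3")
)

crossed <- df %>%
  unnest(a) %>%  # one row per element of a
  unnest(b)      # each of those rows is repeated for every element of b
```

Unnesting both columns in a single call, unnest(c(a, b)), pairs elements by position instead of crossing them, so it would be expected to fail here because the lengths in the first row (2 and 3) are incompatible.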