Unnest a list column directly into several columns
With tidyr 1.0.0 you can use unnest_wider():
library(tidyr)
df1 <- tibble(
gr = c('a', 'b', 'c'),
values = list(1:2, 3:4, 5:6)
)
unnest_wider(df1, values)
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> New names:
#> * `` -> ...1
#> * `` -> ...2
#> # A tibble: 3 x 3
#> gr ...1 ...2
#> <chr> <int> <int>
#> 1 a 1 2
#> 2 b 3 4
#> 3 c 5 6
Created on 2019-09-14 by the reprex package (v0.3.0)
The output is verbose here because the elements that were unnested horizontally (the vector elements) were not named, and unnest_wider doesn't want to guess silently.
We can name them beforehand to avoid it:
df1 %>%
  dplyr::mutate(values = purrr::map(values, setNames, c("V1", "V2"))) %>%
  unnest_wider(values)
#> # A tibble: 3 x 3
#> gr V1 V2
#> <chr> <int> <int>
#> 1 a 1 2
#> 2 b 3 4
#> 3 c 5 6
Or just use suppressMessages() or purrr::quietly().
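A third option is the names_sep argument of unnest_wider(), which builds the column names from the outer name plus a running index. The sketch below reuses df1 from above. As far as I know, newer tidyr releases (1.3.0 and later, which is an assumption about the exact version) raise an error for unnamed elements unless names_sep is supplied, so this form may be the safer default.

```r
library(tibble)
library(tidyr)

# Same data as above: a list column of unnamed integer vectors
df1 <- tibble(
  gr = c("a", "b", "c"),
  values = list(1:2, 3:4, 5:6)
)

# names_sep pastes the outer column name and the element position together,
# so no name repair messages are emitted
wide <- unnest_wider(df1, values, names_sep = "_")
names(wide)
```

The resulting columns are named after the list column (values_1, values_2), which is usually easier to trace back than ...1 and ...2.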
How to unnest column-list?
You can do this by coercing the elements in the list column to data frames arranged as you like, which will unnest nicely:
library(tidyverse)
tibble(a = c('first', 'second'),
       b = list(c('colA' = 1, 'colC' = 2), c('colA' = 3, 'colB' = 2))) %>%
  mutate(b = invoke_map(tibble, b)) %>%
  unnest(b)  # name the column explicitly; a bare unnest() warns since tidyr 1.0.0
#> # A tibble: 2 x 4
#> a colA colC colB
#> <chr> <dbl> <dbl> <dbl>
#> 1 first 1. 2. NA
#> 2 second 3. NA 2.
Doing the coercion is a little tricky, though, as you don't want to end up with a 2x1 data frame. There are various ways around this, but a direct route is purrr::invoke_map, which calls a function with purrr::invoke (like do.call) on each element in a list. Note that invoke() and invoke_map() are deprecated as of purrr 1.0.0.
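Since the invoke_*() family is deprecated in current purrr, here is a minimal sketch of the same coercion using tibble::as_tibble_row(), which turns each named vector into a one-row tibble. This is a swapped-in technique, not the original answer's method, and the object names df and out are only illustrative.

```r
library(dplyr)
library(purrr)
library(tibble)
library(tidyr)

df <- tibble(
  a = c("first", "second"),
  b = list(c(colA = 1, colC = 2), c(colA = 3, colB = 2))
)

out <- df %>%
  # one-row tibble per element, so names become columns rather than rows
  mutate(b = map(b, as_tibble_row)) %>%
  # columns missing from an element are filled with NA
  unnest(b)

out
```

Each element becomes a 1 x 2 tibble rather than a 2 x 1 one, which is what lets unnest() line the names up as columns.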
How to unnest a single column from a nested list without unnesting all
You can use map() to grab out that column and then unnest that.
nested %>%
mutate(qsec = map(extra, "qsec")) %>%
unnest_longer(qsec)
Note, though, that this can cause problems depending on what you do next: qsec now exists both as a top-level column and inside the nested data in extra, so rows whose nested tibble has more than one row are duplicated. It might be safer to just unnest and nest again.
# A tibble: 32 x 6
mpg cyl disp hp extra qsec
<dbl> <dbl> <dbl> <dbl> <list> <dbl>
1 21 6 160 110 <tibble [2 x 7]> 16.5
2 21 6 160 110 <tibble [2 x 7]> 17.0
3 22.8 4 108 93 <tibble [1 x 7]> 18.6
4 21.4 6 258 110 <tibble [1 x 7]> 19.4
5 18.7 8 360 175 <tibble [1 x 7]> 17.0
6 18.1 6 225 105 <tibble [1 x 7]> 20.2
7 14.3 8 360 245 <tibble [1 x 7]> 15.8
8 24.4 4 147. 62 <tibble [1 x 7]> 20
9 22.8 4 141. 95 <tibble [1 x 7]> 22.9
10 19.2 6 168. 123 <tibble [1 x 7]> 18.3
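The answer does not show how nested was built. The sketch below assumes it came from mtcars with every column except mpg, cyl, disp and hp nested into extra, which is consistent with the printed output but is still a guess.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Assumed reconstruction of `nested` (not given in the original answer)
nested <- mtcars %>%
  as_tibble() %>%
  nest(extra = -c(mpg, cyl, disp, hp))

res <- nested %>%
  # map() with a string pulls the qsec column out of each nested tibble
  mutate(qsec = map(extra, "qsec")) %>%
  # lengthen only qsec; extra stays nested and is repeated per value
  unnest_longer(qsec)

dim(res)
```

Because extra is repeated for every qsec value, a group with two cars appears twice with the same two-row tibble, which is the duplication the warning above refers to.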
Unnesting a list of lists in a data frame column
Note: Ignore the original and Update 1; Update 2 is better with the current state of the tidyverse.
Original:
With purrr, which is nice for lists,
library(purrr)
df %>% dmap(unlist)
## # A tibble: 2 x 2
## x y
## <dbl> <dbl>
## 1 1 1
## 2 1 2
which is more or less equivalent to
as.data.frame(lapply(df, unlist))
## x y
## a 1 1
## b 1 2
Update 1:
dmap has been deprecated and moved to purrrlyr, the home of interesting but ill-fated functions that will now shout lots of deprecation warnings at you. You could translate the base R idiom to tidyverse:
df %>% map(unlist) %>% as_tibble()
which will work fine for this case, but not for more than one row (a problem all these approaches face). A more robust solution might be
library(tidyverse)
df %>% bind_rows(df) %>% # make larger sample data
mutate_if(is.list, simplify_all) %>% # flatten each list element internally
unnest() # expand
#> # A tibble: 4 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 1 2
#> 3 1 1
#> 4 1 2
Update 2:
At some point since this was asked, tidyr::unnest() got updated such that it doesn't error anymore, so you can just do
df %>%
unnest(y) %>%
unnest(y)
#> # A tibble: 2 × 2
#> x y
#> <dbl> <dbl>
#> 1 1 1
#> 2 1 2
If you care about the names in the list, pull them out first and then unnest the names and the list at the same time:
df %>%
mutate(label = map(y, names)) %>%
unnest(c(y, label)) %>%
unnest(y)
#> # A tibble: 2 × 3
#> x y label
#> <dbl> <dbl> <chr>
#> 1 1 1 a
#> 2 1 2 b
I'll leave the previous answers for continuity, but this is simpler.
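The question's df is not reproduced here. As a runnable sketch, assume it is a one-row tibble whose y cell holds a named list of two numbers, which is consistent with the printed results but is my assumption rather than the asker's data.

```r
library(dplyr)
library(purrr)
library(tidyr)

# Hypothetical stand-in for the question's data
df <- tibble(x = 1, y = list(list(a = 1, b = 2)))

# First unnest() splits the inner list into rows, second one simplifies it
flat <- df %>%
  unnest(y) %>%
  unnest(y)

# Keep the inner names by unnesting them alongside the values
labelled <- df %>%
  mutate(label = map(y, names)) %>%
  unnest(c(y, label)) %>%
  unnest(y)

labelled
```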
Unnest one of several list columns in dataframe
According to ?unnest, the argument ... is described as:
Specification of columns to unnest. Use bare variable names or
functions of variables. If omitted, defaults to all list-cols.
Therefore, we could specify the column name to be unnested after the rename_all:
iris %>%
... #op's code
...
rename_all(funs(str_c("Mean.", .))))) %>%
unnest(sum_data)
# A tibble: 3 x 6
# Species data Mean.Sepal.Length Mean.Sepal.Width Mean.Petal.Length Mean.Petal.Width
# <fctr> <list> <dbl> <dbl> <dbl> <dbl>
#1 setosa <tibble [50 x 4]> 5.01 3.43 1.46 0.246
#2 versicolor <tibble [50 x 4]> 5.94 2.77 4.26 1.33
#3 virginica <tibble [50 x 4]> 6.59 2.97 5.55 2.03
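Because the OP's code is elided above, here is a self-contained sketch of the same idea written with current dplyr verbs (across() and rename_with() instead of funs() and rename_all()). The pipeline is my reconstruction under that assumption, not the original code.

```r
library(dplyr)
library(purrr)
library(stringr)
library(tidyr)

res <- as_tibble(iris) %>%
  # first list column: the raw rows per species
  nest(data = -Species) %>%
  # second list column: a one-row summary per species
  mutate(sum_data = map(data, function(d) {
    d %>%
      summarise(across(everything(), mean)) %>%
      rename_with(function(nm) str_c("Mean.", nm))
  })) %>%
  # unnest only the summary; `data` stays a list column
  unnest(sum_data)

res
```

Naming the column in unnest() is what keeps data nested while sum_data is expanded.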
Unlist/unnest list column into several columns
It is easier to do this with rowSums, i.e. divide 'product1' by the rowSums of the columns whose names start with the keyword 'product'. Instead of working rowwise with c_across, this is vectorized and should be fast as well.
library(dplyr)
dat %>%
mutate(sum_product = product1/rowSums(select(., starts_with('product'))))
NOTE: Mixing base R code (apply) with the tidyverse option across doesn't seem to be the optimal way.
If we need to do this for all the 'product' columns, create a sum column first with mutate and then use across on the columns that start with 'product' to divide each column by 'Sum_col':
dat %>%
mutate(Sum_col = rowSums(select(., starts_with('product'))),
across(starts_with('product'),
~ ./Sum_col, .names = '{.col}_sum_product')) %>%
select(-Sum_col)
-output
#ysRespNum product1 product2 product3 product1_sum_product product2_sum_product product3_sum_product
#1 1 23.766555 13.46907 24.32327 0.3860783 0.2187998 0.3951219
#2 2 30.071773 15.98740 11.39922 0.5233660 0.2782431 0.1983909
#3 3 18.224328 11.03880 20.67063 0.3649701 0.2210688 0.4139610
#4 4 30.140839 19.78984 19.62087 0.4333597 0.2845348 0.2821054
#5 5 8.915628 30.75021 24.29150 0.1393996 0.4807925 0.3798079
#6 6 23.791981 11.14885 21.72450 0.4198684 0.1967490 0.3833826
Or using base R
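# logical index of the 'product' columns
nm1 <- startsWith(names(dat), 'product')
# create one new name per selected column (not one per column of dat)
dat[paste0('sum_product', seq_len(sum(nm1)))] <- dat[nm1]/rowSums(dat[nm1])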
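To make the row-share idea checkable, the following sketch runs the mutate/across version on a small made-up data frame. The values, the id column and the _share suffix are hypothetical, chosen so the arithmetic is easy to follow.

```r
library(dplyr)

# Hypothetical data standing in for `dat`
dat <- tibble(
  id       = 1:3,
  product1 = c(2, 1, 5),
  product2 = c(2, 3, 0),
  product3 = c(4, 4, 5)
)

shares <- dat %>%
  mutate(
    # vectorized row totals over the product columns
    Sum_col = rowSums(select(., starts_with("product"))),
    # divide every product column by its row total
    across(starts_with("product"), ~ .x / Sum_col, .names = "{.col}_share")
  ) %>%
  select(-Sum_col)

shares
```

For the first row the total is 2 + 2 + 4 = 8, so the shares are 0.25, 0.25 and 0.5, and every row of shares sums to 1.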
Unnest multiple columns-list from tibble with tidyr unnest_wider
We could use unlist2d with map
library(purrr)
library(collapse)
map_dfc(my_tibble, ~ unlist2d(.x) %>%
select(-1)) %>%
set_names(paste0(rep(names(my_tibble), each = 2), "_", 1:2))
-output
# Level_1 Level_2 Level2_1 Level2_2 Level3_1 Level3_2
#1 10 20 30 40 330 430
#2 10 20 50 20 530 33
Or loop over the names of the data in map and apply unnest_wider
library(tidyr)
map_dfc(names(my_tibble),
        ~ my_tibble %>%
            select(all_of(.x)) %>%        # all_of() for a column name held in a string
            unnest_wider(all_of(.x))) %>%
  set_names(paste0(rep(names(my_tibble), each = 2), "_", 1:2))
According to the documentation from ?unnest_wider:
.col, col - List-column to extract components from.
So it is just a single column that we can specify, whereas in ?unnest it is cols:
cols - Columns to unnest.
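Since unnest_wider() takes one column per call, another way to widen several list columns is a plain loop that applies it once per column. The sketch below uses a made-up my_tibble whose column names and values are guessed from the printed output above, and it relies on names_sep to generate the _1/_2 suffixes instead of set_names().

```r
library(tibble)
library(tidyr)

# Hypothetical stand-in for my_tibble: three list columns of length-2 vectors
my_tibble <- tibble(
  Level  = list(c(10, 20), c(10, 20)),
  Level2 = list(c(30, 40), c(50, 20)),
  Level3 = list(c(330, 430), c(530, 33))
)

out <- my_tibble
for (nm in names(my_tibble)) {
  # widen one list column at a time; names_sep yields <column>_<position>
  out <- unnest_wider(out, all_of(nm), names_sep = "_")
}

names(out)
```

The loop keeps the columns in their original order, because each call replaces the list column in place with its widened columns.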