Tidyverse - Preferred Way to Turn a Named Vector into a Data.Frame/Tibble

This is now directly supported using bind_rows (introduced in dplyr 0.7.0):


library(tidyverse)
vec <- c("a" = 1, "b" = 2)

bind_rows(vec)
#> # A tibble: 1 x 2
#>       a     b
#>   <dbl> <dbl>
#> 1     1     2

This quote from https://cran.r-project.org/web/packages/dplyr/news.html explains the change:

bind_rows() and bind_cols() now accept vectors. They are treated as rows by the former and columns by the latter. Rows require inner names like c(col1 = 1, col2 = 2), while columns require outer names: col1 = c(1, 2). Lists are still treated as data frames but can be spliced explicitly with !!!, e.g. bind_rows(!!! x) (#1676).

With this change, the following line from the use-case example:

txt %>% map(read_xml) %>% map(xml_attrs) %>% map_df(~t(.) %>% as_tibble)

can be rewritten as

txt %>% map(read_xml) %>% map(xml_attrs) %>% map_df(bind_rows)

which is also equivalent to

txt %>% map(read_xml) %>% map(xml_attrs) %>% { bind_rows(!!! .) }

The equivalence of the different approaches is demonstrated in the following example:


library(tidyverse)
library(xml2)  # read_xml() and xml_attrs() come from xml2

txt <- c('<node a="1" b="2"></node>',
'<node a="1" c="3"></node>')

temp <- txt %>% map(read_xml) %>% map(xml_attrs)

# x, y, and z are identical
x <- temp %>% map_df(~t(.) %>% as_tibble)
y <- temp %>% map_df(bind_rows)
z <- bind_rows(!!! temp)

identical(x, y)
#> [1] TRUE
identical(y, z)
#> [1] TRUE

z
#> # A tibble: 2 x 3
#>   a     b     c
#>   <chr> <chr> <chr>
#> 1 1     2     <NA>
#> 2 1     <NA>  3

Converting a vector into a dataframe columnwise

You can transpose the named vector and convert the result into a data frame or tibble.

t(x) %>% as_tibble()
t(x) %>% data.frame()

#   estimate ci.low ci.up
# 1    0.595   0.11 2.004
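
The answer does not show how x is defined. A minimal base-R sketch, assuming x is a named numeric vector whose names and values match the printed output above:

```r
# Assumed input: a named numeric vector (values copied from the output above)
x <- c(estimate = 0.595, ci.low = 0.11, ci.up = 2.004)

# t() turns the vector into a 1 x 3 matrix whose column names are the vector names
m <- t(x)

# Converting the matrix gives a one-row data frame with one column per name
df <- as.data.frame(m)
df
```

Without the transpose, as.data.frame(x) would instead produce a single column with three rows.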

Convert Named Character Vector to data.frame

It's as simple as data.frame(as.list(testVect)). Or if you want sensible data types for your columns, data.frame(lapply(testVect, type.convert), stringsAsFactors=FALSE).
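
As a sketch of the difference between the two calls, assuming a hypothetical testVect (the question's actual vector is not shown here). Passing as.is = TRUE to type.convert is an addition that keeps strings as character and avoids the warning newer R versions give when it is left unspecified:

```r
# Hypothetical named character vector, for illustration only
testVect <- c(id = "1", score = "2.5", label = "abc")

# One-row data frame; every column stays character
df_chr <- data.frame(as.list(testVect), stringsAsFactors = FALSE)

# Convert each element to a sensible type before building the data frame
df_typed <- data.frame(lapply(testVect, type.convert, as.is = TRUE),
                       stringsAsFactors = FALSE)

sapply(df_chr, class)
sapply(df_typed, class)
```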

Converting grouped tibble to named list

We can use split

with(my_data, split(list_values,
                    factor(list_names, levels = unique(list_names))))
$Ford
[1] "Ranger" "F150" "Explorer"

$Chevy
[1] "Equinox"

$Dodge
[1] "Caravan" "Ram"

Or with unstack

unstack(my_data, list_values ~ list_names)
$Chevy
[1] "Equinox"

$Dodge
[1] "Caravan" "Ram"

$Ford
[1] "Ranger" "F150" "Explorer"
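
The two outputs differ only in group order. The sketch below rebuilds a hypothetical my_data from the printed output (the column names come from the calls above; the row order is an assumption) to show why the factor levels are set explicitly in the split call:

```r
# Hypothetical reconstruction of my_data, inferred from the printed output
my_data <- data.frame(
  list_names  = c("Ford", "Ford", "Ford", "Chevy", "Dodge", "Dodge"),
  list_values = c("Ranger", "F150", "Explorer", "Equinox", "Caravan", "Ram"),
  stringsAsFactors = FALSE
)

# Levels in order of first appearance keep the groups as Ford, Chevy, Dodge
by_first_seen <- with(my_data, split(list_values,
                                     factor(list_names, levels = unique(list_names))))

# Splitting on the bare character column sorts the groups alphabetically
by_alphabet <- split(my_data$list_values, my_data$list_names)

names(by_first_seen)
names(by_alphabet)
```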

Create empty tibble/data frame with column names coming from a vector

You can create a named vector, vec, with setNames(). Its first argument sets the column type: rep("", 3) asks for three character columns. Its second argument is the vector of column names.

Use dplyr::bind_rows to convert this into a tibble with one row. Then [0, ] selects zero rows, leaving it empty.

With this method, the columns take the type of the vector, so you control the column type through the value you repeat.

library(dplyr)

vec <- setNames(rep("", 3), letters[1:3])
bind_rows(vec)[0, ]

# A tibble: 0 x 3
# ... with 3 variables: a <chr>, b <chr>, c <chr>

You can also use as_tibble if you transpose the named vector. I guess I use bind_rows because I usually have dplyr loaded but not tibble.

library(tibble)

vec <- setNames(rep("", 3), letters[1:3])
as_tibble(t(vec))[0, ]

# A tibble: 0 x 3
# ... with 3 variables: a <chr>, b <chr>, c <chr>

If you know all of the columns are of a single type (e.g., character), you can do something like this.

vec <- letters[1:3]
df <- bind_rows(setNames(rep("", length(vec)), vec))[0, ]
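
The same pattern should give numeric columns when the repeated value is a number. A minimal sketch, assuming dplyr is installed; the names col_names and empty_num are illustrative and not from the original answer:

```r
library(dplyr)

# Illustrative column names, the same ones used above
col_names <- letters[1:3]

# rep(0, ...) is a double vector, so every column comes out as <dbl>
empty_num <- bind_rows(setNames(rep(0, length(col_names)), col_names))[0, ]

sapply(empty_num, class)
```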

How to create a one-row data frame from a vector in R?

We could return the output as a list column in summarise and then use unnest_wider:

library(dplyr)
library(tidyr)
library(stringr)  # str_replace()
mtcars %>%
  group_by(cyl) %>%
  summarise(out = list(boxplot.stats(wt)$stats)) %>%
  unnest_wider(out) %>%
  rename_at(-1, ~ str_replace(., '\\.+', 'x'))
# A tibble: 3 x 6
#     cyl    x1    x2    x3    x4    x5
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1     4  1.51  1.88  2.2   2.62  3.19
# 2     6  2.62  2.82  3.22  3.44  3.46
# 3     8  3.17  3.52  3.76  4.07  4.07

Or, if we want to use the OP's method, set the names for that vector and use as_tibble_row:

library(purrr)
library(stringr)
library(tibble)  # as_tibble_row()
mtcars %>%
  group_by(cyl) %>%
  group_map(~ tibble(cyl = first(.x$cyl),
                     setNames(boxplot.stats(.$wt)$stats, str_c('x', 1:5)) %>%
                       as_tibble_row), .keep = TRUE) %>%
  bind_rows
# A tibble: 3 x 6
#     cyl    x1    x2    x3    x4    x5
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1     4  1.51  1.88  2.2   2.62  3.19
# 2     6  2.62  2.82  3.22  3.44  3.46
# 3     8  3.17  3.52  3.76  4.07  4.07

As the output of group_map is always a list, it may be better to use group_modify, which returns a tibble and thus avoids the final map_dfr/bind_rows:

mtcars %>%
  group_by(cyl) %>%
  group_modify(~ setNames(boxplot.stats(.$wt)$stats, str_c('x', 1:5)) %>%
                 as_tibble_row, .keep = TRUE) %>%
  ungroup
# A tibble: 3 x 6
#     cyl    x1    x2    x3    x4    x5
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1     4  1.51  1.88  2.2   2.62  3.19
# 2     6  2.62  2.82  3.22  3.44  3.46
# 3     8  3.17  3.52  3.76  4.07  4.07

What is the tidyverse way to apply a function designed to take data.frames as input across a grouped tibble in R?

We can use the OP's function in group_modify

library(dplyr)
MyDF %>%
  group_by(Fruit) %>%
  group_modify(~ .x %>%
                 summarise(MyVal = myFun(.x))) %>%
  ungroup

Output:

# A tibble: 2 × 2
  Fruit  MyVal
  <chr>  <int>
1 Apple  42925
2 Mango 295425

Or with group_map, where .y holds the grouping key for each group:

MyDF %>%
  group_by(Fruit) %>%
  group_map(~ bind_cols(.y, MyVal = myFun(.))) %>%
  bind_rows
# A tibble: 2 × 2
  Fruit  MyVal
  <chr>  <int>
1 Apple  42925
2 Mango 295425

Does the pipe operator turn a data frame into a tibble? If so, how can I prevent it?

I think you just need to convert the object to a data.frame before you can use rownames_to_column:

library(tidyverse)  # gss_cat comes from forcats

gss_cat %>%
  group_by(race, partyid) %>%
  summarise(Freq = n()) %>%
  pivot_wider(names_from = race, values_from = Freq) %>%
  drop_na() %>%
  column_to_rownames("partyid") %>%
  chisq.test(.) %>%
  `[[`("residuals") %>%
  as.data.frame() %>%
  rownames_to_column("partyid")
#              partyid     Other       Black      White
# 1          No answer  2.923648   2.8649510  -2.262282
# 2        Other party -2.311635  -4.6581430   2.834128
# 3  Strong republican -8.950509 -15.3086303   9.782022
# 4 Not str republican -7.246500 -16.8275514   9.856563
# 5       Ind,near rep -3.546661 -10.4554526   5.793770
# 6        Independent 12.196368  -4.4484394  -2.272619
# 7       Ind,near dem  3.783087  -0.6287861  -1.033036
# 8   Not str democrat  5.478600   8.9946078  -5.823392
# 9    Strong democrat -5.339841  32.7172447 -12.447571

Convert any number of vectors into a dataframe whilst preserving data types and using vector names as column names in R

This works fine with data.frame. You just need to add the argument, stringsAsFactors=FALSE.

df = data.frame(var_a, var_b, var_c, stringsAsFactors = FALSE)
sapply(df, class)
      var_a       var_b       var_c
"character"   "numeric"    "factor"
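
The input vectors are not shown in the answer. A self-contained sketch, assuming three hypothetical vectors whose types match the printed classes:

```r
# Hypothetical inputs chosen to match the classes printed above
var_a <- c("x", "y", "z")             # character
var_b <- c(1.5, 2, 3)                 # numeric
var_c <- factor(c("lo", "hi", "lo"))  # factor

# Column names are taken from the variable names passed to data.frame()
df <- data.frame(var_a, var_b, var_c, stringsAsFactors = FALSE)

sapply(df, class)
```

Since R 4.0.0 the default for stringsAsFactors is already FALSE, so the argument only changes the result on older versions.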

Replacement of column values based on a named vector

You could index the named vector by col:

df$col1 <- vec[as.character(df$col)]

Or in mutate:

library(dplyr)
df %>% mutate(col1 = vec[as.character(col)])
#      col col1
#    <int> <chr>
#  1     1 a
#  2     1 a
#  3     1 a
#  4     1 a
#  5     2 b
#  6     2 b
#  7     3 c
#  8     3 c
#  9     3 c
# 10     3 c
# 11     3 c
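
Neither df nor vec is defined in the answer. A base-R sketch, assuming a lookup vector keyed by the values of col (the contents of both objects are inferred from the output above):

```r
# Assumed inputs, reconstructed from the printed result
df  <- data.frame(col = c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L, 3L, 3L, 3L))
vec <- c("1" = "a", "2" = "b", "3" = "c")

# as.character() makes the lookup go by name; indexing with the integers
# themselves would go by position, which only coincides when the names are 1, 2, 3, ...
df$col1 <- unname(vec[as.character(df$col)])

df
```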

