tidyverse - preferred way to turn a named vector into a data.frame/tibble
This is now directly supported by bind_rows (vector inputs were introduced in dplyr 0.7.0):
library(tidyverse)
vec <- c("a" = 1, "b" = 2)
bind_rows(vec)
#> # A tibble: 1 x 2
#> a b
#> <dbl> <dbl>
#> 1 1 2
This quote from https://cran.r-project.org/web/packages/dplyr/news.html explains the change:
bind_rows() and bind_cols() now accept vectors. They are treated as rows by the former and columns by the latter. Rows require inner names like c(col1 = 1, col2 = 2), while columns require outer names: col1 = c(1, 2). Lists are still treated as data frames but can be spliced explicitly with !!!, e.g. bind_rows(!!! x) (#1676).
With this change, the following line in the use case example:
txt %>% map(read_xml) %>% map(xml_attrs) %>% map_df(~t(.) %>% as_tibble)
can be rewritten as
txt %>% map(read_xml) %>% map(xml_attrs) %>% map_df(bind_rows)
which is also equivalent to
txt %>% map(read_xml) %>% map(xml_attrs) %>% { bind_rows(!!! .) }
The equivalence of the different approaches is demonstrated in the following example:
library(tidyverse)
library(xml2)  # provides read_xml() and xml_attrs()
txt <- c('<node a="1" b="2"></node>',
'<node a="1" c="3"></node>')
temp <- txt %>% map(read_xml) %>% map(xml_attrs)
# x, y, and z are identical
x <- temp %>% map_df(~t(.) %>% as_tibble)
y <- temp %>% map_df(bind_rows)
z <- bind_rows(!!! temp)
identical(x, y)
#> [1] TRUE
identical(y, z)
#> [1] TRUE
z
#> # A tibble: 2 x 3
#> a b c
#> <chr> <chr> <chr>
#> 1 1 2 <NA>
#> 2 1 <NA> 3
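The rows-versus-columns rule from the dplyr news quote can be sketched as follows. This is a minimal illustration, assuming a reasonably recent dplyr; the object names row_vec, as_row, and as_cols are made up for the example.

```r
library(dplyr)

# Inner names: each named element becomes one column of a single row
row_vec <- c(col1 = 1, col2 = 2)
as_row <- bind_rows(row_vec)

# Outer names: each named argument becomes a whole column
as_cols <- bind_cols(col1 = c(1, 2), col2 = c(3, 4))

dim(as_row)   # one row, two columns
dim(as_cols)  # two rows, two columns
```

In both cases the result is a tibble with columns col1 and col2; only the orientation of the input differs.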
converting a vector into a dataframe columnwise
You can transpose the vector and convert the result into a data frame or tibble.
t(x) %>% as_tibble()
t(x) %>% data.frame()
# estimate ci.low ci.up
#1 0.595 0.11 2.004
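Here is a self-contained sketch of the transpose approach in base R. The vector x is an assumption, reconstructed from the values in the output above, since the answer does not show how it was defined.

```r
# Hypothetical named vector matching the printed output
x <- c(estimate = 0.595, ci.low = 0.11, ci.up = 2.004)

# t() turns the vector into a 1 x 3 matrix; the names become column names
m <- t(x)

# The one-row matrix converts to a one-row data frame
df <- as.data.frame(m)
df
```

The transpose is what makes the result one row wide: without it, as.data.frame(x) would give a single column with three rows.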
Convert Named Character Vector to data.frame
It's as simple as data.frame(as.list(testVect)). Or, if you want sensible data types for your columns, data.frame(lapply(testVect, type.convert), stringsAsFactors=FALSE).
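A small sketch of both variants, assuming a made-up testVect (the original vector is not shown). I pass as.is = TRUE to type.convert, which is my addition: it keeps non-numeric strings as character and avoids the warning that recent R versions emit when the argument is left out.

```r
# Hypothetical named character vector
testVect <- c(id = "7", score = "2.5", label = "abc")

# Every column stays character (or factor on R < 4.0.0)
df_chr <- data.frame(as.list(testVect), stringsAsFactors = FALSE)

# type.convert() guesses a type for each element
df_typed <- data.frame(lapply(testVect, type.convert, as.is = TRUE),
                       stringsAsFactors = FALSE)

sapply(df_chr, class)
sapply(df_typed, class)
```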
Converting grouped tibble to named list
We can use split, setting the factor levels so the groups stay in order of first appearance:
with(my_data, split(list_values,
factor(list_names, levels = unique(list_names))))
$Ford
[1] "Ranger" "F150" "Explorer"
$Chevy
[1] "Equinox"
$Dodge
[1] "Caravan" "Ram"
Or with unstack, which returns the groups in alphabetical order:
unstack(my_data, list_values ~ list_names)
$Chevy
[1] "Equinox"
$Dodge
[1] "Caravan" "Ram"
$Ford
[1] "Ranger" "F150" "Explorer"
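To try both calls, here is a sketch with a made-up my_data. The column names list_names and list_values come from the answer, but the row order is an assumption chosen so that the groups match the printed output.

```r
# Hypothetical input data
my_data <- data.frame(
  list_names  = c("Ford", "Chevy", "Ford", "Dodge", "Dodge", "Ford"),
  list_values = c("Ranger", "Equinox", "F150", "Caravan", "Ram", "Explorer"),
  stringsAsFactors = FALSE
)

# split(): explicit factor levels keep the order of first appearance
by_first_seen <- with(my_data, split(list_values,
                                     factor(list_names, levels = unique(list_names))))

# unstack(): groups come back sorted alphabetically
by_alpha <- unstack(my_data, list_values ~ list_names)

names(by_first_seen)
names(by_alpha)
```

Note that unstack returns a list here only because the groups have different lengths; with equal-length groups it returns a data frame instead.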
Create empty tibble/data frame with column names coming from a vector
You can create a named vector, vec, with setNames. The first argument sets the type of the columns: rep("", 3) asks for three character columns. The second argument is the vector of column names.
Use dplyr::bind_rows to convert this into a tibble with one row. Then [0, ] selects zero rows, leaving it empty.
With this method, the placeholder value controls the data type of the columns.
library(dplyr)
vec <- setNames(rep("", 3), letters[1:3])
bind_rows(vec)[0, ]
# A tibble: 0 x 3
# ... with 3 variables: a <chr>, b <chr>, c <chr>
You can also use as_tibble if you transpose the named vector. I guess I use bind_rows because I usually have dplyr loaded but not tibble.
library(tibble)
vec <- setNames(rep("", 3), letters[1:3])
as_tibble(t(vec))[0, ]
# A tibble: 0 x 3
# ... with 3 variables: a <chr>, b <chr>, c <chr>
If you know all of the columns are of a single type (e.g., character), you can do something like this.
vec <- letters[1:3]
df <- bind_rows(setNames(rep("", length(vec)), vec))[0, ]
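The same pattern should work for other column types by swapping the placeholder value. The sketch below assumes you want numeric columns and uses NA_real_ as the placeholder; col_names and empty_num are illustrative names only.

```r
library(dplyr)

# Hypothetical column names
col_names <- c("id", "height", "weight")

# NA_real_ is a double, so every column becomes <dbl>
placeholder <- setNames(rep(NA_real_, length(col_names)), col_names)

# One-row tibble, then drop the row to leave an empty tibble
empty_num <- bind_rows(placeholder)[0, ]
empty_num
```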
How to create a one-row data frame from a vector in R?
We could use unnest_wider after returning the output as a list column in summarise:
library(dplyr)
library(tidyr)
library(stringr)  # for str_replace()
mtcars %>%
group_by(cyl) %>%
summarise(out = list(boxplot.stats(wt)$stats)) %>%
unnest_wider(out) %>%
rename_at(-1, ~ str_replace(., '\\.+', 'x'))
# A tibble: 3 x 6
# cyl x1 x2 x3 x4 x5
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 1.51 1.88 2.2 2.62 3.19
#2 6 2.62 2.82 3.22 3.44 3.46
#3 8 3.17 3.52 3.76 4.07 4.07
Or, if we want to use the OP's method, set the names for that vector and use as_tibble_row:
library(tibble)   # for as_tibble_row()
library(stringr)  # for str_c()
mtcars %>%
group_by(cyl) %>%
group_map(~ tibble(cyl = first(.x$cyl),
setNames(boxplot.stats(.$wt)$stats, str_c('x', 1:5)) %>%
as_tibble_row) , .keep = TRUE) %>%
bind_rows
# A tibble: 3 x 6
# cyl x1 x2 x3 x4 x5
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 1.51 1.88 2.2 2.62 3.19
#2 6 2.62 2.82 3.22 3.44 3.46
#3 8 3.17 3.52 3.76 4.07 4.07
As the output of group_map is always a list, it may be better to use group_modify, which returns a tbl and so avoids the final bind_rows:
mtcars %>%
group_by(cyl) %>%
group_modify(~ setNames(boxplot.stats(.$wt)$stats, str_c('x', 1:5)) %>%
as_tibble_row , .keep = TRUE) %>%
ungroup
# A tibble: 3 x 6
# cyl x1 x2 x3 x4 x5
# <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 4 1.51 1.88 2.2 2.62 3.19
#2 6 2.62 2.82 3.22 3.44 3.46
#3 8 3.17 3.52 3.76 4.07 4.07
What is the tidyverse way to apply a function designed to take data.frames as input across a grouped tibble in R?
We can use the OP's function in group_modify
library(dplyr)
MyDF %>%
group_by(Fruit) %>%
group_modify(~ .x %>%
summarise(MyVal = myFun(.x))) %>%
ungroup
Output:
# A tibble: 2 × 2
Fruit MyVal
<chr> <int>
1 Apple 42925
2 Mango 295425
Or in group_map, where .y is a one-row tibble holding the grouping column:
MyDF %>%
group_by(Fruit) %>%
group_map(~ bind_cols(.y, MyVal = myFun(.))) %>%
bind_rows
# A tibble: 2 × 2
Fruit MyVal
<chr> <int>
1 Apple 42925
2 Mango 295425
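Since MyDF and myFun are not shown in the answer, here is a runnable sketch with stand-ins. Both the data and the function body are assumptions: any function that takes a data frame and returns a single value should fit the same pattern.

```r
library(dplyr)

# Hypothetical data and data-frame-based function
MyDF <- data.frame(Fruit = c("Apple", "Apple", "Mango"),
                   Qty   = c(1L, 2L, 5L),
                   stringsAsFactors = FALSE)
myFun <- function(d) sum(d$Qty)

# group_modify() calls the lambda once per group; .x is that group's
# rows without the grouping column, and the result must be a data frame
res <- MyDF %>%
  group_by(Fruit) %>%
  group_modify(~ summarise(.x, MyVal = myFun(.x))) %>%
  ungroup()
res
```

group_modify adds the grouping column back automatically, which is why the lambda only needs to return MyVal.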
Does the pipe operator turn a data frame into a tibble? If so, how can I prevent it?
I think you just need to convert the object to a data.frame before you can use rownames_to_column:
library(tidyverse)  # dplyr, tidyr, tibble, and forcats (for gss_cat)
gss_cat %>%
group_by(race, partyid) %>%
summarise(Freq = n()) %>%
pivot_wider(names_from = race, values_from = Freq) %>%
drop_na() %>%
column_to_rownames("partyid") %>%
chisq.test(.) %>%
`[[`("residuals") %>%
as.data.frame() %>%
rownames_to_column("partyid")
# partyid Other Black White
# 1 No answer 2.923648 2.8649510 -2.262282
# 2 Other party -2.311635 -4.6581430 2.834128
# 3 Strong republican -8.950509 -15.3086303 9.782022
# 4 Not str republican -7.246500 -16.8275514 9.856563
# 5 Ind,near rep -3.546661 -10.4554526 5.793770
# 6 Independent 12.196368 -4.4484394 -2.272619
# 7 Ind,near dem 3.783087 -0.6287861 -1.033036
# 8 Not str democrat 5.478600 8.9946078 -5.823392
# 9 Strong democrat -5.339841 32.7172447 -12.447571
Convert any number of vectors into a dataframe whilst preserving data types and using vector names as column names in R
This works fine with data.frame. You just need to add the argument stringsAsFactors = FALSE (already the default since R 4.0.0).
df = data.frame(var_a, var_b, var_c, stringsAsFactors = FALSE)
sapply(df, class)
var_a var_b var_c
"character" "numeric" "factor"
Replacement of column values based on a named vector
You could index vec by col:
df$col1 <- vec[as.character(df$col)]
Or in mutate:
library(dplyr)
df %>% mutate(col1 = vec[as.character(col)])
# col col1
# <int> <chr>
# 1 1 a
# 2 1 a
# 3 1 a
# 4 1 a
# 5 2 b
# 6 2 b
# 7 3 c
# 8 3 c
# 9 3 c
#10 3 c
#11 3 c
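For completeness, a base R sketch with assumed inputs: vec and df are not defined in the answer, so the values below are guesses that reproduce the shape of the output. The unname() call is my addition to keep the new column free of names.

```r
# Hypothetical lookup vector: names are the old values, elements the new ones
vec <- c("1" = "a", "2" = "b", "3" = "c")
df <- data.frame(col = c(1L, 1L, 2L, 3L, 3L))

# as.character() makes the lookup go by name rather than by position
df$col1 <- unname(vec[as.character(df$col)])
df
```

The conversion matters when the keys are not 1, 2, 3, ...: indexing with integers would pick elements by position and could silently return the wrong values.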