Extract a Dplyr Tbl Column as a Vector

With dplyr >= 0.7.0, you can use pull() to get a vector from a tbl.



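library(dplyr, warn.conflicts = FALSE)

# In-memory SQLite database; needs the DBI, RSQLite and dbplyr packages.
# A plain DBI connection replaces the older src_sqlite() interface.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")
iris2 <- copy_to(con, iris)

# pull() collects the remote column and returns it as a plain vector
vec <- pull(iris2, Species)
head(vec)
#> [1] "setosa" "setosa" "setosa" "setosa" "setosa" "setosa"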
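
pull() also works on ordinary in-memory data frames, and the column can be given by name or by position. The following is a minimal sketch using the built-in iris data; the variable names by_name, by_pos and last_col are only illustrative. Note that on a local data frame Species stays a factor, whereas the SQLite round trip above returns character values.

```r
library(dplyr, warn.conflicts = FALSE)

# Three ways to name the same column for pull()
by_name  <- pull(iris, Species)  # bare column name
by_pos   <- pull(iris, 5)        # positive integer: position from the left
last_col <- pull(iris, -1)       # negative integer: position from the right

class(by_name)
identical(by_name, by_pos)
identical(by_name, last_col)
```

The negative index is handy at the end of a pipeline when the column of interest was just created and is therefore the last one.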

dplyr::select one column and output as vector

The best way to do it (IMO):

library(dplyr)
df <- tibble(x = 1:10, y = LETTERS[1:10])  # data_frame() is deprecated; use tibble()

df %>%
  filter(x > 5) %>%
  .$y

From dplyr 0.7.0 onward, you can use pull() instead:

df %>% filter(x > 5) %>% pull(y)
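
To check that these approaches really agree, here is a small sketch that compares them on the same filtered data. The names filtered, via_dollar, via_bracket and via_pull are made up for this illustration.

```r
library(dplyr, warn.conflicts = FALSE)

df <- tibble(x = 1:10, y = LETTERS[1:10])
filtered <- df %>% filter(x > 5)

via_dollar  <- filtered$y            # base R `$`
via_bracket <- filtered[["y"]]       # base R `[[`
via_pull    <- filtered %>% pull(y)  # dplyr >= 0.7.0

via_pull
identical(via_dollar, via_pull)
identical(via_bracket, via_pull)
```

All three return the same character vector; pull() mainly earns its place because it reads naturally at the end of a pipe.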

Extract a single dplyr tbl_df row as a vector

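From the "Introduction to dplyr" vignette: "All of the dplyr functions take a data frame (or tibble) as the first argument." So there is no need to convert mtcars into a tibble first. Furthermore, as.numeric() is more concise than unlist(., use.names = FALSE). It works here because a one-row data frame is a list of length-one columns, all of them numeric.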

library(dplyr)
mtcars %>%
  slice(2) %>%
  as.numeric()
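
The two conversions differ in one detail that may matter: as.numeric() drops the column names, while unlist() keeps them by default. A short sketch, with row2, unnamed and named as illustrative variable names:

```r
library(dplyr, warn.conflicts = FALSE)

row2 <- mtcars %>% slice(2)

unnamed <- as.numeric(row2)  # plain numeric vector, no names
named   <- unlist(row2)      # numeric vector named after the columns

unnamed
names(named)
```

Both rely on every column being numeric. With mixed column types, unlist() would coerce everything to a common type, so check the result before using it.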

Conveniently extract a named vector from data.frame using dplyr/tidy?

The pull.data.frame method already accepts a second argument that supplies the names. I thought this was available earlier, but it may only exist from dplyr 1.0 onward, in which case you would need to install from the tidyverse/dplyr GitHub repo.

iris %>%
  arrange(Sepal.Length) %>%
  pull(Sepal.Length, Species)
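
If the installed dplyr does not support the naming argument, the same named vector can be built in other ways. The sketch below is one possibility, not the only one: it uses the first three rows of mtcars with the row names moved into a column called model (an illustrative name), and compares pull() with base R setNames() and tibble::deframe().

```r
library(dplyr, warn.conflicts = FALSE)

# Small example table: car model plus miles per gallon
cars <- mtcars %>%
  tibble::rownames_to_column("model") %>%
  head(3)

with_pull     <- cars %>% pull(mpg, name = model)   # needs the name argument
with_setnames <- setNames(cars$mpg, cars$model)     # base R
with_deframe  <- cars %>%
  select(model, mpg) %>%                            # names first, values second
  tibble::deframe()

with_pull
```

deframe() expects a two-column table with the names in the first column, which is why the select() call puts model before mpg.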

How to extract a vector in a tibble column to multiple columns in the same tibble?

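We need to store the names of the 'cut' variable in a new column, unnest the list elements, and then use spread() to reshape the result to 'wide' format. Note that by_slice() has since moved from purrr to the purrrlyr package, and fun() is the asker's own function, which is not shown here.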

mtcars %>%
  group_by(cyl) %>%
  by_slice(~fun(.x$hp, .x$gear)) %>%
  rename(cut = .out) %>%
  mutate(Names = map(cut, ~factor(names(.x), levels = names(.x)))) %>%
  unnest %>%
  spread(Names, cut)
# A tibble: 3 x 7
#    cyl `[50,100)` `[100,150)` `[150,200)` `[200,250)` `[250,300)` `[300,350)`
#* <dbl>      <dbl>       <dbl>       <dbl>       <dbl>       <dbl>       <dbl>
#1     4         36           9          NA          NA          NA          NA
#2     6         NA          22           5          NA          NA          NA
#3     8         NA          NA          21          15           5           5

Selecting a single column from a tibble still returns a tibble instead of a vector

Try pull(), which returns the column as a vector instead of a one-column tibble:

sen <- df %>%
  filter(my_dummy == 0) %>%
  pull(col_name)
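
The difference is easy to see side by side. This sketch uses a small made-up tibble; the data and the variable names as_tibble_col and as_vector are hypothetical, and only my_dummy and col_name come from the question.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for the asker's data
df <- tibble(my_dummy = c(0, 1, 0), col_name = c("a", "b", "c"))

as_tibble_col <- df %>% filter(my_dummy == 0) %>% select(col_name)  # still a tibble
as_vector     <- df %>% filter(my_dummy == 0) %>% pull(col_name)    # plain vector

class(as_tibble_col)
class(as_vector)
as_vector
```

The same applies to single-bracket indexing: df[, "col_name"] on a tibble also stays a tibble, unlike on a base data.frame.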

dplyr - get column values as character vector

This should work.

df %>% filter(var == 'Mileage') %>% arrange(desc(value)) %>%
  pull(CAR_MODEL) %>% as.character()

Result

[1] "Nissan Sunny"   "Suzuki Ciaz"    "Renault Duster" "Toyota Corolla"

Extract the single value from a 1 x 1 data.frame produced with dplyr as a vector?

You can extract the value as a vector with df[[1, 1]].

Output

> df[[1,1]]
[1] 1

Here is a simple example with test data that shows the difference between single and double brackets:

df1 <- data.frame(a = c(1,2,3), b = c(4,5,6))

Output

> df1['a']
  a
1 1
2 2
3 3
> df1[['a']]
[1] 1 2 3
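
For a 1 x 1 result produced by dplyr, the same idea can be sketched as follows. The summary used here (a row count of mtcars stored in a column named n) is only an example chosen for illustration.

```r
library(dplyr, warn.conflicts = FALSE)

# A 1 x 1 data frame: the number of rows in mtcars
one_by_one <- mtcars %>% summarise(n = n())

by_index <- one_by_one[[1, 1]]  # row 1, column 1
by_pull  <- pull(one_by_one, n) # dplyr equivalent
by_name  <- one_by_one$n        # base R `$`

by_index
```

All three give a length-one integer vector rather than a data frame.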

correlation of a vector across all column in R (dplyr)

Because cor() requires x and y to have the same length, you cannot group the rows: each group would then contain fewer than four elements and could not be matched against the four values in y.

Prepare data and library

library(dplyr)

gdf <- tibble(g = c(1, 1, 2, 3), v1 = 10:13, v2 = 20:23)

y <- rnorm(4)
y
#> [1] 0.59390132 0.91897737 0.78213630 0.07456498

mutate()

If you want to keep v1 and v2 in the output, use mutate() with the .names argument to name the new columns. {.col} stands for the name of the column that across() is currently processing.

gdf %>% mutate(across(v1:v2, ~ cor(.x, y), .names = "{.col}_cor"))

# A tibble: 4 x 5
      g    v1    v2 v1_cor v2_cor
  <dbl> <int> <int>  <dbl>  <dbl>
1     1    10    20 -0.591 -0.591
2     1    11    21 -0.591 -0.591
3     2    12    22 -0.591 -0.591
4     3    13    23 -0.591 -0.591

summarise()

If you only want the cor() output in the result, use summarise() instead:

gdf %>% summarize(across(v1:v2, ~ cor(.x, y)))

# A tibble: 1 x 2
      v1     v2
   <dbl>  <dbl>
1 -0.591 -0.591
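
Because y comes from rnorm(), the numbers above change on every run. A reproducible variant might look like the sketch below, which fixes the seed (42 is an arbitrary choice) and cross-checks across() against direct cor() calls. The names result and manual are illustrative, and across() assumes dplyr 1.0.0 or later.

```r
library(dplyr, warn.conflicts = FALSE)

set.seed(42)  # arbitrary seed, only so the draw is repeatable

gdf <- tibble(g = c(1, 1, 2, 3), v1 = 10:13, v2 = 20:23)
y <- rnorm(4)

# Correlate each selected column with y
result <- gdf %>% summarise(across(v1:v2, ~ cor(.x, y)))

# The same computation written out by hand
manual <- c(v1 = cor(gdf$v1, y), v2 = cor(gdf$v2, y))

result
manual
```

v1 and v2 differ only by a constant offset, which is why both columns always share the same correlation with y.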

