Add New Variable to List of Data Frames with Purrr and Mutate() from Dplyr

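The issue is that mutate() needs an explicit reference to the data when it is not called inside a pipe. To do this, I'd suggest using map2_df, passing both the list and its names: each data frame arrives as .x, its name as .y, and the results are row-bound into one data frame.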

dff <- map2_df(comentarios, names(comentarios), ~ mutate(.x, ID = .y)) 
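To try the idea out, here is a minimal sketch with made-up data; the contents of comentarios below are hypothetical stand-ins, not the asker's data. It uses imap() followed by bind_rows() rather than map2_df(), since the *_df variants are superseded as of purrr 1.0.0, and it also shows bind_rows(.id = ), which builds the same column from the list names in one step.

```r
library(dplyr)
library(purrr)

# Hypothetical stand-in for the named list of data frames in the question
comentarios <- list(
  post_a = data.frame(texto = c("hola", "gracias")),
  post_b = data.frame(texto = "adios")
)

# imap() passes each element as .x and its name as .y
dff <- imap(comentarios, ~ mutate(.x, ID = .y)) %>%
  bind_rows()

# Same ID values, taken straight from the list names while binding
dff2 <- bind_rows(comentarios, .id = "ID")
```

The only difference between the two results is column order: bind_rows(.id = ) puts ID first.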

Dataframe name to column in list of dataframes using purrr

Another way is to build the list with lst() instead of list(), which names the elements automatically, and then use imap(), which passes each name to the function as .y.

library(tidyverse)
my_list <- lst(batch_1, batch_2, batch_3)
purrr::imap(my_list, ~mutate(.x, batch = .y))

# $batch_1
# A B batch
# 1 1 4 batch_1
# 2 2 5 batch_1
# 3 3 6 batch_1

# $batch_2
# A B batch
# 1 1 4 batch_2
# 2 2 5 batch_2
# 3 3 6 batch_2

# $batch_3
# A B batch
# 1 1 4 batch_3
# 2 2 5 batch_3
# 3 3 6 batch_3
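The difference between list() and lst() matters here because imap() falls back to the element position when the list has no names. A small sketch, assuming batch_1 and batch_2 look like the frames in the output above (their definitions are not given in the original):

```r
library(tibble)
library(dplyr)
library(purrr)

# Assumed inputs, reconstructed from the printed output
batch_1 <- data.frame(A = 1:3, B = 4:6)
batch_2 <- data.frame(A = 1:3, B = 4:6)

plain <- list(batch_1, batch_2)  # no names attached
named <- lst(batch_1, batch_2)   # named after the objects

# With an unnamed list, .y is the position (1, 2, ...)
by_index <- imap(plain, ~ mutate(.x, batch = .y))

# With a named list, .y is the element name
by_name <- imap(named, ~ mutate(.x, batch = .y))
```

So with a plain list() the batch column would hold 1 and 2 instead of the data frame names.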

Map over list of dataframes and apply custom mutate-function (purrr, dplyr)

You can also use the following solution.

  • First we define a function that takes a data set and a number of arguments. We explicitly use the data argument for our data set and capture all the other arguments through ...
  • We then use enquos(), which returns a list of quosures (defused expressions paired with their environment), to capture the expressions passed through ... The !!! operator, normally used to splice a list of arguments into a call, injects them into eval_tidy(), which evaluates them in the context of our data set data
  • We then iterate over each element of the list and apply our function to every one of them, evaluating our desired expression each time
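
library(dplyr)
library(purrr)
library(rlang)

fn <- function(data, ...) {
  # Defuse the expressions passed through ... into a list of quosures
  args <- enquos(...)

  data %>%
    mutate(out = eval_tidy(!!!args, data = data))
}

# df is assumed to be a list of data frames; map_dfr() row-binds the results
df %>%
  map_dfr(~ .x %>% fn(tp / (tp + fn)))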

# A tibble: 11 x 5
fn fp tn tp out
<int> <int> <int> <int> <dbl>
1 0 34 0 34 1
2 1 26 8 33 0.971
3 3 22 12 31 0.912
4 5 7 27 29 0.853
5 5 3 31 29 0.853
6 7 1 33 27 0.794
7 8 0 34 26 0.765
8 8 0 34 26 0.765
9 8 0 34 26 0.765
10 30 0 34 4 0.118
11 34 0 34 0 0
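When only a single expression needs to be forwarded, the embrace operator {{ }} is a shorter alternative to enquos() plus eval_tidy(). This is a sketch of a different technique, not the answer's method; the helper name add_out and the data in dfs are hypothetical.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: forwards one expression into mutate()
add_out <- function(data, expr) {
  mutate(data, out = {{ expr }})
}

# Made-up list of data frames with the columns used in the expression
dfs <- list(
  a = data.frame(tp = c(34L, 33L), fn = c(0L, 1L)),
  b = data.frame(tp = c(4L, 0L), fn = c(30L, 34L))
)

# Apply the helper to every element, then row-bind the results
res <- map(dfs, ~ add_out(.x, tp / (tp + fn))) %>%
  bind_rows()
```

Inside mutate(), tp and fn resolve to the columns of each data frame, so the same expression is re-evaluated for every list element.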

How to mutate a list-column using purrr::map() to store a recipe object created via recipe()?

The way you are nesting creates an odd structure. You have put the whole data frame into a single column and then nested that, so after pulling it out you still have a 90 × 1 tibble whose only column is itself a data frame.

tibble(subset_training = data_set_training) %>%
  nest(subset_training = subset_training) %>%
  pull(subset_training) %>%
  first()
#> # A tibble: 90 × 1
#> subset_training$Sepal.Length $Sepal.Width $Petal.Length $Petal.Width $Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 6.3 3.3 6 2.5 virgini…
#> 2 6 2.2 4 1 versico…
#> 3 5.7 2.8 4.5 1.3 versico…
#> 4 7.2 3.6 6.1 2.5 virgini…
#> 5 5 3.5 1.3 0.3 setosa
#> 6 5.1 3.8 1.6 0.2 setosa
#> 7 7.2 3.2 6 1.8 virgini…
#> 8 5.7 4.4 1.5 0.4 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 5.2 3.4 1.4 0.2 setosa
#> # … with 80 more rows

Here's how you should be nesting it.

data_set_training %>%
  nest(subset_training = everything()) %>%
  pull(subset_training) %>%
  first()
#> # A tibble: 90 × 5
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> <dbl> <dbl> <dbl> <dbl> <fct>
#> 1 6.3 3.3 6 2.5 virginica
#> 2 6 2.2 4 1 versicolor
#> 3 5.7 2.8 4.5 1.3 versicolor
#> 4 7.2 3.6 6.1 2.5 virginica
#> 5 5 3.5 1.3 0.3 setosa
#> 6 5.1 3.8 1.6 0.2 setosa
#> 7 7.2 3.2 6 1.8 virginica
#> 8 5.7 4.4 1.5 0.4 setosa
#> 9 4.4 2.9 1.4 0.2 setosa
#> 10 5.2 3.4 1.4 0.2 setosa
#> # … with 80 more rows

Then you get the results you're looking for:

data_set_training %>%
  nest(subset_training = everything()) %>%
  mutate(iris_recipe = map(
    .x = subset_training,
    .f = ~recipe(x = .x, Species ~ .)
  ))
#> # A tibble: 1 × 2
#> subset_training iris_recipe
#> <list> <list>
#> 1 <tibble [90 × 5]> <recipe>
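The same nest-then-map pattern extends to one list-column entry per group. The sketch below uses the built-in iris data instead of data_set_training (which is not defined in the original), and map_int(nrow) stands in for the recipe() call so that it runs without the recipes package.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Nest every column except Species: one row (and one tibble) per species
nested <- iris %>%
  nest(data = -Species) %>%
  # Apply a function to each nested tibble; nrow() is a placeholder here
  mutate(n_rows = map_int(data, nrow))
```

Swapping map_int(data, nrow) for map(data, some_function) keeps the result as a list-column, which is the shape needed for objects such as recipes.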

R - using map to apply a list function to dataframe column and create new columns with elements of the list

You can use cbind and str_detect, with map_dfc:

library(dplyr)
library(purrr)
library(stringr)

cbind(txt, map_dfc(food_list, ~ +str_detect(txt$eats, .x)) %>% set_names(food_list))

id eats apple oats chocolate
1 1 apple, oats, banana, milk, sugar 1 1 0
2 2 oats, banana, sugar 0 1 0
3 3 chocolate, milk, sugar 0 0 1
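For a self-contained version, here is a sketch with txt and food_list reconstructed from the printed output (an assumption, since the original inputs are not shown). It names the vector up front with set_names() and uses plain map(), which avoids relying on the superseded map_dfc(); as.integer() plays the role of the unary + above.

```r
library(purrr)
library(stringr)

# Inputs assumed from the output shown above
txt <- data.frame(
  id = 1:3,
  eats = c("apple, oats, banana, milk, sugar",
           "oats, banana, sugar",
           "chocolate, milk, sugar")
)
food_list <- c("apple", "oats", "chocolate")

# One 0/1 vector per food, named after the food it tests for
flags <- map(set_names(food_list),
             ~ as.integer(str_detect(txt$eats, fixed(.x))))

result <- cbind(txt, as.data.frame(flags))
```

fixed() treats each food as a literal string rather than a regular expression, which is usually what is wanted for plain keywords.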

Purrr - conditionally mutate a column in a list of data frames when it exists

You can first check whether both A and B are among the column names. If they are, check whether the first character of B (str_sub(B, 1, 1)) differs from A, and if it does, combine A and B.

With map_if, as suggested by @Moody_Mudskipper:

df_ls %>%
  map_if(~ all(c("A", "B") %in% colnames(.x)),
         ~ mutate(.x, B = if_else(str_sub(B, 1, 1) != A, paste(A, B), B)))

More verbose:

df_ls %>%
  map(~ {
    if (all(c("A", "B") %in% colnames(.x))) {
      .x %>%
        mutate(B = if_else(str_sub(B, 1, 1) != A, paste(A, B), B))
    } else {
      .x
    }
  })

# $df1
# # A tibble: 5 x 3
# id A B
# <int> <chr> <chr>
# 1 1 A A j
# 2 2 B B k
# 3 3 C C l
# 4 4 D D m
# 5 5 E E n
#
# $df2
# # A tibble: 3 x 3
# id A B
# <int> <chr> <chr>
# 1 1 A A j
# 2 2 B B k
# 3 3 C C l
#
# $df3
# # A tibble: 6 x 2
# id B
# <int> <chr>
# 1 1 A j
# 2 2 B k
# 3 3 C l
# 4 4 D m
# 5 5 E n
# 6 6 F o
#
# $df4
# # A tibble: 4 x 2
# id C
# <int> <chr>
# 1 1 O t
# 2 2 P u
# 3 3 Q v
# 4 4 R w
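To see which elements map_if() touches, here is a small runnable sketch. The contents of df_ls are made up for illustration (the original input is not shown); only df1 has both columns, so only df1 is modified.

```r
library(tibble)
library(dplyr)
library(purrr)
library(stringr)

# Hypothetical input: df1 has A and B, df3 only B, df4 neither
df_ls <- list(
  df1 = tibble(id = 1:2, A = c("A", "B"), B = c("j", "B k")),
  df3 = tibble(id = 1:2, B = c("A j", "B k")),
  df4 = tibble(id = 1:2, C = c("O t", "P u"))
)

out <- map_if(
  df_ls,
  # Predicate: does this data frame have both columns?
  ~ all(c("A", "B") %in% colnames(.x)),
  # Only prefix B with A where B does not already start with A
  ~ mutate(.x, B = if_else(str_sub(B, 1, 1) != A, paste(A, B), B))
)
```

Elements that fail the predicate are returned unchanged, so no else branch is needed.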

R: How to apply a similar mutate() to multiple data frames with purrr without creating a list?

There is a way that will work with or without purrr.

  1. Convert your data.frames to data.tables.
  2. Use mapply (or purrr's equivalent) to apply the same operation to all tables.
  3. You can ignore the output of mapply, because := modifies data.tables by reference, without assignment to a new variable.

library(data.table)
xk <- as.data.table(xk)
al <- as.data.table(al)
mne <- as.data.table(mne)

mapply(function(x,y) x[,country:=y], x=list(xk,al,mne), y=c("Kosovo","Albania","Montenegro"))

print(xk)
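A compact way to convince yourself that the originals really change is the sketch below. The tables are made-up stand-ins for xk and al, and invisible() is added only to suppress the return value of mapply.

```r
library(data.table)

# Hypothetical stand-ins for the tables in the question
xk <- data.table(year = 2020:2021)
al <- data.table(year = 2020:2021)

# := adds the column in place, so xk and al themselves are updated
invisible(mapply(
  function(x, y) x[, country := y],
  x = list(xk, al),
  y = c("Kosovo", "Albania")
))

names(xk)
```

After the call, xk and al each carry a country column even though nothing was assigned back to them.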

How can I mutate and create a new variable using for loops

Try this:

lapply(hh02, \(x) mutate(x, hhid = xa*10^5 + hoso))

Note that this returns a list of data frames with the new column added; it does not change hh02 or the original frames that were placed in hh02.

If you want to change the original frames, you could do something like this, where hh02 holds the frames' names rather than the frames themselves:

hh02 <- c("exp_02", "m1_02")
for (h in hh02) {
  assign(h, mutate(get(h), hhid = xa * 10^5 + hoso))
}
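An alternative to the assign()/get() loop, sketched under the assumption that the frames live in the current environment, is to collect them with mget(), transform the list, and write the results back with list2env(). The data and the name frame_names are made up for illustration.

```r
library(dplyr)

# Hypothetical frames with the columns used in the formula
exp_02 <- data.frame(xa = c(1, 2), hoso = c(10, 20))
m1_02 <- data.frame(xa = 3, hoso = 30)

frame_names <- c("exp_02", "m1_02")

# mget() returns a named list of the objects with these names
updated <- lapply(
  mget(frame_names),
  function(x) mutate(x, hhid = xa * 10^5 + hoso)
)

# Write each list element back to the variable of the same name
invisible(list2env(updated, envir = environment()))
```

This keeps the transformation in one lapply() call while still updating the original variables.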

R: Apply mutate() to multiple data frames contained in a list, using each data frame's name as argument

Two possibilities using purrr and dplyr:

dfs %>%
  imap(~ mutate(.x, goal = .y))

dfs %>%
  map2(names(dfs),
       ~ mutate(.x, goal = .y))

and one base R way:

lapply(seq_along(dfs), function(n) transform(dfs[[n]], goal = names(dfs)[n]))
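One detail of the base R version: iterating over seq_along(dfs) returns an unnamed list. If the names should be kept, Map() is a base R option, sketched here with a made-up dfs:

```r
# Hypothetical named list of data frames
dfs <- list(
  red = data.frame(x = 1:2),
  blue = data.frame(x = 3:4)
)

# Iterating over positions drops the list names
by_pos <- lapply(seq_along(dfs),
                 function(n) transform(dfs[[n]], goal = names(dfs)[n]))

# Map() iterates over dfs itself, so the result keeps its names
by_map <- Map(function(df, nm) transform(df, goal = nm), dfs, names(dfs))
```

Both results contain the same data frames; only the names of the outer list differ.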

