Add Column Containing Data Frame Name to a List of Data Frames

Your list is unnamed. You can either name it manually when creating it:

my_list = list(data = data, data2 = data2, data3 = data3)

Or, if you have many data sets, you can build the named list with a combination of mget and ls:

my_list <- mget(ls(pattern = "^data$|^data\\d+$"))
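
As a minimal sketch of what the mget/ls combination does, the example below runs it inside a scratch environment so that it does not depend on the contents of your workspace. The environment `env`, the toy values, and the extra object `other` are assumptions made up for illustration.

```r
# Toy objects (hypothetical values) in a scratch environment.
env <- new.env()
env$data  <- data.frame(column1 = c(12, 27), column2 = c(27, 987))
env$data2 <- data.frame(column1 = c(1, 2), column2 = c(3, 4))
env$other <- 1:3  # name does not match the pattern, so it is skipped

# ls() returns the matching object names; mget() fetches those objects
# and returns them as a list named after the objects.
my_list <- mget(ls(envir = env, pattern = "^data$|^data\\d+$"), envir = env)
names(my_list)
```

Because the names come from `ls()`, the list elements carry the original object names, which is what the later steps rely on.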

Afterwards, just use Map:

my_list <- Map(cbind, my_list, new_column = names(my_list))
my_list
# $data
#   column1 column2 new_column
# 1      12      27       data
# 2      27     987       data
# 3     378    1234       data
#
# $data2
#   column1 column2 new_column
# 1      12      27      data2
# 2      27     987      data2
# 3     378    1234      data2
#
# $data3
#   column1 column2 new_column
# 1      12      27      data3
# 2      27     987      data3
# 3     378    1234      data3

# If you want to put the data sets back into the global environment, you can use `list2env`:
# list2env(my_list, .GlobalEnv)
# Please note that moving data frames to the global environment and back is usually not the preferred practice. It is better to store all your data sets in a list from the very beginning and manipulate them within the list using functions such as `Map`, `lapply`, etc.
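
To illustrate the advice about keeping the data sets in a list, here is a small sketch that builds the list up front and changes every data frame in place with `lapply` and `Map`. The column `total` and the anonymous helper functions are assumptions added for illustration; the values mirror the printed output above.

```r
# Store the data frames in a named list from the start.
my_list <- list(
  data  = data.frame(column1 = c(12, 27, 378), column2 = c(27, 987, 1234)),
  data2 = data.frame(column1 = c(12, 27, 378), column2 = c(27, 987, 1234))
)

# lapply() applies the same change to every data frame and keeps the names.
my_list <- lapply(my_list, function(df) {
  df$total <- df$column1 + df$column2  # hypothetical derived column
  df
})

# Map() walks the data frames and their names in parallel.
my_list <- Map(function(df, nm) {
  df$new_column <- nm
  df
}, my_list, names(my_list))
```

Assigning the name with `$<-` rather than `cbind` keeps the new column a character vector on every R version.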

Append a new column using the name of dataframe from list of dataframe in R

Method 1: tidyverse solution with map2

library(tidyverse)
map2(l, names(l), ~ mutate(.x, new_col = .y))

Output:

$a
  a_1 a_2 new_col
1  11  13       a
2  12  14       a

$b
  b_1 b_2 new_col
1  21  23       b
2  22  24       b

$c
  c_1 c_2 new_col
1  31  33       c
2  32  34       c

Method 2: for loop solution

(printing l afterwards gives the same output as above):

for (i in seq_along(l)) {
  l[[i]]$new_col <- names(l)[i]
}
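
For a self-contained check, the sketch below rebuilds a list `l` from the values shown in the printed output (an assumption, since the original construction of `l` is not shown) and compares the loop with a base R `Map` equivalent. The names `looped` and `mapped` are made up for illustration.

```r
# Input list reconstructed from the printed output above.
l <- list(
  a = data.frame(a_1 = c(11, 12), a_2 = c(13, 14)),
  b = data.frame(b_1 = c(21, 22), b_2 = c(23, 24)),
  c = data.frame(c_1 = c(31, 32), c_2 = c(33, 34))
)

# for loop: modify each element by position.
looped <- l
for (i in seq_along(looped)) {
  looped[[i]]$new_col <- names(looped)[i]
}

# Map: iterate over data frames and names in parallel.
mapped <- Map(function(df, nm) {
  df$new_col <- nm
  df
}, l, names(l))

all.equal(looped, mapped)
```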

Add new column name to a list of data frames from a part of the file name using lapply

Name the list of filenames using setNames(), then use the .id argument in bind_rows(), which adds a column containing list names.

library(tidyverse)
library(readxl)

# Note: pattern is a regular expression, not a glob
files <- list.files(path = "Users/Desktop/week", pattern = "\\.xlsx$", full.names = TRUE) %>%
  setNames(nm = .) %>%
  lapply(read_excel, sheet = 4, skip = 39) %>%
  bind_rows(.id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))

You could also combine the iteration and row-binding steps using purrr::map_dfr():

# Note: pattern is a regular expression, not a glob
files <- list.files(path = "Users/Desktop/week", pattern = "\\.xlsx$", full.names = TRUE) %>%
  setNames(nm = .) %>%
  map_dfr(read_excel, sheet = 4, skip = 39, .id = "Week") %>%
  mutate(Week = str_extract(Week, "wk\\d+"))
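
The naming and extraction steps can be tried without any Excel files. The sketch below uses base R only; the file names are hypothetical, and `regmatches()` with `regexpr()` stands in for `str_extract()`.

```r
# Hypothetical file paths; the real ones would come from list.files().
paths <- c("Users/Desktop/week/report_wk1.xlsx",
           "Users/Desktop/week/report_wk12.xlsx")

# setNames(nm = x) names a vector by its own values, so each element
# read from a file can later be labelled with the path it came from.
named <- setNames(nm = paths)

# Pull the "wk<digits>" part out of each file name.
files_only <- basename(paths)
week <- regmatches(files_only, regexpr("wk\\d+", files_only))
week
```

One difference to keep in mind: `regmatches()` drops elements that have no match, whereas `str_extract()` returns `NA` for them.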

Adding a column to every dataframe in a list with the name of the list element

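There is no need to mutate; just bind the data frames with dplyr's bind_rows, which returns a single combined data frame with the list names stored in the .id column:

library(tidyverse)
my.list %>%
  bind_rows(.id = "groups")

This requires that the list is named; for an unnamed list, bind_rows() fills the .id column with a numeric sequence instead.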
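
If dplyr is not available, a comparable result can be sketched in base R. This is a stand-in for the same idea rather than what `bind_rows()` does internally; the toy list and the names `with_id` and `combined` are assumptions.

```r
# Toy named list (hypothetical data).
my.list <- list(
  g1 = data.frame(x = 1:2),
  g2 = data.frame(x = 3:4)
)

# Prepend the element name as a column, then stack the pieces.
with_id <- Map(function(df, nm) {
  data.frame(groups = nm, df, stringsAsFactors = FALSE)
}, my.list, names(my.list))

combined <- do.call(rbind, with_id)
rownames(combined) <- NULL  # drop the "g1.1"-style row names added by rbind
combined
```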

Dataframe name to column in list of dataframes using purrr

Another way is to build the list with lst() instead of list(), which automatically names each element after the object passed in, and then use imap(), which exposes those names directly as .y.

library(tidyverse)
my_list <- lst(batch_1, batch_2, batch_3)
purrr::imap(my_list, ~mutate(.x, batch = .y))

# $batch_1
#   A B   batch
# 1 1 4 batch_1
# 2 2 5 batch_1
# 3 3 6 batch_1

# $batch_2
#   A B   batch
# 1 1 4 batch_2
# 2 2 5 batch_2
# 3 3 6 batch_2

# $batch_3
#   A B   batch
# 1 1 4 batch_3
# 2 2 5 batch_3
# 3 3 6 batch_3

Add new column to each data.frame in list of data.frames

You just need to make your function return the data frame:

library(caret)  # provides createFolds()

foldfunc <- function(x) {
  folds <- createFolds(seq_len(nrow(x)), k = 10, list = FALSE)
  x$folds <- folds
  return(x)
}

In your code, the function was returning the folds. Because it had no explicit return value, R returned the value of the last expression it evaluated, which is why you were getting numeric vectors (the folds calculated by createFolds) instead of data frames.

If you try print(foldfunc(listdf[[1]])) with your original function, you will see this:

print(foldfunc(listdf[[1]]))
# [1] 1 2 3 4 5 6 7 8 9 10

With the new version, a data frame with a folds column will be provided.
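
The return-value behaviour can be reproduced without caret. In this sketch the functions `returns_vector` and `returns_frame`, the `id` column, and the toy `listdf` are all assumptions for illustration; `seq_len(nrow(x))` stands in for the fold assignment.

```r
# The last expression is an assignment, so the function returns
# the assigned value (the new vector), not the data frame.
returns_vector <- function(x) {
  x$id <- seq_len(nrow(x))
}

# Ending with x makes the modified data frame the return value.
returns_frame <- function(x) {
  x$id <- seq_len(nrow(x))
  x
}

listdf <- list(data.frame(v = c(10, 20, 30)), data.frame(v = c(1, 2)))

res_bad  <- lapply(listdf, returns_vector)  # list of integer vectors
res_good <- lapply(listdf, returns_frame)   # list of data frames
```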

R Add column into data.frame, that is in list of data.frames

You can use:

library(dplyr)
library(purrr)

hp50uppct <- myuptop %>%
  mutate(aPrices = map2(aPrices, topClose,
                        ~{.x$topPct = .x$Close/.y * 100; .x}))

glimpse(hp50uppct)
#Rows: 2
#Columns: 5
#$ symbol <chr> "IBM", "MMM"
#$ aPrices <list> [<tbl_df[171 x 8]>, <tbl_df[171 x 8]>]
#$ topNdx <dbl> 89, 73
#$ topDate <date> 2020-02-06, 2020-01-14
#$ topClose <dbl> 157, 181

Or using base R:

myuptop$aPrices <- Map(function(x, y) {x$topPct = x$Close/y * 100; x},
                       myuptop$aPrices, myuptop$topClose)
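
To see the parallel iteration in isolation, the sketch below applies the same function to a plain list and a numeric vector instead of columns of `myuptop`. The objects `prices` and `top_close` and the `Close` values are made up for illustration.

```r
# One data frame of prices per symbol (hypothetical values).
prices <- list(
  IBM = data.frame(Close = c(78.5, 157)),
  MMM = data.frame(Close = c(90.5, 181))
)
top_close <- c(157, 181)  # one reference price per data frame

# Map() pairs the i-th data frame with the i-th reference price.
prices <- Map(function(x, y) {
  x$topPct <- x$Close / y * 100
  x
}, prices, top_close)

prices$IBM
```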

Combining a list of data frames into a new data frame in R

Note that in your list of data frames (df_list) the columns all have different names (Area1, Area2, Area3), whereas in your expected output they are combined into a single column. So you need to rename them to a common name and then bind the data frames together.

library(dplyr)
library(purrr)

result <- map_df(df_list, ~.x %>%
                   rename_with(~"Area", contains('Area')), .id = 'FileName')
result

#   FileName Area
# 1 a1_areaX  100
# 2 a2_areaX  200
# 3 a3_areaX  300

Add columns to specific rows in a list of data frames from a data frame in R

There are 6 elements (data frames) in the list multidf and 6 rows in the matrix 'df3'. So, if the intention is to create two new columns in each list element, pairing each data frame with the corresponding row of 'df3':

multidf2 <- Map(cbind, multidf, lapply(asplit(df3, 1), as.list))

Or with apply:

multidf2 <- Map(cbind, multidf, apply(df3, 1, as.list))

If we only need to create two blank columns, with column names taken from the rows of 'df3':

multidf2 <- Map(function(x, y) {x[y] <- ''; x}, multidf, asplit(df3, 1))

