Combine Several Data Frames in the Global Environment by Row (Rbind)

Since you have already read the files in, you can try the following:

do.call(rbind, mget(ls(pattern = "df")))

The ls(pattern = "df") call should capture all of your "df.1", "df.2", and so on. If other objects in your workspace match the same pattern, tighten it (for example, anchor it as "^df\\.[0-9]+$") until the command lists only your data.frames.

mget() will bring all of these into a list on which you can use do.call(rbind, ...).
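
As a minimal sketch of the pattern-tightening advice above: the example below builds its objects in a scratch environment (`env`, a hypothetical name) instead of the global one, so it can be run without touching your workspace. The object names `df.1`, `df.2`, and `df_notes` are made up for illustration.

```r
# Scratch environment standing in for the global environment (assumption).
env <- new.env()
assign("df.1", data.frame(id = 1:2, value = c(10, 20)), envir = env)
assign("df.2", data.frame(id = 3:4, value = c(30, 40)), envir = env)
assign("df_notes", "not a data frame", envir = env)

# A loose pattern also picks up the unrelated object.
loose <- ls(env, pattern = "df")

# Anchoring the pattern keeps only names of the form df.<digits>.
strict <- ls(env, pattern = "^df\\.[0-9]+$")

# mget() returns a named list; do.call() passes its elements to one rbind() call.
combined <- do.call(rbind, mget(strict, envir = env))
```

To run the same thing against the global environment, drop the `env` arguments from `ls()` and `mget()`.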

Combining multiple data frames from the global environment

Here is one non-loop approach.

#Create name of dataframes
datanames <- paste0('durData_IBM_AskSide', 1:20)
#Combine them into one
combine_data <- do.call(rbind, mget(datanames))
#Remove them from global environment
rm(list = datanames)

How to merge all data frames in the global environment respectively?

Replace x with the actual id column in your data to avoid warnings or errors.

You can use either of two approaches:

base R

``` r
master1 <- data.frame(x = LETTERS[1:10], y1 = sample(1:100, 10))
master2 <- data.frame(x = LETTERS[1:10], y2 = sample(1:100, 10))
master10 <- data.frame(x = LETTERS[1:10], y3 = sample(1:100, 10))
# Collect every object whose name contains "master" into a list
DF_obj <- lapply(ls(pattern = "master"), get)

# Merge the data frames one at a time on the shared key column 'x',
# starting from the first data frame itself (not a one-element list)
gendf <- Reduce(function(.x, .y) merge(.x, .y, by = 'x'), x = DF_obj[-1], init = DF_obj[[1]])

# Put the columns in alphabetical order
gendf[, order(names(gendf))]
#>    x y1 y2  y3
#> 1  A 37 86  61
#> 2  B  3 23  89
#> 3  C 69 46  95
#> 4  D 16  9  54
#> 5  E 62 85  52
#> 6  F 19  5  35
#> 7  G 55 28  90
#> 8  H 40 52   5
#> 9  I  7 48 100
#> 10 J 48 16   9
```

tidyverse

master1 <- data.frame(x = LETTERS[1:10], y1 = sample(1:100,10))
master2 <- data.frame(x = LETTERS[1:10], y2 = sample(1:100,10))
master10 <- data.frame(x = LETTERS[1:10], y3 = sample(1:100,10))
DF_obj <- lapply(ls(pattern = ".*master"), get)

library(tidyverse)
purrr::reduce(DF_obj[-1], .init = DF_obj[1], ~ .x %>% as.data.frame() %>% left_join(.y, by = 'x'))
#>    x y1  y3 y2
#> 1  A 77  87 93
#> 2  B 10  18 74
#> 3  C  2  89 64
#> 4  D 89  98  5
#> 5  E 13  99 21
#> 6  F 74  25  4
#> 7  G 87   4 22
#> 8  H 62  27 17
#> 9  I 14  10 99
#> 10 J 21 100 78

Created on 2021-05-21 by the reprex package (v2.0.0)

Because the random seed was not fixed, the two reprexes produce different values.

How do I merge all data frames in the global environment?

Following @Osssan's comments, and assuming that you want to merge everything in your global workspace, first get the names of the objects and then retrieve the objects themselves into a list:

DF_obj <- lapply(ls(), get)

If you want to merge on all common variables (e.g. if all variable names are unique except the one(s) you want to merge on), then just

Reduce(merge, DF_obj)

should work.

Unfortunately (unlike lapply() etc.) Reduce doesn't have a ... argument for passing additional named arguments to a function, so Reduce(merge, DF_obj, by=common_variable) doesn't work; as @Osssan points out you need something like

mergefun <- function(x, y) merge(x, y, by = "common_variable")
merged_DF <- Reduce(mergefun, DF_obj)

As other commenters point out, if you just kept the data frames in a list in the first place, you could dispense with the ls()/get() step, which is typically clunky/fragile (what if you want to pass the objects back from a function? what if you only want to merge some of the objects in the workspace? ...)
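
To illustrate the point about keeping the data frames in a list from the start, here is a small sketch. The list `df_list`, its element names, the key column `id`, and the helper `merge_by_id` are all hypothetical, chosen only for the example.

```r
# Hypothetical data frames sharing the key column "id".
df_list <- list(
  a = data.frame(id = 1:3, height = c(150, 160, 170)),
  b = data.frame(id = 1:3, weight = c(50, 60, 70)),
  c = data.frame(id = 1:3, age = c(30, 40, 50))
)

# Wrapper that fixes the `by` argument, since Reduce() calls f(x, y) only.
merge_by_id <- function(x, y) merge(x, y, by = "id")

# Merge every data frame in the list.
merged <- Reduce(merge_by_id, df_list)

# Merging only some of the data frames is plain list subsetting.
merged_ab <- Reduce(merge_by_id, df_list[c("a", "b")])
```

Because the list is an ordinary value, it can be returned from a function or subset freely, with no ls()/get() lookup involved.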

How to bind multiple dataframes in R

If names stores the names of objects created in the global environment, then we need the values of those objects. mget retrieves them as a list:

library(dplyr)
bind_rows(mget(names), .id = 'id')
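
For comparison, here is a base R sketch of roughly the same idea, for cases where dplyr is not available. It is not what bind_rows does internally, only an approximation of the `.id` behaviour. The scratch environment `env` and the object names `sales_q1` and `sales_q2` are assumptions made for the example.

```r
# Scratch environment standing in for the global environment (assumption).
env <- new.env()
assign("sales_q1", data.frame(amount = c(1, 2)), envir = env)
assign("sales_q2", data.frame(amount = c(3, 4, 5)), envir = env)
obj_names <- c("sales_q1", "sales_q2")

# Retrieve the objects as a named list.
df_list <- mget(obj_names, envir = env)

# Tag each data frame with its own name, similar to what `.id = "id"` records.
tagged <- Map(
  function(df, nm) data.frame(id = nm, df, stringsAsFactors = FALSE),
  df_list, names(df_list)
)

# Stack everything with a single rbind() call and drop the generated row names.
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL
```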

How to rbind all the data.frames in your working environment?

You can search for objects of data.frame class, and use function mget to retrieve them.

a = b = c = data.frame(x=1:2, y=3, z=1:4)
d = "junk"
e = list(poo="pah")
ls()
# [1] "a" "b" "c" "d" "e"
dfs = sapply(.GlobalEnv, is.data.frame)
dfs
#     a     b     c     d     e
#  TRUE  TRUE  TRUE FALSE FALSE
do.call(rbind, mget(names(dfs)[dfs]))
#     x y z
# a.1 1 3 1
# a.2 2 3 2
# a.3 1 3 3
# a.4 2 3 4
# b.1 1 3 1
# b.2 2 3 2
# b.3 1 3 3
# b.4 2 3 4
# c.1 1 3 1
# c.2 2 3 2
# c.3 1 3 3
# c.4 2 3 4

Rbind multiple Data Frames in a loop

If file_list is a character vector of filenames that have since been loaded into variables in the local environment, then perhaps one of

do.call(rbind.data.frame, mget(ls(pattern = "^df\\d+\\.csv$")))
do.call(rbind.data.frame, mget(paste0("df", seq_along(file_list), ".csv")))

The first assumes anything found (as df*.csv) in R's environment is appropriate to grab. It might not grab them in the correct order, because ls() sorts names as text (so "df10.csv" comes before "df2.csv"); consider using sort or otherwise ordering them yourself.
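
One possible way to handle the ordering concern is sketched below. It assumes the objects are named `df1.csv`, `df2.csv`, and so on, and it uses a scratch environment (`env`, a hypothetical name) so the example is self-contained.

```r
# Scratch environment with twelve single-row data frames (assumption).
env <- new.env()
for (i in 1:12) {
  assign(paste0("df", i, ".csv"), data.frame(file = i), envir = env)
}

# ls() returns names sorted as text, so "df10.csv" comes before "df2.csv".
found <- ls(env, pattern = "^df\\d+\\.csv$")

# Extract the numeric part of each name and order by it instead.
idx <- as.integer(sub("^df(\\d+)\\.csv$", "\\1", found))
ordered_names <- found[order(idx)]

# Bind in numeric order with a single rbind call.
combined <- do.call(rbind.data.frame, mget(ordered_names, envir = env))
```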

mget takes a string vector and retrieves the value of the object with each name from the given environment (current, by default), returning a list of values.

do.call(rbind.data.frame, ...) does one call to rbind, which is much much faster than iteratively rbinding.


