Combine several data frames in the global environment by row (rbind)
Since you have already read the files in, you can try the following:
do.call(rbind, mget(ls(pattern = "df")))
The ls(pattern = "df") call should capture all of your "df.1", "df.2", and so on. Hopefully you don't have other objects whose names match the same pattern; if you do, experiment with a stricter pattern until the command lists only your data.frames. mget() then collects these objects into a list, on which you can call do.call(rbind, ...).
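To illustrate the point about stricter patterns, here is a minimal sketch. It uses a scratch environment named env (an assumption made so the example does not touch your workspace; drop the envir arguments to work on the global environment) and a made-up decoy object called pdf_notes.

```r
# Scratch environment standing in for the global environment (assumption).
env <- new.env()
env$df.1 <- data.frame(id = 1:2, value = c(10, 20))
env$df.2 <- data.frame(id = 3:4, value = c(30, 40))
env$pdf_notes <- "not a data frame"  # hypothetical decoy: its name also contains "df"

# The loose pattern matches the decoy as well as the data frames.
loose <- ls(env, pattern = "df")

# An anchored pattern matches only names of the form df.<digits>.
strict <- ls(env, pattern = "^df\\.[0-9]+$")

# Collect the matching objects into a list and bind them in one call.
combined <- do.call(rbind, mget(strict, envir = env))
print(loose)
print(strict)
print(combined)
```

Printing the result of ls() before calling mget() is a cheap way to confirm the pattern picks up only what you intend.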
Combining multiple data frames from the global environment
Here is one non-loop approach.
# Build the names of the data frames
datanames <- paste0("durData_IBM_AskSide", 1:20)
# Combine them into one data frame
combine_data <- do.call(rbind, mget(datanames))
# Remove the originals from the global environment
rm(list = datanames)
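A slightly more defensive sketch of the same idea follows. It assumes a scratch environment env and only three small made-up data frames instead of twenty. It checks that every name exists before calling mget(), which otherwise stops with an error, and passes make.row.names = FALSE through to rbind.data.frame so the result gets plain row numbers.

```r
# Scratch environment with three illustrative data frames (assumption).
env <- new.env()
for (i in 1:3) {
  assign(paste0("durData_IBM_AskSide", i),
         data.frame(part = i, duration = c(1.5, 2.5) * i),
         envir = env)
}

datanames <- paste0("durData_IBM_AskSide", 1:3)

# Check that every expected object is present before retrieving it.
found <- vapply(datanames, exists, logical(1), envir = env, inherits = FALSE)
stopifnot(all(found))

# One rbind call over the whole list; the extra list element is passed
# to rbind.data.frame as the make.row.names argument.
combine_data <- do.call(rbind, c(mget(datanames, envir = env),
                                 list(make.row.names = FALSE)))

# Remove the originals from the scratch environment.
rm(list = datanames, envir = env)
print(combine_data)
```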
How to merge all data frames in the global environment respectively?
Replace x with the actual id column in your data to avoid warning or error messages.
You can use either of two approaches.
base R
``` r
master1 <- data.frame(x = LETTERS[1:10], y1 = sample(1:100, 10))
master2 <- data.frame(x = LETTERS[1:10], y2 = sample(1:100, 10))
master10 <- data.frame(x = LETTERS[1:10], y3 = sample(1:100, 10))
# Retrieve every object whose name starts with "master" into a list
DF_obj <- lapply(ls(pattern = "^master"), get)
# Merge pairwise on the id column, starting from the first data frame
gendf <- Reduce(function(.x, .y) merge(.x, .y, by = 'x'), x = DF_obj[-1], init = DF_obj[[1]])
gendf[, order(names(gendf))]
#>    x y1 y2  y3
#> 1  A 37 86  61
#> 2  B  3 23  89
#> 3  C 69 46  95
#> 4  D 16  9  54
#> 5  E 62 85  52
#> 6  F 19  5  35
#> 7  G 55 28  90
#> 8  H 40 52   5
#> 9  I  7 48 100
#> 10 J 48 16   9
```
tidyverse
``` r
master1 <- data.frame(x = LETTERS[1:10], y1 = sample(1:100, 10))
master2 <- data.frame(x = LETTERS[1:10], y2 = sample(1:100, 10))
master10 <- data.frame(x = LETTERS[1:10], y3 = sample(1:100, 10))
DF_obj <- lapply(ls(pattern = "^master"), get)
library(tidyverse)
# Join pairwise on the id column, starting from the first data frame
purrr::reduce(DF_obj[-1], .init = DF_obj[[1]], ~ left_join(.x, .y, by = 'x'))
#>    x y1  y3 y2
#> 1  A 77  87 93
#> 2  B 10  18 74
#> 3  C  2  89 64
#> 4  D 89  98  5
#> 5  E 13  99 21
#> 6  F 74  25  4
#> 7  G 87   4 22
#> 8  H 62  27 17
#> 9  I 14  10 99
#> 10 J 21 100 78
```
Created on 2021-05-21 by the reprex package (v2.0.0)
Because the random seed was not fixed, the two reprexes produce different values.
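If you want output you can reproduce, one option is to fix the seed first. The sketch below is a base R variant under a few assumptions: the seed value 42 is arbitrary, and a scratch environment env stands in for the global environment so the example leaves your workspace alone.

```r
set.seed(42)  # arbitrary seed, chosen only so the sample() calls are repeatable

# Scratch environment standing in for the global environment (assumption).
env <- new.env()
env$master1  <- data.frame(x = LETTERS[1:10], y1 = sample(1:100, 10))
env$master2  <- data.frame(x = LETTERS[1:10], y2 = sample(1:100, 10))
env$master10 <- data.frame(x = LETTERS[1:10], y3 = sample(1:100, 10))

# Retrieve only objects named master<digits> as a named list.
DF_obj <- mget(ls(env, pattern = "^master[0-9]+$"), envir = env)

# Merge pairwise on the shared id column, then order the columns by name.
gendf <- Reduce(function(.x, .y) merge(.x, .y, by = "x"), DF_obj)
gendf <- gendf[, order(names(gendf))]
print(gendf)
```

Running the script twice from a fresh session should then print the same table both times.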
How do I merge all data frames in the global environment?
Following @Osssan's comments, and assuming that you want to merge everything in your global workspace, get the names of the objects and then retrieve the objects themselves into a list:
DF_obj <- lapply(ls(), get)
If you want to merge on all common variables (e.g. if all variable names are unique except the one(s) you want to merge on), then just
Reduce(merge, DF_obj)
should work.
Unfortunately, unlike lapply() etc., Reduce() doesn't have a ... argument for passing additional named arguments to a function, so Reduce(merge, DF_obj, by = common_variable) doesn't work; as @Osssan points out, you need something like
mergefun <- function(x, y) merge(x, y, by= "common_variable")
merged_DF <- Reduce(mergefun, DF_obj )
As other commenters point out, if you just kept the data frames in a list in the first place, you could dispense with the ls()/get() step, which is typically clunky and fragile (what if you want to pass the objects back from a function? what if you only want to merge some of the objects in the workspace? ...)
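The list-first approach can be sketched as follows. The helper name merge_by, the column name id, and the sample data are all made up for illustration; the point is only that a list can be passed to a function, subset, and merged without touching the workspace.

```r
# merge_by() is a hypothetical helper: it merges a list of data frames
# pairwise on the column(s) named in `by`.
merge_by <- function(dfs, by) {
  Reduce(function(x, y) merge(x, y, by = by), dfs)
}

# Keep the data frames in a named list from the start.
dfs <- list(
  heights = data.frame(id = 1:3, height = c(170, 165, 180)),
  weights = data.frame(id = 1:3, weight = c(65, 58, 82)),
  ages    = data.frame(id = 1:3, age = c(31, 27, 45))
)

# Merge everything in the list ...
merged_DF <- merge_by(dfs, by = "id")

# ... or only some of the elements, selected by name.
merged_subset <- merge_by(dfs[c("heights", "ages")], by = "id")

print(merged_DF)
print(merged_subset)
```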
How to bind multiple dataframes in R
If names stores the names of objects created in the global environment, then we need the values of those objects. mget() retrieves them as a list:
library(dplyr)
bind_rows(mget(names), .id = 'id')
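For readers without dplyr, roughly the same result can be sketched in base R. This is an approximation, not what bind_rows() does internally: it adds an id column holding each list element's name and then binds the rows. The environment env, the vector obj_names (used instead of names to avoid masking the base function), and the sample data are assumptions.

```r
# Scratch environment standing in for the global environment (assumption).
env <- new.env()
env$sales_a <- data.frame(x = 1:2, y = c("a", "b"))
env$sales_b <- data.frame(x = 3:4, y = c("c", "d"))

obj_names <- c("sales_a", "sales_b")  # hypothetical object names

# Retrieve the objects as a named list.
dfs <- mget(obj_names, envir = env)

# Prepend an id column carrying the name of each list element.
tagged <- Map(function(df, id) data.frame(id = id, df, stringsAsFactors = FALSE),
              dfs, names(dfs))

# Bind all rows in one call, with plain row numbers.
combined <- do.call(rbind, c(tagged, list(make.row.names = FALSE)))
print(combined)
```

Unlike bind_rows(), rbind() requires every data frame to have the same columns, so this variant only fits inputs with identical structure.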
How to rbind all the data.frames in your working environment?
You can search for objects of class data.frame and use mget() to retrieve them.
a = b = c = data.frame(x=1:2, y=3, z=1:4)
d = "junk"
e = list(poo="pah")
ls()
# [1] "a" "b" "c" "d" "e"
dfs = sapply(.GlobalEnv, is.data.frame)
dfs
# a b c d e
# TRUE TRUE TRUE FALSE FALSE
do.call(rbind, mget(names(dfs)[dfs]))
# x y z
# a.1 1 3 1
# a.2 2 3 2
# a.3 1 3 3
# a.4 2 3 4
# b.1 1 3 1
# b.2 2 3 2
# b.3 1 3 3
# b.4 2 3 4
# c.1 1 3 1
# c.2 2 3 2
# c.3 1 3 3
# c.4 2 3 4
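A variant of the same idea, sketched with Filter() on a scratch environment env (an assumption, as are the sample objects). Going through ls() gives the names in sorted order, which keeps the row order of the result predictable, and the column check guards against rbind failing on mismatched data frames.

```r
# Scratch environment standing in for the global environment (assumption).
env <- new.env()
env$a <- data.frame(x = 1:2, y = 3)
env$b <- data.frame(x = 3:4, y = 4)
env$d <- "junk"
env$e <- list(poo = "pah")

# Retrieve every object, then keep only the data frames.
objs <- mget(ls(env), envir = env)
dfs <- Filter(is.data.frame, objs)

# rbind needs matching column names, so verify that before binding.
same_cols <- vapply(dfs, function(df) identical(names(df), names(dfs[[1]])),
                    logical(1))
stopifnot(all(same_cols))

combined <- do.call(rbind, dfs)
print(names(dfs))
print(combined)
```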
Rbind multiple Data Frames in a loop
If file_list is a character vector of filenames that have since been loaded into variables in the local environment, then perhaps one of
do.call(rbind.data.frame, mget(ls(pattern = "^df\\d+\\.csv")))
do.call(rbind.data.frame, mget(paste0("df", seq_along(file_list), ".csv")))
The first assumes anything found (as df*.csv) in R's environment is appropriate to grab. It might not grab them in the correct order, so consider using sort or otherwise ordering them yourself.
mget takes a character vector and retrieves the value of the object with each name from the given environment (current, by default), returning a list of values.
do.call(rbind.data.frame, ...) makes a single call to rbind, which is much faster than rbind-ing iteratively.
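The ordering concern can be sketched like this. Because ls() returns names sorted as strings, a name such as df10.csv will typically come before df2.csv, so the example reorders the names by their numeric part before the single rbind call. The environment env, the contents of file_list, and the sample data are assumptions.

```r
# Scratch environment standing in for the local environment (assumption).
env <- new.env()
file_list <- paste0("file", 1:11)  # hypothetical filenames

# Simulate the already-loaded variables df1.csv, df2.csv, ..., df11.csv.
for (i in seq_along(file_list)) {
  assign(paste0("df", i, ".csv"),
         data.frame(source = i, value = i * 10),
         envir = env)
}

# Names come back sorted as strings, not by their numeric part.
found <- ls(env, pattern = "^df[0-9]+\\.csv$")

# Extract the number from each name and reorder numerically.
idx <- as.integer(sub("^df([0-9]+)\\.csv$", "\\1", found))
ordered <- found[order(idx)]

# A single rbind call over the whole list.
combined <- do.call(rbind.data.frame, mget(ordered, envir = env))
print(ordered)
print(combined$source)
```

The second form in the answer, built with paste0() and seq_along(file_list), avoids the issue because it constructs the names in numeric order to begin with.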