Create a variable that identifies the original data.frame after rbind command in R
There's a function in the gdata package called combine that does just that.
df1 <- data.frame(a = seq(1, 5, by = 1),
b = seq(21, 25, by = 1))
df2 <- data.frame(a = seq(6, 10, by = 1),
b = seq(26, 30, by = 1))
library(gdata)
combine(df1, df2)
a b source
1 1 21 df1
2 2 22 df1
3 3 23 df1
4 4 24 df1
5 5 25 df1
6 6 26 df2
7 7 27 df2
8 8 28 df2
9 9 29 df2
10 10 30 df2
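If gdata is not installed, the same result can be approximated in base R by tagging each data frame before stacking. This is only a minimal sketch, not how combine is implemented; the column name source is chosen here to mirror the output above.

```r
# Same example data as above.
df1 <- data.frame(a = seq(1, 5, by = 1), b = seq(21, 25, by = 1))
df2 <- data.frame(a = seq(6, 10, by = 1), b = seq(26, 30, by = 1))

# Add a constant "source" column to each data frame, then stack them.
combined <- rbind(
  cbind(df1, source = "df1"),
  cbind(df2, source = "df2")
)
combined
```

The drawback is that each name has to be typed twice, which is what the helper-based approaches avoid.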
Combine (rbind) data frames and create column with name of original data frames
It's not exactly what you asked for, but it's pretty close. Put your objects in a named list and use do.call(rbind...)
> do.call(rbind, list(df1 = df1, df2 = df2))
x y
df1.1 1 2
df1.2 3 4
df2.1 5 6
df2.2 7 8
Notice that the row names now reflect the source data.frames.
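If a real column is needed rather than row names, the prefix can be pulled back out of the row names. The sketch below assumes the data frames have default (automatic) row names and that the list names do not themselves end in a dot followed by digits; the example values are reconstructed from the output shown above.

```r
# Example data reconstructed from the printed output above (an assumption).
df1 <- data.frame(x = c(1, 3), y = c(2, 4))
df2 <- data.frame(x = c(5, 7), y = c(6, 8))

res <- do.call(rbind, list(df1 = df1, df2 = df2))

# Row names look like "df1.1", "df1.2", ...; drop the trailing ".<number>".
res$source <- sub("\\.[0-9]+$", "", rownames(res))
rownames(res) <- NULL
res
```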
Update: Use cbind and rbind
Another option is to make a basic function like the following:
AppendMe <- function(dfNames) {
do.call(rbind, lapply(dfNames, function(x) {
cbind(get(x), source = x)
}))
}
This function then takes a character vector of the data.frame names that you want to "stack", as follows:
> AppendMe(c("df1", "df2"))
x y source
1 1 2 df1
2 3 4 df1
3 5 6 df2
4 7 8 df2
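A variant of the same idea takes a named list instead of a character vector, which avoids looking objects up with get. The function name append_named is hypothetical and the example data is reconstructed from the output above; treat this as a sketch rather than a drop-in replacement.

```r
# Hypothetical helper: stack a named list of data frames and record each name.
append_named <- function(dfs) {
  stopifnot(!is.null(names(dfs)))
  tagged <- Map(function(d, nm) cbind(d, source = nm), dfs, names(dfs))
  out <- do.call(rbind, tagged)
  rownames(out) <- NULL  # drop the "df1.1"-style row names
  out
}

# Example data reconstructed from the printed output above (an assumption).
df1 <- data.frame(x = c(1, 3), y = c(2, 4))
df2 <- data.frame(x = c(5, 7), y = c(6, 8))

stacked <- append_named(list(df1 = df1, df2 = df2))
stacked
```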
Update 2: Use combine from the "gdata" package
> library(gdata)
> combine(df1, df2)
x y source
1 1 2 df1
2 3 4 df1
3 5 6 df2
4 7 8 df2
Update 3: Use rbindlist from "data.table"
Another approach that can be used now is rbindlist from "data.table" with its idcol argument. With that, the approach could be:
> rbindlist(mget(ls(pattern = "df\\d+")), idcol = TRUE)
.id x y
1: df1 1 2
2: df1 3 4
3: df2 5 6
4: df2 7 8
Update 4: Use map_df from "purrr"
Similar to rbindlist, you can also use map_df from "purrr" with I or c as the function to apply to each list element.
> mget(ls(pattern = "df\\d+")) %>% map_df(I, .id = "src")
Source: local data frame [4 x 3]
src x y
(chr) (int) (int)
1 df1 1 2
2 df1 3 4
3 df2 5 6
4 df2 7 8
do.call(rbind, list(data, frames)) but also index each row by its original data frame
Here is one way.
library(dplyr)
library(tidyr)
foo <- list(df1, df2)
unnest(foo, names) %>%
mutate(names = gsub("^X", "", names))
# names a b
#1 1 1 3
#2 1 2 4
#3 2 5 7
#4 2 6 8
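If calling unnest on a bare list is not supported by your installed tidyr version, a base R sketch can produce the same kind of positional index by repeating each list position once per row. The data values below are reconstructed from the output shown above and are an assumption.

```r
# Example data reconstructed from the printed output above (an assumption).
df1 <- data.frame(a = 1:2, b = 3:4)
df2 <- data.frame(a = 5:6, b = 7:8)
foo <- list(df1, df2)

# Repeat each list position as many times as that data frame has rows.
idx <- rep(seq_along(foo), times = vapply(foo, nrow, integer(1)))

res <- cbind(names = idx, do.call(rbind, foo))
res
```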
Combine two data frames by rows (rbind) when they have different sets of columns
rbind.fill from the package plyr might be what you are looking for.
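To illustrate the idea without plyr, here is a rough base R sketch that pads each data frame with NA for the columns it lacks before stacking. The helper name rbind_fill_base and the example data are hypothetical, and this is not how plyr implements rbind.fill.

```r
# Hypothetical helper: stack data frames whose column sets differ.
rbind_fill_base <- function(...) {
  dfs <- list(...)
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    # Add every missing column, filled with NA.
    for (col in setdiff(all_cols, names(d))) {
      d[[col]] <- NA
    }
    d[all_cols]  # put the columns in a common order
  })
  do.call(rbind, padded)
}

# Made-up example data: x lacks column c, y lacks column b.
x <- data.frame(a = 1:2, b = 3:4)
y <- data.frame(a = 5L, c = "z", stringsAsFactors = FALSE)

res <- rbind_fill_base(x, y)
res
```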
R rbind while preserving order of rows in each data frame
Try this one-liner
do.call("rbind", Map("rbind", split(x, 1:nrow(x)), split(y, 1:nrow(y))))
which gives this data.frame if x and y are as in the question:
a b c
1.1 1 2 3
1.2 10 20 30
2.2 2 3 4
2.21 20 30 40
3.3 3 4 5
3.31 30 40 50
It splits each data frame by row, rbinds the corresponding components of the two splits, and then rbinds all of the results. Note that this one-liner works even if the columns have different types. For example, it will work even if:
x <- data.frame(a = letters[1:3], b = 1:3, c = c(TRUE, FALSE, TRUE))
y <- data.frame(a = LETTERS[1:3], b = 11:13, c = c(FALSE, TRUE, FALSE))
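Another way to get the same interleaving, sketched here as an alternative rather than as the answer's method, is to stack the two data frames first and then reorder the rows by their original position. It assumes x and y have the same number of rows; the values below are reconstructed from the output shown earlier.

```r
# Example data reconstructed from the printed output above (an assumption).
x <- data.frame(a = 1:3, b = 2:4, c = 3:5)
y <- data.frame(a = c(10, 20, 30), b = c(20, 30, 40), c = c(30, 40, 50))

stacked <- rbind(x, y)

# Position of each row within its own data frame; ties keep their
# original order, so row i of x comes before row i of y.
idx <- order(c(seq_len(nrow(x)), seq_len(nrow(y))))

interleaved <- stacked[idx, ]
rownames(interleaved) <- NULL
interleaved
```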
In R, reorganize list based on element names (rbind and indicator variable)
It sounds like you're doing a lot of gymnastics because you have a specific form in mind. What I would suggest is first trying to make the data tidy. Without reading the link, the quick summary is to put your data into a single data frame, where it can be easily processed.
The quick version of the answer (here I've used lst instead of list for the name to avoid confusion with the built-in list) is to do this:
do.call(rbind,
lapply(seq(lst), function(i) {
lst[[i]]$type <- names(lst)[i]; lst[[i]]
})
)
What this will do is create a single data frame, with a column, "type", that contains the name of the list item in which that row appeared.
Using a slightly simplified version of your initial data:
lst <- list(A1=data.frame(x=rnorm(5)), A2=data.frame(x=rnorm(3)), B=data.frame(x=rnorm(5)))
lst
$A1
x
1 1.3386071
2 1.9875317
3 0.4942179
4 -0.1803087
5 0.3094100
$A2
x
1 -0.3388195
2 1.1993115
3 1.9524970
$B
x
1 -0.1317882
2 -0.3383545
3 0.8864144
4 0.9241305
5 -0.8481927
And then applying the magic function:
df <- do.call(rbind,
lapply(seq(lst), function(i) {
lst[[i]]$type <- names(lst)[i]; lst[[i]]
})
)
df
x type
1 1.3386071 A1
2 1.9875317 A1
3 0.4942179 A1
4 -0.1803087 A1
5 0.3094100 A1
6 -0.3388195 A2
7 1.1993115 A2
8 1.9524970 A2
9 -0.1317882 B
10 -0.3383545 B
11 0.8864144 B
12 0.9241305 B
13 -0.8481927 B
From here we can process to our heart's content, with operations like df$subject <- gsub("[0-9]*", "", df$type) to extract the non-numeric portion of type, and tools like split can be used to generate the sub-lists that you mention in your question.
In addition, once it is in this form, you can use functions like by and aggregate or libraries like dplyr or data.table to do more advanced split-apply-combine operations for data analysis.
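As a small illustration of those follow-up steps, the sketch below rebuilds a data frame of the same shape, derives subject from type, splits by it, and computes a per-subject mean with aggregate. The seed is arbitrary and the random values will not match the printed output above.

```r
set.seed(1)  # arbitrary seed so the sketch is reproducible
lst <- list(A1 = data.frame(x = rnorm(5)),
            A2 = data.frame(x = rnorm(3)),
            B  = data.frame(x = rnorm(5)))

# Stack the list, recording the list name of each row in "type".
df <- do.call(rbind, lapply(seq_along(lst), function(i) {
  d <- lst[[i]]
  d$type <- names(lst)[i]
  d
}))

# Strip digits from type: "A1" and "A2" both become "A".
df$subject <- gsub("[0-9]*", "", df$type)

# One data frame per subject, and the mean of x per subject.
by_subject <- split(df, df$subject)
means <- aggregate(x ~ subject, data = df, FUN = mean)
means
```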
Combine loop with Rbind
You can get and assign variables by their names assuming your data frames are stored in the R global environment:
library(tidyverse)
x <- c(1,2,3)
y <- c(1,2,3)
df_a_1 <- data.frame(x,y)
df_a_2 <- data.frame(x,y)
df_b_1 <- data.frame(x,y)
df_b_2 <- data.frame(x,y)
df_c_1 <- data.frame(x,y)
df_c_2 <- data.frame(x,y)
letters <- c("a", "b", "c")
for(l in letters) {
prefix <- str_glue("df_{l}")
res <- names(globalenv()) %>%
keep(~ .x %>% str_detect(prefix)) %>%
map(get) %>%
reduce(rbind)
assign(prefix, res)
}
df_a
#> x y
#> 1 1 1
#> 2 2 2
#> 3 3 3
#> 4 1 1
#> 5 2 2
#> 6 3 3
df_b
#> x y
#> 1 1 1
#> 2 2 2
#> 3 3 3
#> 4 1 1
#> 5 2 2
#> 6 3 3
Created on 2021-11-10 by the reprex package (v2.0.1)
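If the data frames can be kept in a named list rather than as separate global variables, the grouping can be sketched in base R without get or assign. This is an alternative under that assumption, not the approach shown above; only two prefixes are used to keep the example short.

```r
x <- c(1, 2, 3)
y <- c(1, 2, 3)

# Keep the data frames in a named list instead of the global environment.
dfs <- list(
  df_a_1 = data.frame(x, y),
  df_a_2 = data.frame(x, y),
  df_b_1 = data.frame(x, y),
  df_b_2 = data.frame(x, y)
)

# Drop the trailing "_<number>" to get the group each data frame belongs to.
prefix <- sub("_[0-9]+$", "", names(dfs))

# Split the list by group and rbind within each group.
combined <- lapply(split(dfs, prefix), function(group) {
  out <- do.call(rbind, group)
  rownames(out) <- NULL
  out
})

combined$df_a
```

The result is a list with one stacked data frame per prefix, so nothing in the global environment is overwritten.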