Consolidating Data Frames in R

Combining a list of data frames into a new data frame in R

Note that in your list of data frames (df_list) each column has a different name (Area1, Area2, Area3), whereas in your expected output they are combined into a single column. You therefore need to rename those columns to a common name before binding the data frames together.

library(dplyr)
library(purrr)

result <- map_df(df_list, ~.x %>%
    rename_with(~"Area", contains('Area')), .id = 'FileName')
result

#   FileName Area
# 1 a1_areaX  100
# 2 a2_areaX  200
# 3 a3_areaX  300
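
The question's df_list is not shown, so here is a minimal self-contained sketch with a hypothetical df_list whose names and values are chosen to match the output above. It does the renaming with map() and then binds with bind_rows(.id = ...), which is one alternative to map_df (superseded in recent purrr releases); treat it as an illustration rather than the only way to do this.

```r
library(dplyr)
library(purrr)

# Hypothetical input: a named list of one-column data frames
df_list <- list(
  a1_areaX = data.frame(Area1 = 100),
  a2_areaX = data.frame(Area2 = 200),
  a3_areaX = data.frame(Area3 = 300)
)

result <- df_list %>%
  # give every "Area*" column the same name
  map(~ rename_with(.x, ~ "Area", contains("Area"))) %>%
  # stack the data frames; the list names become the FileName column
  bind_rows(.id = "FileName")

result
```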

Combine two data frames based on daily dates in one data frame for panel data

You can join the two data frames on Date with left_join, which keeps every row of returns and attaches the matching RF value:

library(dplyr)
df <- left_join(returns, rf, by = "Date")

The output of head(df) looks like this:

  Product_Name       Date       Return   RF
1            A 2018-08-01           NA 0.01
2            A 2018-08-02 -0.021053409 0.01
3            A 2018-08-03  0.005850216 0.01
4            A 2018-08-06 -0.005968756 0.01
5            A 2018-08-07  0.012370563 0.01
6            A 2018-08-08  0.000760790 0.01
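
To see why this suits panel data, here is a small sketch with made-up returns and rf data frames (the values and the second product "B" are assumptions, not the question's data). Each product-date row picks up the risk-free rate for its date, and the row count of returns is unchanged. Both Date columns should have the same type for the join to work.

```r
library(dplyr)

# Hypothetical panel: two products observed on the same two days
returns <- data.frame(
  Product_Name = c("A", "A", "B", "B"),
  Date = as.Date(c("2018-08-01", "2018-08-02", "2018-08-01", "2018-08-02")),
  Return = c(NA, -0.02, NA, 0.01)
)

# Hypothetical daily risk-free rate, one row per date
rf <- data.frame(
  Date = as.Date(c("2018-08-01", "2018-08-02")),
  RF = c(0.01, 0.01)
)

# Every row of returns is kept; RF is looked up by Date
df <- left_join(returns, rf, by = "Date")
df
```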

Merge data.frame in R

You could use the following. It row-binds the data frames and, where X1 is duplicated, keeps the row from df2 and drops the row from df1 (distinct keeps the first occurrence, and df2 is bound first).

library(dplyr)
df1 <- data.frame(X1 = c("01.01.2000", "01.01.2001", "01.01.2002"),
                  X2 = c(4, 5, 6), stringsAsFactors = FALSE)
df2 <- data.frame(X1 = c("01.01.2002", "01.01.2003", "01.01.2004"),
                  X2 = c(8, 9, 10), stringsAsFactors = FALSE)

# df2 comes first, so its rows win when X1 is duplicated
dfMerged <- bind_rows(df2, df1) %>%
  distinct(X1, .keep_all = TRUE) %>%
  arrange(X1, X2)  # X1 is character, so this sort is alphabetical
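
For comparison, the same idea can be sketched in base R with rbind and duplicated. This is an equivalent offered for illustration, not part of the original answer; it reuses the answer's df1 and df2 so the result can be checked by eye.

```r
df1 <- data.frame(X1 = c("01.01.2000", "01.01.2001", "01.01.2002"),
                  X2 = c(4, 5, 6), stringsAsFactors = FALSE)
df2 <- data.frame(X1 = c("01.01.2002", "01.01.2003", "01.01.2004"),
                  X2 = c(8, 9, 10), stringsAsFactors = FALSE)

# Stack df2 on top of df1 so that df2's rows come first
combined <- rbind(df2, df1)

# duplicated() flags the second and later occurrences of each X1,
# so negating it keeps the first one (the df2 row)
dfMerged <- combined[!duplicated(combined$X1), ]

# Sort by X1, then X2, and reset the row names
dfMerged <- dfMerged[order(dfMerged$X1, dfMerged$X2), ]
rownames(dfMerged) <- NULL
dfMerged
```

The row for "01.01.2002" carries X2 = 8 (from df2) rather than 6 (from df1).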

How to join (merge) data frames (inner, outer, left, right)

By using the merge function and its optional parameters:

Inner join: merge(df1, df2) will work for these examples because R automatically joins the frames by common variable names, but you would most likely want to specify merge(df1, df2, by = "CustomerId") to make sure that you were matching on only the fields you desired. You can also use the by.x and by.y parameters if the matching variables have different names in the different data frames.

Outer join: merge(x = df1, y = df2, by = "CustomerId", all = TRUE)

Left outer: merge(x = df1, y = df2, by = "CustomerId", all.x = TRUE)

Right outer: merge(x = df1, y = df2, by = "CustomerId", all.y = TRUE)

Cross join: merge(x = df1, y = df2, by = NULL)

Just as with the inner join, you would probably want to explicitly pass "CustomerId" to R as the matching variable. I think it's almost always best to explicitly state the identifiers on which you want to merge; it's safer if the input data.frames change unexpectedly and easier to read later on.

You can merge on multiple columns by passing a vector to by, e.g., by = c("CustomerId", "OrderId").

If the column names to merge on are not the same, you can specify, e.g., by.x = "CustomerId_in_df1", by.y = "CustomerId_in_df2" where CustomerId_in_df1 is the name of the column in the first data frame and CustomerId_in_df2 is the name of the column in the second data frame. (These can also be vectors if you need to merge on multiple columns.)
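
The join types above can be compared side by side on a small example. The df1 and df2 below are hypothetical data invented for this sketch (customer 7 appears only in df2, customers 1, 3, 5 and 6 only in df1), and df2_renamed and CustId are illustrative names.

```r
# Hypothetical example data
df1 <- data.frame(CustomerId = 1:6,
                  Product = c("Toaster", "Toaster", "Toaster",
                              "Radio", "Radio", "Radio"))
df2 <- data.frame(CustomerId = c(2L, 4L, 7L),
                  State = c("Alabama", "Alabama", "Ohio"))

inner <- merge(df1, df2, by = "CustomerId")                # ids 2 and 4 only
outer <- merge(df1, df2, by = "CustomerId", all = TRUE)    # ids 1 to 7
left  <- merge(df1, df2, by = "CustomerId", all.x = TRUE)  # all 6 rows of df1
right <- merge(df1, df2, by = "CustomerId", all.y = TRUE)  # all 3 rows of df2
cross <- merge(df1, df2, by = NULL)                        # every pairing

# Row counts per join type
sapply(list(inner = inner, outer = outer, left = left,
            right = right, cross = cross), nrow)

# Key columns with different names: use by.x and by.y
df2_renamed <- setNames(df2, c("CustId", "State"))
inner2 <- merge(df1, df2_renamed, by.x = "CustomerId", by.y = "CustId")
```

On this data the row counts are 2 (inner), 7 (outer), 6 (left), 3 (right) and 18 (cross). Unmatched rows in the outer, left and right joins get NA in the columns that come from the other data frame.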

Merge two data frames by one column with unique values

One way to get this would be (using the same df1 and df2 as provided):

library(tidyverse)
df3 <- unique(inner_join(df1, select(df2, c("a", "c")), by = "a"))

I used inner_join originally, but left_join would work as well.

Another way of doing this would be to create a deduplicated subset of df2 first:

df2b <- df2 %>%
  select(a, c) %>%
  unique()
df3b <- left_join(df1, df2b, by = "a")
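
Since the question's df1 and df2 are not reproduced here, the following sketch uses made-up data frames with the same column names (a, c, plus assumed extra columns b and d) to show why deduplicating before the join matters: without it, repeated keys in df2 multiply the rows of df1.

```r
library(dplyr)

# Hypothetical data: df2 repeats each (a, c) pair in some rows
df1 <- data.frame(a = c(1, 2, 3), b = c("x", "y", "z"))
df2 <- data.frame(a = c(1, 1, 2, 3, 3),
                  c = c("p", "p", "q", "r", "r"),
                  d = 1:5)

# Joining directly duplicates rows of df1 for a = 1 and a = 3
naive <- left_join(df1, select(df2, all_of(c("a", "c"))), by = "a")

# Reduce df2 to one row per (a, c) pair first, then join
df2b <- df2 %>%
  select(all_of(c("a", "c"))) %>%
  distinct()
df3b <- left_join(df1, df2b, by = "a")

nrow(naive)  # more rows than df1
nrow(df3b)   # same number of rows as df1
```

Recent dplyr versions may warn about a many-to-many relationship on the direct join, which is another hint that the key is not unique in df2.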

Combining two dataframes with alternating column position

We can take the matrix route: fill a matrix with the column names row by row (one row per data frame), then flatten it with c(), which reads column by column and therefore interleaves the names.

library(dplyr)
bind_cols(df1, df2) %>%
  dplyr::select(all_of(c(matrix(names(.), ncol = 3, byrow = TRUE))))

Output:

# A tibble: 4 × 6
      b   b_B     a   a_A     c   c_C
  <int> <int> <int> <int> <int> <int>
1     1     1     1     1     1     1
2     2     2     2     2     2     2
3     3     3     3     3     3     3
4     4     4     4     4     4     4
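
The same interleaving can be sketched in base R without hardcoding the number of columns. The df1 and df2 below are hypothetical inputs shaped to match the output above; the only assumption is that both data frames have the same number of columns in corresponding order.

```r
# Hypothetical inputs with three columns each
df1 <- data.frame(b = 1:4, a = 1:4, c = 1:4)
df2 <- data.frame(b_B = 1:4, a_A = 1:4, c_C = 1:4)

combined <- cbind(df1, df2)

# rbind() stacks the two index vectors as matrix rows:
#   1 2 3   (columns of df1)
#   4 5 6   (columns of df2)
# c() then reads the matrix column by column: 1, 4, 2, 5, 3, 6
idx <- c(rbind(seq_along(df1), ncol(df1) + seq_along(df2)))

result <- combined[, idx]
names(result)
```

names(result) gives "b" "b_B" "a" "a_A" "c" "c_C", the same alternating order as the dplyr version.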

R merge two dataframes with same columns without replacing values

Solution:

Thanks to @GregorThomas for providing the answer.

This problem was solved with the following command:

merge(data1, data2, all = TRUE)
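
The reason this keeps values from both sides: when by is not given, merge uses all columns the two data frames share as the key, and if every column is shared, a row only matches an identical row. A minimal sketch with hypothetical data1 and data2 (the question's data is not shown here):

```r
# Hypothetical data frames with identical column names
data1 <- data.frame(id = c(1, 2), value = c("a", "b"))
data2 <- data.frame(id = c(2, 2, 3), value = c("b", "B", "c"))

# Key is (id, value): the identical row (2, "b") collapses to one row,
# while (2, "B") is kept as a separate row rather than overwriting it
merged <- merge(data1, data2, all = TRUE)
merged
```

The result has four rows: one for id 1, two for id 2 (values "b" and "B"), and one for id 3.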

