Merging Data Frames with Different Number of Rows and Different Columns

If A and B are the two input data frames, here are some solutions:

1) merge This solution works regardless of whether A or B has more rows.

merge(data.frame(A, row.names=NULL), data.frame(B, row.names=NULL), 
by = 0, all = TRUE)[-1]

The first two arguments can be replaced with just A and B if both have default row names (1, 2, ...) or otherwise consistent row names. That is, merge(A, B, by = 0, all = TRUE)[-1].

For example, if we have this input:

# test inputs
A <- data.frame(BOD, row.names = letters[1:6])
B <- setNames(2 * BOD[1:2, ], c("X", "Y"))

then:

merge(data.frame(A, row.names=NULL), data.frame(B, row.names=NULL), 
by = 0, all = TRUE)[-1]

gives:

  Time demand  X    Y
1    1    8.3  2 16.6
2    2   10.3  4 20.6
3    3   19.0 NA   NA
4    4   16.0 NA   NA
5    5   15.6 NA   NA
6    7   19.8 NA   NA
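
To see why the row names are reset first, here is a small base R sketch using the same test inputs. A has letters as row names while B keeps the defaults, so merging on row names without the reset matches nothing and the rows end up stacked rather than side by side. The names naive and aligned are only illustrative.

```r
# Same test inputs as above
A <- data.frame(BOD, row.names = letters[1:6])
B <- setNames(2 * BOD[1:2, ], c("X", "Y"))

# Without resetting: row names "a".."f" never match "1", "2",
# so every row of A and every row of B becomes its own output row
naive <- merge(A, B, by = 0, all = TRUE)[-1]

# With resetting: both frames get row names 1, 2, ... and line up by position
aligned <- merge(data.frame(A, row.names = NULL),
                 data.frame(B, row.names = NULL),
                 by = 0, all = TRUE)[-1]

nrow(naive)    # 8
nrow(aligned)  # 6
```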

1a) An equivalent variation is:

do.call("merge", c(lapply(list(A, B), data.frame, row.names=NULL), 
by = 0, all = TRUE))[-1]

2) cbind.zoo This solution assumes that A has more rows and that B's entries are all of the same type, e.g. all numeric. A is not restricted. These conditions hold in the data of the question.

library(zoo)
data.frame(A, cbind(zoo(, 1:nrow(A)), as.zoo(B)))

Merge Dataframes with different number of rows

Your dataset is,

dat1 = data.frame("Arable and Horticulture" = c(100, 90,23, 3, 56, 299), 
row.names = c("Acer", "Achillea", "Aesculus", "Alliaria", "Allium", "Anchusa"))

dat2 = data.frame("Improved Grassland" = c(12, 3, 50, 23, 299, 29),
row.names = c("Acer", "Achillea", "Allium", "Brassica", "Calystegia", "Campanula"))

As @Vinícius Félix suggested, first convert the row names to a column.

library(tibble)
dat1 = rownames_to_column(dat1, "Plants")
dat2 = rownames_to_column(dat2, "Plants")

Then let's join the two datasets on that column:

library(dplyr)
dat = full_join(dat1, dat2, by = "Plants")

And replace the NAs with 0:

dat = dat %>% replace(is.na(.), 0)

      Plants Arable.and.Horticulture Improved.Grassland
1       Acer                     100                 12
2   Achillea                      90                  3
3   Aesculus                      23                  0
4   Alliaria                       3                  0
5     Allium                      56                 50
6    Anchusa                     299                  0
7   Brassica                       0                 23
8 Calystegia                       0                299
9  Campanula                       0                 29
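
If you would rather avoid the extra packages, the same result can probably be reached in base R. The following is only a sketch of that alternative, not the method used above: it merges on the row names directly and then fills the numeric columns. The helper name num_cols is made up for the illustration.

```r
dat1 <- data.frame("Arable and Horticulture" = c(100, 90, 23, 3, 56, 299),
                   row.names = c("Acer", "Achillea", "Aesculus",
                                 "Alliaria", "Allium", "Anchusa"))
dat2 <- data.frame("Improved Grassland" = c(12, 3, 50, 23, 299, 29),
                   row.names = c("Acer", "Achillea", "Allium",
                                 "Brassica", "Calystegia", "Campanula"))

# Full (outer) join on the row names; merge() puts the key in the first column
dat <- merge(dat1, dat2, by = "row.names", all = TRUE)
names(dat)[1] <- "Plants"
dat$Plants <- as.character(dat$Plants)

# Replace NA with 0 in the numeric columns only
num_cols <- vapply(dat, is.numeric, logical(1))
dat[num_cols] <- lapply(dat[num_cols],
                        function(col) replace(col, is.na(col), 0))
```

Note that merge() sorts the result by the key, which here happens to give the same row order as the full_join output.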

merge two data frames with different number of rows and repeat same value for same column

We could use left_join this way:

library(dplyr)
left_join(df1, df2, by="id") %>%
select(-ends_with(".y"), num = num.x, age=age.x)
      id   num   age     c     a     b
   <dbl> <dbl> <dbl> <dbl> <int> <dbl>
 1     1    10    31    95    11  10.5
 2     2    20    32    96    12  11.5
 3     2    20    32    96    12  11.5
 4     3    30    33    97    13  12.5
 5     3    30    33    97    13  12.5
 6     3    30    33    97    13  12.5
 7     5    50    35    99    15  14.5
 8     5    50    35    99    15  14.5
 9     4    40    34    98    14  13.5
10     4    40    34    98    14  13.5
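
The question's df1 and df2 are not reproduced here, so the following is only a sketch with made-up data showing how a left join repeats values. It uses base R's merge() with all.x = TRUE as a stand-in for left_join; the column values are hypothetical.

```r
# Hypothetical inputs: df1 repeats ids, df2 has exactly one row per id
df1 <- data.frame(id = c(1, 2, 2, 3, 3, 3),
                  c  = c(95, 96, 96, 97, 97, 97))
df2 <- data.frame(id = c(1, 2, 3),
                  a  = c(11L, 12L, 13L),
                  b  = c(10.5, 11.5, 12.5))

# Left join: keep every row of df1 and copy the matching df2 values onto it,
# so a and b are repeated once per occurrence of each id in df1
out <- merge(df1, df2, by = "id", all.x = TRUE)
out
```

Unlike left_join, merge() sorts the result by the key by default, so the row order can differ when df1 is not already sorted by id.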

cbind a dataframe with an empty dataframe - cbind.fill?

Here's a cbind fill:

cbind.fill <- function(...){
    nm <- list(...)
    nm <- lapply(nm, as.matrix)
    n <- max(sapply(nm, nrow))
    do.call(cbind, lapply(nm, function (x)
        rbind(x, matrix(, n-nrow(x), ncol(x)))))
}

Let's try it:

x<-matrix(1:10,5,2)
y<-matrix(1:16, 4,4)
z<-matrix(1:12, 2,6)

cbind.fill(x,y)
cbind.fill(x,y,z)
cbind.fill(mtcars, mtcars[1:10,])

I think I stole this from somewhere.

EDIT STOLE FROM HERE: LINK
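
One thing worth knowing before using this on data frames: every input goes through as.matrix, so the result is a matrix and mixed column types are coerced to a common type. The sketch below repeats the function so it runs on its own; the people and scores data frames are invented for the illustration.

```r
cbind.fill <- function(...) {
  nm <- lapply(list(...), as.matrix)
  n <- max(sapply(nm, nrow))
  # Pad each matrix with NA rows up to the tallest input, then bind columns
  do.call(cbind, lapply(nm, function(x)
    rbind(x, matrix(, n - nrow(x), ncol(x)))))
}

# Hypothetical inputs with mixed column types and different row counts
people <- data.frame(id = 1:3, name = c("a", "b", "c"))
scores <- data.frame(score = c(0.5, 0.7))

res <- cbind.fill(people, scores)
dim(res)           # 3 3
is.character(res)  # TRUE: the numeric columns were coerced to character
res
```

If the column types need to survive, a join or merge on row names, as in the first answer above, is likely the safer route.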

Combine two data frames with different number of rows in R

df1 <- data.frame(wpt = c(1, "meditate", "meditate", 2,3,"meditate"), 
ID = c(1235, 4562, 0928,6351,3826,0835))
df1$wpt <- as.character(df1$wpt)

df2 <- data.frame(wpt = c(1,2,3),
fuel = c(1235, 4562, 0928),
distance = c(2,3,4))
df2$wpt <- as.character(df2$wpt)

library(dplyr)
full_join(df1, df2, by = "wpt")

Don't mind the values! You can always rearrange the columns.

       wpt   ID fuel distance
1        1 1235 1235        2
2 meditate 4562   NA       NA
3 meditate  928   NA       NA
4        2 6351 4562        3
5        3 3826  928        4
6 meditate  835   NA       NA

merging list of data frame with different number of columns in R?

You can use dplyr::bind_rows or data.table::rbindlist

dplyr::bind_rows(data)

#   x b  d  y  h  a  z
#1  1 7  5  4  8 NA NA
#2  4 8  4  7  5 NA NA
#3  1 7  5  4  8 NA NA
#4 NA 8 NA NA NA 87  7

With data.table:

data.table::rbindlist(data, fill = TRUE)
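
For reference, here is roughly what both functions do, sketched in base R. The list data is a guess at the question's input, reconstructed from the output shown above, and the names all_cols, padded and combined are made up for the illustration.

```r
# Hypothetical input: a list of data frames with different columns
data <- list(
  data.frame(x = c(1, 4), b = c(7, 8), d = c(5, 4), y = c(4, 7), h = c(8, 5)),
  data.frame(x = 1, b = 7, d = 5, y = 4, h = 8),
  data.frame(b = 8, a = 87, z = 7)
)

# Every column name that occurs anywhere, in order of first appearance
all_cols <- unique(unlist(lapply(data, names)))

# Add the missing columns to each data frame as NA, in a common order
padded <- lapply(data, function(df) {
  for (col in setdiff(all_cols, names(df))) df[[col]] <- NA
  df[all_cols]
})

# Now that all frames share the same columns, a plain rbind works
combined <- do.call(rbind, padded)
combined
```

On large lists the dedicated functions are likely to be considerably faster than this loop.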

Merge two data frames with different number of rows and columns

I am guessing you want something like this? I am not sure whether you want Type ordered the same way as in df2.

library(dplyr)
library(tibble)
merge(df1, df2, all=TRUE) %>% group_by(Type) %>% summarise_all(sum,na.rm=TRUE)
# A tibble: 3 x 4
  Type   `18/19` `16/17` `17/18`
  <chr>    <dbl>   <dbl>   <dbl>
1 Apple        7       4       0
2 Banana       7       6       5
3 Pear         6       5       2

If you do need the order from df2, you have to set it explicitly:

rowlvl <- df2$Type
collvl <- colnames(df2)
merge(df1, df2, all=TRUE) %>% select(all_of(collvl)) %>% mutate(Type=factor(Type,levels=rowlvl)) %>%
    group_by(Type) %>% summarise_all(sum,na.rm=TRUE)

# A tibble: 3 x 4
  Type   `16/17` `17/18` `18/19`
  <fct>    <dbl>   <dbl>   <dbl>
1 Apple        4       0       7
2 Pear         5       2       6
3 Banana       6       5       7

Ideal merge for different number of columns and rows in dataframe

In this case, use rbind.fill from the plyr package:

library(plyr)
rbind.fill(df1, df2, df3)

This row-binds all three data frames, filling the columns that are missing from a given data frame with NA.


