Cbind 2 Dataframes with Different Number of Rows

cbind 2 dataframes with different number of rows

I think you should instead use merge:

merge(df1, df2, by = "year", all = TRUE)

For your data:

df1 <- data.frame(
  year = 2010:2016,
  avg  = c(.3, .29, .275, .280, .295, .33, .315),
  hr   = c(31, 30, 14, 24, 18, 26, 40),
  sal  = c(2000, 4000, 600, 800, 1000, 7000, 9000)
)
df2 <- data.frame(
  year = c(2010, 2011, 2013, 2014, 2015),
  pos  = c('A', 'B', 'C', 'B', 'D'),
  fld  = c(.99, .995, .97, .98, .99)
)

cbind is meant to column-bind two data frames that are compatible in every sense, in particular having the same number of rows. What you are after is a true merge: no rows from either data frame are discarded, and wherever one data frame has no matching row, the missing values are filled with NA.
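As a rough illustration of the difference, the sketch below uses two small stand-in data frames (the names batting and fielding and their values are made up for this example). It shows cbind failing on mismatched row counts, while merge with all = TRUE keeps every year and pads with NA.

```r
# Small stand-ins for the two data frames above (names and values are illustrative)
batting  <- data.frame(year = 2010:2016, hr = c(31, 30, 14, 24, 18, 26, 40))
fielding <- data.frame(year = c(2010, 2011, 2013, 2014, 2015),
                       pos  = c("A", "B", "C", "B", "D"))

# cbind() refuses to combine data frames with 7 and 5 rows;
# capture the error message instead of stopping the script
cbind_result <- tryCatch(cbind(batting, fielding),
                         error = function(e) conditionMessage(e))
print(cbind_result)

# merge() with all = TRUE keeps every year and fills the gaps with NA
merged <- merge(batting, fielding, by = "year", all = TRUE)
print(merged)
```

The years 2012 and 2016 appear only in the first data frame, so their pos values come back as NA.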

cbind a dataframe with an empty dataframe - cbind.fill?

Here's a cbind fill:

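cbind.fill <- function(...){
  # coerce every input to a matrix
  nm <- list(...)
  nm <- lapply(nm, as.matrix)
  # the tallest input sets the number of rows
  n <- max(sapply(nm, nrow))
  # pad each matrix with NA rows up to n rows, then bind the columns
  do.call(cbind, lapply(nm, function (x)
    rbind(x, matrix(NA, n-nrow(x), ncol(x)))))
}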

Let's try it:

x<-matrix(1:10,5,2)
y<-matrix(1:16, 4,4)
z<-matrix(1:12, 2,6)

cbind.fill(x,y)
cbind.fill(x,y,z)
cbind.fill(mtcars, mtcars[1:10,])

I think I stole this from somewhere.

EDIT STOLE FROM HERE: LINK

bind columns with different number of rows

Here's one way. By design, the merge function adds NA values wherever no match is found between the data frames (e.g., when one data frame has fewer rows than the other).

If you assume that you're matching your data frames (what rows go together) based on the row number, just output the row number as a column in your data frames. Then merge on that column. Merge will automatically add the NA values you want and deal with the fact that the data frames have different numbers of rows.

#test data frame 1
a <- c(1, 3, 2)
b <- c(3, 4, 2)
dat <- as.data.frame(cbind(a, b))

#test data frame 2 (this one has fewer rows than the first data frame)
c <- c(5, 6)
dat.new <- as.data.frame(c)

#add column to each data frame with row number
#(seq_len gives integers, so merge sorts 1, 2, ..., 10 rather than "1", "10", "2")
dat$number <- seq_len(nrow(dat))
dat.new$number <- seq_len(nrow(dat.new))

#merge data frames
#"all = TRUE" will mean that NA values will be added whenever there is no match
finaldata <- merge(dat, dat.new, by = "number", all = TRUE)

Binding dataframes of different length (no cbind, no merge)

Edit: If there are more than two data frames, do this:

  • Create a list of all the data frames except one, say the first
  • Use purrr::reduce to join them all together
  • Pass the first data frame in the .init argument

library(dplyr)
library(purrr)

df2 <- data.frame(m=7:10, n=sample(LETTERS[6:9],4))
df <- data.frame(v=1:5, x=sample(LETTERS[1:5],5))
df3 <- data.frame(bb = 101:110, cc = sample(letters, 10))

reduce(list(df2, df3), .init = df %>% mutate(id = row_number()),
       ~full_join(.x, .y %>% mutate(id = row_number()), by = "id")) %>%
  select(-id)

    v    x  m    n  bb cc
1   1    A 10    I 101  u
2   2    C  9    H 102  v
3   3    D  8    G 103  n
4   4    E  7    F 104  w
5   5    B NA <NA> 105  s
6  NA <NA> NA <NA> 106  y
7  NA <NA> NA <NA> 107  g
8  NA <NA> NA <NA> 108  i
9  NA <NA> NA <NA> 109  p
10 NA <NA> NA <NA> 110  h

Earlier answer: create a dummy id column in both data frames and use full_join:

full_join(df %>% mutate(id = row_number()), df2 %>% mutate(id = row_number()), by = "id") %>%
  select(-id)

  v x  m    n
1 1 A 10    I
2 2 C  9    H
3 3 D  8    G
4 4 E  7    F
5 5 B NA <NA>

The results differ from the expected output because of a different random number seed.

Or in base R:

merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = TRUE)

  id v x  m    n
1  1 1 A 10    I
2  2 2 C  9    H
3  3 3 D  8    G
4  4 4 E  7    F
5  5 5 B NA <NA>

Remove the extra id column by subsetting with [-1]:

merge(transform(df, id = seq_len(nrow(df))), transform(df2, id = seq_len(nrow(df2))), all = TRUE)[-1]

  v x  m    n
1 1 A 10    I
2 2 C  9    H
3 3 D  8    G
4 4 E  7    F
5 5 B NA <NA>
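The base R approach can also be extended to more than two data frames by folding merge over a list with Reduce. The following is only a sketch: cbind_by_row is a hypothetical helper name, the input data frames are made up, and it assumes none of the inputs already has a column called id.

```r
# Hypothetical helper: bind any number of data frames column-wise by row position.
# Assumes no input already contains a column named "id".
cbind_by_row <- function(...) {
  dfs <- list(...)
  # give every data frame an integer row-position key
  keyed <- lapply(dfs, function(d) transform(d, id = seq_len(nrow(d))))
  # fold the list with a full outer merge on that key
  merged <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), keyed)
  # restore row order and drop the helper key
  merged[order(merged$id), setdiff(names(merged), "id"), drop = FALSE]
}

# Illustrative inputs with 5, 4 and 10 rows
df_a <- data.frame(v = 1:5, x = letters[1:5])
df_b <- data.frame(m = 7:10)
df_c <- data.frame(bb = 101:110)

res <- cbind_by_row(df_a, df_b, df_c)
print(res)
```

Because the key is an integer rather than a character row name, merge keeps the rows in numeric order even when there are ten or more of them.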

Bind/Merge Two Data Frames with Differing Number of Rows

This is straightforward with dplyr's join functions. Specifically, you're looking for full_join.

The function takes three main arguments: the first two are the data frames you want to join, and the third, by, names the key column on which to join them. In this case, by = 'Team'.
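A minimal sketch of such a join is shown below. The data frames batting and pitching and their values are invented for illustration, since the original question's data is not reproduced here; only the Team key column is taken from the answer.

```r
library(dplyr)

# Hypothetical data: two data frames sharing a "Team" key column
batting  <- data.frame(Team = c("A", "B", "C"), runs = c(700, 650, 720))
pitching <- data.frame(Team = c("A", "C", "D"), era = c(3.9, 4.2, 3.5))

# full_join() keeps every Team found in either data frame;
# columns with no match on one side are filled with NA
joined <- full_join(batting, pitching, by = "Team")
print(joined)
```

Team B has no pitching row and team D has no batting row, so those cells come back as NA rather than the rows being dropped.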

Merging data frames with different number of rows and different columns

If A and B are the two input data frames, here are some solutions:

1) merge This solution works regardless of whether A or B has more rows.

merge(data.frame(A, row.names=NULL), data.frame(B, row.names=NULL),
  by = 0, all = TRUE)[-1]

Here by = 0 tells merge to match on row names. The first two arguments could be replaced with just A and B respectively if A and B have default row names, i.e. 1, 2, ..., or if they have consistent row names. That is, merge(A, B, by = 0, all = TRUE)[-1].

For example, if we have this input:

# test inputs
A <- data.frame(BOD, row.names = letters[1:6])
B <- setNames(2 * BOD[1:2, ], c("X", "Y"))

then:

merge(data.frame(A, row.names=NULL), data.frame(B, row.names=NULL),
  by = 0, all = TRUE)[-1]

gives:

  Time demand  X    Y
1    1    8.3  2 16.6
2    2   10.3  4 20.6
3    3   19.0 NA   NA
4    4   16.0 NA   NA
5    5   15.6 NA   NA
6    7   19.8 NA   NA

1a) An equivalent variation is:

do.call("merge", c(lapply(list(A, B), data.frame, row.names=NULL),
  by = 0, all = TRUE))[-1]

2) cbind.zoo This solution assumes that A has more rows and that B's entries are all of the same type, e.g. all numeric. A is not restricted. These conditions hold in the data of the question.

library(zoo)
data.frame(A, cbind(zoo(, 1:nrow(A)), as.zoo(B)))

Combine two data frames with different number of rows in R


df1 <- data.frame(wpt = c(1, "meditate", "meditate", 2, 3, "meditate"),
                  ID = c(1235, 4562, 0928, 6351, 3826, 0835))  # leading zeros are dropped
df1$wpt <- as.character(df1$wpt)

df2 <- data.frame(wpt = c(1, 2, 3),
                  fuel = c(1235, 4562, 0928),
                  distance = c(2, 3, 4))
df2$wpt <- as.character(df2$wpt)

library(dplyr)
full_join(df1, df2, by = "wpt")

Don't mind the values! You can always rearrange the columns.

       wpt   ID fuel distance
1        1 1235 1235        2
2 meditate 4562   NA       NA
3 meditate  928   NA       NA
4        2 6351 4562        3
5        3 3826  928        4
6 meditate  835   NA       NA

Combine two data frames by rows (rbind) when they have different sets of columns

rbind.fill from the package plyr might be what you are looking for.
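A short sketch of what that looks like, assuming the plyr package is installed (the data frames first and second are made up for this example):

```r
# Hypothetical data frames with overlapping but different sets of columns
first  <- data.frame(x = 1:2, y = c("p", "q"))
second <- data.frame(x = 3:4, z = c(TRUE, FALSE))

# rbind.fill() stacks the rows and fills columns missing from either input with NA
stacked <- plyr::rbind.fill(first, second)
print(stacked)
```

Calling the function as plyr::rbind.fill avoids attaching plyr, which can mask dplyr functions of the same name when both packages are loaded.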


