Merge dataframes, different lengths
You could add a join variable to dat2 and then use merge:
dat2$variable <- rownames(dat2)
merge(dat1, dat2)
variable ID value concreteness familiarity typicality
1 amoeba 1 0 3.60 1.30 1.71
2 amoeba 2 0 3.60 1.30 1.71
3 amoeba 3 NA 3.60 1.30 1.71
4 bacterium 1 0 3.82 3.48 2.13
5 bacterium 2 0 3.82 3.48 2.13
6 bacterium 3 0 3.82 3.48 2.13
7 leech 1 1 5.71 1.83 4.50
8 leech 2 1 5.71 1.83 4.50
9 leech 3 0 5.71 1.83 4.50
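The question's data is not shown, so the following is only a sketch with hypothetical dat1 and dat2 reconstructed from the output above. It repeats the approach and also shows a variant that joins on the row names directly through by.y = "row.names", which avoids adding the extra column.

```r
# Hypothetical inputs, reconstructed from the printed result (an assumption)
dat1 <- data.frame(
  variable = rep(c("amoeba", "bacterium", "leech"), each = 3),
  ID       = rep(1:3, times = 3),
  value    = c(0, 0, NA, 0, 0, 0, 1, 1, 0)
)
dat2 <- data.frame(
  concreteness = c(3.60, 3.82, 5.71),
  familiarity  = c(1.30, 3.48, 1.83),
  typicality   = c(1.71, 2.13, 4.50),
  row.names    = c("amoeba", "bacterium", "leech")
)

# Approach from the answer: copy the row names into a join column
dat2_keyed <- dat2
dat2_keyed$variable <- rownames(dat2_keyed)
res <- merge(dat1, dat2_keyed, by = "variable")

# Variant: match dat1$variable against the row names of dat2 directly
res_rn <- merge(dat1, dat2, by.x = "variable", by.y = "row.names")
```

Each row of dat2 is repeated for every matching row of dat1, which is why the shorter data frame does not need to be padded first.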
Merge multiple data.frames in R with varying row length
You want to merge the result with df3, i.e.:
merge(df3, merge(df1, df2, by="Year", all.x=TRUE, all.y=TRUE), by = "Year", all.x = TRUE, all.y = TRUE)
# Year Site3 Site1 Site2
#1 2006 <NA> 2.3 <NA>
#2 2007 <NA> 1 2.7
#3 2008 1.3 3.1 4.1
#4 2009 2 2.9 1.1
#5 2010 3.6 1.4 2.6
#6 2011 1.7 3 3.1
Or, if you have your data.frames in a list, use Reduce to generalize the above:
Reduce(function(x,y) merge(x, y, by = "Year", all.x = TRUE, all.y = TRUE),
list(df1, df2, df3))
# Year Site1 Site2 Site3
#1 2006 2.3 <NA> <NA>
#2 2007 1 2.7 <NA>
#3 2008 3.1 4.1 1.3
#4 2009 2.9 1.1 2
#5 2010 1.4 2.6 3.6
#6 2011 3 3.1 1.7
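For a version that can be run as is, here is a minimal sketch with made-up numeric data frames (the values mimic the output above but are an assumption, since the original inputs are not shown). It uses all = TRUE, which is shorthand for setting both all.x and all.y to TRUE.

```r
# Hypothetical inputs with different numbers of rows
df1 <- data.frame(Year = 2006:2011, Site1 = c(2.3, 1, 3.1, 2.9, 1.4, 3))
df2 <- data.frame(Year = 2007:2011, Site2 = c(2.7, 4.1, 1.1, 2.6, 3.1))
df3 <- data.frame(Year = 2008:2011, Site3 = c(1.3, 2, 3.6, 1.7))

# Fold merge() over the list; each step is a full outer join on Year
merged <- Reduce(
  function(x, y) merge(x, y, by = "Year", all = TRUE),
  list(df1, df2, df3)
)
```

Years missing from a shorter data frame end up as NA in that data frame's column.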
How can I combine two dataframes with different lengths in R?
Try using left_join in the dplyr package.
library(dplyr)
# make fake data
df1 <- data.frame(id = c("A", "B", "C", "D", "E"), val = rpois(5, 5))
df2 <- data.frame(id = c("A", "B", "C", "E"), val = rpois(4, 20))
# use left_join
df3 <- left_join(df1, df2, by = "id")
# rename and set NAs to 0
names(df3) <- c("id", "val1", "val2")
df3[is.na(df3)] <- 0
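The renaming step can also be left to the join itself through the suffix argument of left_join, which keeps the two value columns distinguishable. This is a sketch with fixed, made-up values instead of rpois so that the result is reproducible; the suffix labels are arbitrary.

```r
library(dplyr)

# Hypothetical data; "D" is missing from df2
df1 <- data.frame(id = c("A", "B", "C", "D", "E"), val = c(4, 6, 5, 3, 7))
df2 <- data.frame(id = c("A", "B", "C", "E"), val = c(21, 18, 25, 19))

# Keep every row of df1; clashing column names get the given suffixes
df3 <- left_join(df1, df2, by = "id", suffix = c("_df1", "_df2"))

# Replace the NA produced for the unmatched id with 0
df3$val_df2[is.na(df3$val_df2)] <- 0
```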
how to combine two data frames of different lengths?
This is too long for a comment, but I just need to demonstrate that the solution I gave in the comments does work. If you are having problems getting merge to work, then there must be some other issue with your data, which we cannot diagnose because you did not provide a dput of your data.frames.
df1 = read.table(text =
"Date Duration
6/27/2014 10.00
6/30/2014 20.00
7/11/2014 15.00",
header = TRUE)
df2 = read.table(text =
"Date Percent_Removal
6/27/2014 20.39
6/30/2014 27.01
7/7/2014 49.84
7/11/2014 59.48
7/17/2014 99.04",
header = TRUE)
df1$Date <- as.Date(df1$Date, format = "%m/%d/%Y")
df2$Date <- as.Date(df2$Date, format = "%m/%d/%Y")
df3 = merge(df1,df2)
# Date Duration Percent_Removal
# 1 2014-06-27 10 20.39
# 2 2014-06-30 20 27.01
# 3 2014-07-11 15 59.48
Note that no additional options need to be specified in the merge statement because:
- The default value of by is the column names that are common to both data frames. In this case, only Date is shared.
- The default values of all.x, all.y and all give the desired behaviour, where only the rows that are in both data frames are kept.
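To see what those defaults control, the sketch below rebuilds the same two data frames directly (skipping read.table) and compares the default inner join with all = TRUE. The variable names inner and outer are only illustrative.

```r
df1 <- data.frame(
  Date     = as.Date(c("2014-06-27", "2014-06-30", "2014-07-11")),
  Duration = c(10, 20, 15)
)
df2 <- data.frame(
  Date            = as.Date(c("2014-06-27", "2014-06-30", "2014-07-07",
                              "2014-07-11", "2014-07-17")),
  Percent_Removal = c(20.39, 27.01, 49.84, 59.48, 99.04)
)

# Default: by = "Date" (the only shared column), all = FALSE
inner <- merge(df1, df2)

# all = TRUE keeps unmatched rows from both sides and fills with NA
outer <- merge(df1, df2, all = TRUE)
```

With all = TRUE, the two dates that appear only in df2 are kept and their Duration is NA.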
Merge many R data frames by row.names with differing lengths
We could get all the datasets into a list and use merge with Reduce, specifying the by as a new column created from the row names:
lst1 <- lapply(mget(ls(pattern = '^df\\d+$')), \(x)
transform(x, rn =row.names(x)))
out <- Reduce(function(...) merge(..., by = 'rn', all = TRUE),
lst1)
row.names(out) <- out[[1]]
out <- out[-1]
Output:
out
v1 v2 v3
chr1 10 6 NA
chr2 43 64 20
chr3 1 NA 30
chr4 44 21 40
chr5 598 98 50
chr6 NA 10 60
chr7 NA 20 70
Or using tidyverse with full_join, after creating a row names column with rownames_to_column (from tibble):
library(dplyr)
library(tibble)
library(purrr)
mget(ls(pattern = '^df\\d+$')) %>%
map(~ .x %>%
rownames_to_column('rn')) %>%
reduce(full_join, by = 'rn') %>%
column_to_rownames("rn")
v1 v2 v3
chr1 10 6 NA
chr2 43 64 20
chr3 1 NA 30
chr4 44 21 40
chr5 598 98 50
chr6 NA 10 60
chr7 NA 20 70
Merging data frames of different row length in R
You can achieve the desired result by using merge:
merge(df.A, df.B, by = 'Category', all = TRUE)
which will produce the following output:
# Category Number.x Number.y
#1 A 1 5
#2 B 2 6
#3 C 3 7
#4 D 4 NA
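The .x and .y suffixes are the merge defaults for non-key columns that share a name. As a sketch, with hypothetical df.A and df.B reconstructed from the output above, the suffixes argument can give the columns more descriptive names:

```r
# Hypothetical inputs, reconstructed from the printed result (an assumption)
df.A <- data.frame(Category = c("A", "B", "C", "D"), Number = 1:4)
df.B <- data.frame(Category = c("A", "B", "C"), Number = 5:7)

# Full outer join; the clashing Number columns get custom suffixes
res <- merge(df.A, df.B, by = "Category", all = TRUE,
             suffixes = c(".A", ".B"))
```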
Merging data frames with different number of rows and different columns
If A and B are the two input data frames, here are some solutions:
1) merge This solution works regardless of whether A or B has more rows.
merge(data.frame(A, row.names=NULL), data.frame(B, row.names=NULL),
by = 0, all = TRUE)[-1]
The first two arguments could be replaced with just A and B respectively if A and B have default rownames, i.e. 1, 2, ..., or if they have consistent rownames. That is, merge(A, B, by = 0, all = TRUE)[-1].
For example, if we have this input:
# test inputs
A <- data.frame(BOD, row.names = letters[1:6])
B <- setNames(2 * BOD[1:2, ], c("X", "Y"))
then:
merge(data.frame(A, row.names=NULL), data.frame(B, row.names=NULL),
by = 0, all = TRUE)[-1]
gives:
Time demand X Y
1 1 8.3 2 16.6
2 2 10.3 4 20.6
3 3 19.0 NA NA
4 4 16.0 NA NA
5 5 15.6 NA NA
6 7 19.8 NA NA
1a) An equivalent variation is:
do.call("merge", c(lapply(list(A, B), data.frame, row.names=NULL),
by = 0, all = TRUE))[-1]
2) cbind.zoo This solution assumes that A has more rows and that B's entries are all of the same type, e.g. all numeric. A is not restricted. These conditions hold in the data of the question.
library(zoo)
data.frame(A, cbind(zoo(, 1:nrow(A)), as.zoo(B)))