Merge data frames based on rownames in R
See ?merge:
the name "row.names" or the number 0 specifies the row names.
Example:
R> de <- merge(d, e, by=0, all=TRUE) # merge by row names (by=0 or by="row.names")
R> de[is.na(de)] <- 0 # replace NA values
R> de
Row.names a b c d e f g h i j k l m n o p q r s
1 1 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 10 11 12 13 14 15 16 17 18 19
2 2 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0 0 0 0 0 0 0 0 0
3 3 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 21 22 23 24 25 26 27 28 29
t
1 20
2 0
3 30
Merge dataframes with unequal rows, and no matching column names R
Something like this might work, provided the matching rows of df1 appear in the same order (and number) as the rows of df2:
df1[df1$TreatyYear %in% df2$TreatyYear, "Dates"] <- df2$Earned
Example
df <- data.frame(matrix(NA,4,4))
df$X1 <- 1:4
df[df$X1 %in% c(1,2),c("X3","X4")] <- c(1,2)
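The %in% assignment relies on the right-hand side lining up with the selected rows. If the order of the keys may differ between the two data frames, a lookup with match() is a safer variant. This is only a sketch: the data below is made up, and only the column names TreatyYear, Dates and Earned are taken from the answer.

```r
# Hypothetical data; only the column names come from the answer above.
df1 <- data.frame(TreatyYear = c(2001, 2002, 2003, 2004), Dates = NA_real_)
df2 <- data.frame(TreatyYear = c(2003, 2001), Earned = c(30, 10))

# For each row of df1, find the position of its key in df2 (NA if absent).
idx <- match(df1$TreatyYear, df2$TreatyYear)
hit <- !is.na(idx)

# Copy the value from the matching df2 row, regardless of row order.
df1$Dates[hit] <- df2$Earned[idx[hit]]
df1
```

Rows of df1 without a partner in df2 keep their NA.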
merge by row.name and column
Here is one option:
merge(df1, df2, by.x = "row.names", by.y = "site")
Row.names x y p q
1 a 1 4 5 10
2 b 2 5 6 11
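For reference, a self-contained version of this call might look as follows. The inputs are assumptions reconstructed to be consistent with the printed output; the asker's real data is not shown.

```r
# Assumed inputs, chosen to be consistent with the output printed above.
df1 <- data.frame(x = 1:3, y = 4:6, row.names = c("a", "b", "c"))
df2 <- data.frame(site = c("a", "b"), p = 5:6, q = 10:11)

# Join the row names of df1 against the 'site' column of df2.
# Row "c" has no partner in df2 and is dropped (inner join by default).
out <- merge(df1, df2, by.x = "row.names", by.y = "site")
out
```

Adding all.x = TRUE would keep row "c" with NA in p and q.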
how to merge or join data frame and keep the row names as well?
I guess you want to cbind the datasets while keeping the row names. One option uses data.table:
library(data.table) #data.table_1.9.5
dt <- do.call(cbind,lapply(mget(paste0("df",1:3)),
as.data.table, keep.rownames=TRUE))
setnames(dt, seq(2,ncol(dt),by=2), rep('variable',3))
setnames(dt, seq(1,ncol(dt), by=2), paste0('row.names', 1:(ncol(dt)/2)))
head(dt,2)
# row.names1 variable row.names2 variable row.names3 variable
#1: 1 0 1 1 1 1
#2: 2 0 2 1 2 0
Merging dataframes on row.names: column is put in capital letter
If we check the source code of merge.data.frame, the Row.names column is created with cbind when by is 0, i.e. when merging by row names (as mentioned in the documentation quoted below):
...
else {
if (any(by.x == 0L)) {
x <- cbind(Row.names = I(row.names(x)), x) ####
by.x <- by.x + 1L
}
if (any(by.y == 0L)) {
y <- cbind(Row.names = I(row.names(y)), y) ####
by.y <- by.y + 1L
}
...
The documentation doesn't say much about this except that
the name "row.names" or the number 0 specifies the row names. If specified by name it must correspond uniquely to a named column in the input.
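A quick way to see this behaviour, and to undo it if the capitalised column is unwanted, is sketched below. The data frames x and y are hypothetical.

```r
# Hypothetical inputs with shared row names.
x <- data.frame(a = 1:2, row.names = c("r1", "r2"))
y <- data.frame(b = 3:4, row.names = c("r1", "r2"))

m <- merge(x, y, by = "row.names")
key_name <- names(m)[1]  # the helper column added by merge.data.frame

# Move the key back into the row names and drop the helper column.
rownames(m) <- m$Row.names
m$Row.names <- NULL
m
```

If a lowercase column is preferred over row names, names(m)[1] <- "row.names" right after the merge is an alternative.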
Merge two data frames while keeping the original row order
Check out the join function in the plyr package. It's like merge, but it allows you to keep the row order of one of the data sets. Overall, it's more flexible than merge.
Using your example data, we would use join like this:
> join(df.2,df.1)
Joining by: class
object class prob
1 A 2 0.7
2 B 1 0.5
3 D 2 0.7
4 F 3 0.3
5 C 1 0.5
Here are a couple of links describing fixes to the merge function for keeping the row order:
http://www.r-statistics.com/2012/01/merging-two-data-frame-objects-while-preserving-the-rows-order/
http://r.789695.n4.nabble.com/patching-merge-to-allow-the-user-to-keep-the-order-of-one-of-the-two-data-frame-objects-merged-td4296561.html
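If adding plyr is not an option, a common base R workaround is to record the original row position, merge, and then sort back. The sketch below uses this swapped-in technique rather than join; the inputs are assumptions reconstructed from the output above, and row_id is a helper column introduced only for illustration.

```r
# Assumed inputs, reconstructed to match the join() output shown above.
df.1 <- data.frame(class = c(1, 2, 3), prob = c(0.5, 0.7, 0.3))
df.2 <- data.frame(object = c("A", "B", "D", "F", "C"),
                   class  = c(2, 1, 2, 3, 1))

# Remember the original row order of df.2 (row_id is a hypothetical helper).
df.2$row_id <- seq_len(nrow(df.2))

# Left join on 'class'; merge() gives no guarantee about the row order.
res <- merge(df.2, df.1, by = "class", all.x = TRUE, sort = FALSE)

# Restore the original order, then drop the helper column.
res <- res[order(res$row_id), c("object", "class", "prob")]
rownames(res) <- NULL
res
```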
merge 2 dataframes in r with same row names
With merge, we can set by to 'row.names':
out <- merge(df1, df2, by = 'row.names')
If we need to plot, we can use barplot from base R:
barplot(`row.names<-`(as.matrix(out[-1]),
out$Row.names), col = c('blue', 'green', 'red'), legend = TRUE)
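To make the nested call easier to follow, here is the same matrix built step by step on made-up data. df1 and df2 are hypothetical; the real inputs are not shown in the answer.

```r
# Hypothetical inputs sharing the row names a, b, c.
df1 <- data.frame(v1 = c(1, 2, 3), row.names = c("a", "b", "c"))
df2 <- data.frame(v2 = c(4, 5, 6), row.names = c("a", "b", "c"))

out <- merge(df1, df2, by = "row.names")

# Drop the Row.names column, convert the rest to a numeric matrix,
# and put the merged keys back as the matrix row names.
m <- `row.names<-`(as.matrix(out[-1]), out$Row.names)
m
```

Passing m to barplot() draws one stacked bar per column (v1, v2), with one segment per row, which is why three colours and legend = TRUE fit a three-row result.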
Or with the tidyverse:
library(ggplot2)
library(dplyr)
library(tidyr)
merge(df1, df2, by = 'row.names') %>%
  rename(nm = 'Row.names') %>%                   # rename the key column
  type.convert(as.is = TRUE) %>%                 # some columns were not of the correct type
  pivot_longer(cols = -nm) %>%                   # reshape to 'long' format
  ggplot(aes(x = name, y = value, fill = nm)) +  # plot as bars
  geom_col() +
  theme_bw()