Merging Multiple Data.Tables

Merging multiple data.tables

One option (untested) is to fold merge over a list of the tables with Reduce:

Reduce(merge,list(DT1,DT2,DT3,...))
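As a minimal sketch of what that call does, the three small tables below are made up for illustration (DT1, DT2, DT3 and their columns are hypothetical). Reduce applies merge pairwise from left to right, so the result should match nesting the merge calls by hand. With no keys set on the inputs, the join column here ends up being the shared column id.

```r
library(data.table)

# Hypothetical tables sharing a single column, "id"
DT1 <- data.table(id = 1:3, a = c("x", "y", "z"))
DT2 <- data.table(id = 2:4, b = c(10, 20, 30))
DT3 <- data.table(id = 1:3, flag = c(TRUE, FALSE, TRUE))

# Reduce folds merge over the list: merge(merge(DT1, DT2), DT3)
folded <- Reduce(merge, list(DT1, DT2, DT3))
nested <- merge(merge(DT1, DT2), DT3)

# merge defaults to an inner join, so only ids found in every table survive
print(folded)
```

To pass extra arguments such as by or all, wrap merge in an anonymous function, e.g. Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), list(DT1, DT2, DT3)).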

Chaining multiple data.table::merge operations with data.tables

Multiple data.table joins with the on argument can be chained. Note that without an update operator (":=") in j, x[i, on = ...] returns one row for each row of i, which is a right join. With ":=" in j, columns are added to x by reference and every row of x is kept, so it behaves like a left outer join (provided each row of x matches at most one row of i). See also the post "Left join using data.table".
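The example data is not reproduced in this answer, so here is a reconstruction that is consistent with the printed results further down (an assumption, not the original definitions). It also sketches the right-join versus left-join difference by joining against only the first two rows of dt2.

```r
library(data.table)

# Reconstructed from the printed results in this answer (an assumption)
dt1 <- data.table(food = c("apples", "bananas", "carrots", "dates"),
                  quantity = 1:4)
dt2 <- data.table(food = c("apples", "bananas", "carrots", "dates"),
                  status = c("good", "bad", "rotten", "raw"))
dt3 <- data.table(food = c("apples", "bananas", "carrots", "dates"),
                  rank = c("okay", "good", "better", "best"))

# Plain join: one row per row of i, so only 2 rows come back (right join)
right <- dt1[dt2[1:2], on = "food"]

# Update join: all 4 rows of x are kept, unmatched rows get NA (left join).
# copy() stops := from modifying dt1 itself.
left <- copy(dt1)[dt2[1:2], on = "food", status := i.status]

print(right)
print(left)
```

Because ":=" works by reference, running the chained examples directly on dt1 also adds the status column to dt1 itself; use copy(dt1) if that is not wanted.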

Example using the example data above, with a subset between the joins:

# Note: the first := also adds status to dt1 itself (update by reference)
dt4 <- dt1[dt2, on = "food", `:=`(status = i.status)][
  food == "apples"][dt3, on = "food", rank := i.rank]

##> dt4
##     food quantity status rank
##1: apples        1   good okay

Example adding a new column between the joins:

dt4 <- dt1[dt2, on = "food", `:=`(status = i.status)][
  , new_col := NA][dt3, on = "food", rank := i.rank]

##> dt4
##      food quantity status new_col   rank
##1:  apples        1   good      NA   okay
##2: bananas        2    bad      NA   good
##3: carrots        3 rotten      NA better
##4:   dates        4    raw      NA   best

Example using merge and magrittr pipes:

library(data.table)
library(magrittr)

dt4 <- merge(dt1, dt2, by = "food") %>%
  set(j = "new_col", value = NA) %>%
  merge(dt3, by = "food")

##> dt4
##      food quantity status new_col   rank
##1:  apples        1   good      NA   okay
##2: bananas        2    bad      NA   good
##3: carrots        3 rotten      NA better
##4:   dates        4    raw      NA   best

R data.table: How can I merge a list of data.tables?

We can do a join with the on argument inside Reduce, then drop the incomplete rows with na.omit:

library(data.table)
na.omit(Reduce(function(x, y) x[y, on = .(V1)], dtl))
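A small sketch of this, assuming dtl is a list of data.tables that all carry a V1 column (the list below is invented for illustration). Each step x[y, on = .(V1)] keeps every row of y, so rows without a match pick up NA, and na.omit then drops them.

```r
library(data.table)

# Hypothetical list of tables that share the join column V1
dtl <- list(
  data.table(V1 = c("a", "b", "c"), p = 1:3),
  data.table(V1 = c("b", "c", "d"), q = 4:6),
  data.table(V1 = c("c", "d", "e"), r = 7:9)
)

# Fold the join over the list; unmatched rows are filled with NA
joined <- Reduce(function(x, y) x[y, on = .(V1)], dtl)

# Keep only the rows that found a match in every table
complete <- na.omit(joined)
print(complete)
```

Note that na.omit also removes rows whose NA values were already present in the input tables, so this only equals an inner join when the inputs themselves have no missing values.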

r merge multiple data tables using lists of data table names

I think this should work, assuming allSites is a character vector holding the names of the tables:

combined.sites <- Reduce(merge, lapply(allSites, get))

Let me know if it doesn't.
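For reference, a self-contained sketch of the same idea with made-up tables (siteA, siteB, siteC and their columns are hypothetical). It swaps lapply(allSites, get) for base R's mget, which returns a named list, and passes by explicitly so the join column does not depend on keys.

```r
library(data.table)

# Hypothetical site tables and a character vector holding their names
siteA <- data.table(species = c("ant", "bee", "cat"), countA = 1:3)
siteB <- data.table(species = c("bee", "cat", "dog"), countB = 4:6)
siteC <- data.table(species = c("cat", "dog", "eel"), countC = 7:9)
allSites <- c("siteA", "siteB", "siteC")

# mget() looks the names up and returns a named list of the tables;
# envir = environment() makes the lookup scope explicit
site.list <- mget(allSites, envir = environment())

# Inner-join the tables pairwise on the shared column
combined.sites <- Reduce(function(x, y) merge(x, y, by = "species"),
                         site.list)
print(combined.sites)
```

Stating the environment matters mostly inside functions, where a bare get called through lapply may not see local variables.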

How to merge a list of data.tables without getting split columns?

The by argument in base::merge defaults to intersect(names(x), names(y)), where x and y are the two tables to be merged. Hence, base::merge also uses V3 as a merging key.

The by argument in data.table::merge defaults to the shared key columns of the two tables (i.e. sid and id in this case). Since both tables also have a non-key column named V3, suffixes (.x and .y by default) are appended to tell the two copies apart in the result.

So if your intention is to merge by all common columns, you can identify the common columns, set them as keys, then merge:

commcols <- Reduce(intersect, lapply(L, names))
L.dt <- lapply(L, function(x) setkeyv(data.table(x), commcols))
M2 <- Reduce(function(...) merge(..., all=TRUE), L.dt)
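The difference can be seen on a toy pair of tables. This is a sketch with invented data (A, B and their values are hypothetical; only the column names sid, id and V3 come from the question): both tables are keyed on sid and id and also share a non-key column V3.

```r
library(data.table)

# Hypothetical tables: keyed on sid and id, with an extra shared column V3
A <- data.table(sid = c(1, 1), id = c(1, 2), V3 = c("p", "q"),
                key = c("sid", "id"))
B <- data.table(sid = c(1, 1), id = c(1, 2), V3 = c("p", "q"),
                V4 = c(TRUE, FALSE), key = c("sid", "id"))

# Default by = shared key columns, so V3 is not a join column and is duplicated
by_key <- merge(A, B)
names(by_key)   # sid, id, V3.x, V3.y, V4

# Merging on every common column keeps a single V3
commcols <- intersect(names(A), names(B))
by_all <- merge(A, B, by = commcols)
names(by_all)   # sid, id, V3, V4
```

Passing by = commcols directly is an alternative to re-keying each table first; either way the join columns are stated rather than inferred from the keys.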

Data Table R: Merge selected columns from multiple data.table

Change by = "ID" to by = c("ID", "FDR", "LogFC") (column names are case-sensitive, so spell them exactly as they appear in the tables), and pass allow.cartesian inside the merge call:

DT.comb <- Reduce(function(...) merge.data.table(...,
  by = c("ID", "FDR", "LogFC"), all = TRUE, allow.cartesian = TRUE), dt.list)

merge multiple table with different length and form a single table in R

According to ?merge, merge joins only two datasets at a time:

merge(x, y, ...)

where

x, y - data frames, or objects to be coerced to one.

One option is to place the datasets in a list and use Reduce to join them sequentially:

lst1 <- list(study, genotyping_techs_table, platforms_table,
             ancestries_table, ancestral_groups_table,
             countries_of_recruitment_table,
             countries_of_origin_table, publication)

out <- Reduce(function(...) merge(..., by = "study_id", all = TRUE), lst1)
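A scaled-down sketch of the same pattern, using three invented tables of different lengths (the rows and the non-key columns are hypothetical stand-ins; only study_id is taken from the question). With all = TRUE every step is a full outer join, so a study_id missing from one table is kept and padded with NA.

```r
library(data.table)

# Hypothetical stand-ins for the real tables; only study_id is shared
study       <- data.table(study_id = c("S1", "S2", "S3"),
                          n = c(100, 250, 80))
platforms   <- data.table(study_id = c("S1", "S3"),
                          platform = c("array", "wgs"))
publication <- data.table(study_id = c("S2", "S3", "S4"),
                          year = c(2018, 2020, 2021))

lst <- list(study, platforms, publication)

# all = TRUE makes each step a full outer join, so no study_id is dropped
out <- Reduce(function(x, y) merge(x, y, by = "study_id", all = TRUE), lst)
print(out)
```

If a table has several rows per study_id, each merge step multiplies the matching rows, so it may be worth checking nrow(out) against what you expect.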

