Paste Two Data.Table Columns

paste two data.table columns

Arun's comment answered this question:

dt[,new:=paste0(A,B)]
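A minimal, self-contained sketch of that one-liner. It assumes a table named dt with character columns A and B; the sample values are invented for illustration.

```r
library(data.table)

# Hypothetical sample data; only the column names A and B come from the answer
dt <- data.table(A = c("g", "h", "i"), B = c("l", "m", "n"))

# := adds the new column by reference, so dt does not need to be reassigned
dt[, new := paste0(A, B)]
print(dt)
```

Because paste0() is vectorised, the concatenation is done row by row without an explicit loop.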

Paste multiple data.table columns into single column based on unique values

We can use do.call(paste, ...) after selecting the columns in the desired order with .SDcols, then remove the duplicate words with a regular expression:

dt1[, .(VAR6 = sub(",", " ",
                   gsub("\\b(\\w+)\\b\\s*,\\s*(?=.*\\1)", "",
                        do.call(paste, c(.SD, sep = ",")), perl = TRUE))),
    .SDcols = names(dt1)[c(2:1, 3:5)]]
# VAR6
#1: 100 Brick,Place
#2: 23 Sand,Location,Tree
#3: 76 Concrete,Place,Wood
#4: 43 Stone,Vista,Forest

Or group by row number and paste the unique values within each row:

V6 <- dt1[, sprintf("%s %s, %s", VAR2, VAR1,
                    toString(unique(unlist(.SD)))),
          by = 1:nrow(dt1), .SDcols = VAR3:VAR5]$V1
data.table(V6)
# V6
#1: 100 Brick, Place
#2: 23 Sand, Location, Tree
#3: 76 Concrete, Place, Wood
#4: 43 Stone, Vista, Forest
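The row-wise approach can be tried on a small table. This is only a sketch: the original dt1 is not shown, so the two rows below are hypothetical values chosen to match the column names VAR1 to VAR5 used in the answer.

```r
library(data.table)

# Hypothetical stand-in for dt1 (the real data is not shown in the answer)
dt1 <- data.table(
  VAR1 = c("Brick", "Sand"),
  VAR2 = c(100L, 23L),
  VAR3 = c("Place", "Location"),
  VAR4 = c("Place", "Tree"),
  VAR5 = c("Place", "Location")
)

# One group per row: .SD holds VAR3..VAR5 for that row, and unique()
# drops repeated words before toString() joins them with ", "
V6 <- dt1[, sprintf("%s %s, %s", VAR2, VAR1,
                    toString(unique(unlist(.SD)))),
          by = seq_len(nrow(dt1)),
          .SDcols = c("VAR3", "VAR4", "VAR5")]$V1
print(V6)
```

Grouping by row is easy to read but runs j once per row, so it may be slow on large tables; the regex version avoids the per-row grouping.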

How to combine two or more columns of data table to one column?

You can use tidyr::unite:

dt.test <- tidyr::unite(dt.test, date, year:hour, sep = '-')
dt.test

# date id type
#1: 2018-01-01-00 8750 ist
#2: 2018-01-02-01 3048 plan
#3: 2018-01-03-02 3593 ist
#4: 2018-01-04-03 8475 plan
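If you would rather avoid the tidyr dependency, the same result can be sketched with data.table alone. The column names year, month, day and hour and the sample values are assumptions, since the original dt.test is not shown.

```r
library(data.table)

# Hypothetical stand-in for dt.test; column names and values are assumed
dt.test <- data.table(
  year  = 2018L,
  month = c("01", "01"),
  day   = c("01", "02"),
  hour  = c("00", "01"),
  id    = c(8750L, 3048L),
  type  = c("ist", "plan")
)

cols <- c("year", "month", "day", "hour")

# Paste the selected columns row-wise, then drop the source columns
dt.test[, date := do.call(paste, c(.SD, sep = "-")), .SDcols = cols]
dt.test[, (cols) := NULL]
setcolorder(dt.test, "date")  # move the new column to the front
print(dt.test)
```

Unlike unite, this version modifies dt.test by reference rather than returning a new object.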

Paste two character columns with `data.table`

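Use the sep argument of paste() instead of collapse. sep is placed between the elements of each row, whereas collapse joins the whole result into a single string: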

dt[, new := paste(A, B, sep = ".")]
dt
# L A B new
#1: 1 g l g.l
#2: 2 h m h.m
#3: 3 i n i.n
#4: 4 j o j.o
#5: 5 k p k.p

paste0() has no sep argument; anything passed as sep is treated as one more string to concatenate (see ?paste0).
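The difference between sep, collapse and paste0() is easiest to see side by side. A small base R sketch with made-up vectors:

```r
a <- c("g", "h")
b <- c("l", "m")

# sep goes between the elements of each pair; the result keeps length 2
with_sep <- paste(a, b, sep = ".")

# collapse joins the pasted pairs (built with the default sep = " ")
# into a single string
with_collapse <- paste(a, b, collapse = ".")

# paste0() has no sep argument, so "." is appended as an extra string
with_paste0 <- paste0(a, b, sep = ".")

print(with_sep)       # "g.l" "h.m"
print(with_collapse)  # "g l.h m"
print(with_paste0)    # "gl." "hm."
```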

Efficient way to paste multiple column pairs in R data.table

One option is Map, using seq to build the indices of the odd and even columns:

i1 <- seq(1, length(dt) - 1, 2)
i2 <- seq(2, length(dt) - 1, 2)
dt[, Map(paste,
         .SD[, i1, with = FALSE], .SD[, i2, with = FALSE],
         MoreArgs = list(sep = "-")),
   by = "ids"]
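To make the Map approach reproducible, here is a sketch on a small table. The column names x1, x2, y1, y2, z1, z2 and the values are assumptions chosen to fit the x/y/z output shown in this answer; the original dt is not given.

```r
library(data.table)

# Hypothetical input: an id column followed by three column pairs
dt <- data.table(
  ids = 1:3,
  x1 = c("A", "B", "C"), x2 = 1:3,
  y1 = c("D", "E", "F"), y2 = 4:6,
  z1 = c("G", "H", "I"), z2 = 7:9
)

# .SD excludes the grouping column, hence length(dt) - 1 columns
i1 <- seq(1, length(dt) - 1, 2)  # first column of each pair: 1, 3, 5
i2 <- seq(2, length(dt) - 1, 2)  # second column of each pair: 2, 4, 6

# Map pastes pair k of the first set with pair k of the second set
res <- dt[, Map(paste,
                .SD[, i1, with = FALSE], .SD[, i2, with = FALSE],
                MoreArgs = list(sep = "-")),
          by = "ids"]
print(res)
```

The result columns take their names from the first column of each pair, so they may need renaming afterwards.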

Another option is to split the columns by their names (with the trailing digits removed) and then paste each group:

data.frame(lapply(split.default(dt[, -1, with = FALSE],
sub("\\d+$", "", names(dt)[-1])), function(x) do.call(paste, c(x, sep="-"))))
# x y z
#1 A-1 D-4 G-7
#2 B-2 E-5 H-8
#3 C-3 F-6 I-9

A third option uses melt and dcast:

dcast(melt(dt, id.var = 'ids')[,  paste(value, collapse = "-"),
.(grp = sub("\\d+", "", variable), ids)], ids ~ grp, value.var = 'V1')

data.table merge by multiple columns

You can use the statement provided by David Arenburg in a comment:

setkey(df1, lsr, ppr)
setkey(df2, li, pro)
df1[df2, alpha := i.alpha]

Since data.table v1.9.6 (v1.9.5 was the devel version when this answer was written), joins can be performed directly with the on argument, without setting keys first:

df1[df2, alpha := i.alpha, on = c(lsr="li", ppr="pro")]

On older versions that lack the on argument, use the setkey approach above.
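A small sketch of the on-based update join. The column names lsr, ppr, li, pro and alpha come from the answer; the rows are hypothetical.

```r
library(data.table)

# Hypothetical data: df1 is updated with alpha looked up from df2
df1 <- data.table(lsr = c(1L, 2L, 3L), ppr = c("a", "b", "c"))
df2 <- data.table(li = c(1L, 3L), pro = c("a", "c"), alpha = c(0.1, 0.3))

# Match df1$lsr to df2$li and df1$ppr to df2$pro; i.alpha refers to the
# alpha column of the table in the i position (df2)
df1[df2, alpha := i.alpha, on = c(lsr = "li", ppr = "pro")]
print(df1)
```

Rows of df1 with no match in df2 keep NA in alpha, and neither table is re-sorted, which is one practical difference from setkey.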

Data Table R: Merge selected columns from multiple data.table

Change by = "ID" to by = c("ID", "FDR", "LogFC"), and pass the allow.cartesian argument inside the merge call:

DT.comb <- Reduce(function(...) merge.data.table(...,
by= c("ID", "FDR", "LogFC"), all = TRUE, allow.cartesian=TRUE), dt.list)
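The Reduce pattern can be sketched on three small tables. The value columns s1, s2, s3 and all rows are hypothetical, and the sketch calls the merge generic, which dispatches to the data.table method, instead of calling merge.data.table directly.

```r
library(data.table)

# Hypothetical list of tables sharing the key columns ID, FDR and LogFC
dt.list <- list(
  data.table(ID = c("g1", "g2"), FDR = c(0.01, 0.20),
             LogFC = c(1.5, -0.3), s1 = c(10, 20)),
  data.table(ID = c("g1", "g3"), FDR = c(0.01, 0.05),
             LogFC = c(1.5, 2.0), s2 = c(11, 31)),
  data.table(ID = c("g2", "g3"), FDR = c(0.20, 0.05),
             LogFC = c(-0.3, 2.0), s3 = c(22, 32))
)

# Reduce folds the list pairwise: merge(merge(dt1, dt2), dt3)
DT.comb <- Reduce(
  function(x, y) merge(x, y, by = c("ID", "FDR", "LogFC"),
                       all = TRUE, allow.cartesian = TRUE),
  dt.list
)
print(DT.comb)
```

With all = TRUE, an ID that is missing from one table still appears in the result, with NA in that table's value column.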

Merge two columns of data table on condition

You can try fcoalesce if you are working with data.table

> setDT(df)[, lab3 := fcoalesce(lab2, lab1)][]
lab1 lab2 lab3
1: 5 7 7
2: 8 10 10
3: NA 3 3
4: 9 NA 9
5: NA NA NA
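For a reproducible version, the input can be rebuilt from the output shown above. Only the construction of df is an assumption; the fcoalesce call is the one from the answer.

```r
library(data.table)

# Input reconstructed from the printed result above
df <- data.frame(lab1 = c(5, 8, NA, 9, NA),
                 lab2 = c(7, 10, 3, NA, NA))

# fcoalesce returns, per row, the first non-NA value among its arguments,
# so lab2 wins whenever it is present and lab1 is the fallback
setDT(df)[, lab3 := fcoalesce(lab2, lab1)]
print(df)
```

Swapping the arguments to fcoalesce(lab1, lab2) would instead prefer lab1 and fall back to lab2.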

