Paste two data.table columns
Arun's comment answered this question:
dt[,new:=paste0(A,B)]
Paste multiple data.table columns into single column based on unique values
We can use do.call(paste, ...) after selecting the columns in the desired order in .SDcols, then remove the duplicate words with a regular expression:
dt1[, .(VAR6 = sub(",", " ", gsub("\\b(\\w+)\\b\\s*,\\s*(?=.*\\1)", "",
        do.call(paste, c(.SD, sep=",")), perl = TRUE))),
    .SDcols = names(dt1)[c(2:1, 3:5)]]
# VAR6
#1: 100 Brick,Place
#2: 23 Sand,Location,Tree
#3: 76 Concrete,Place,Wood
#4: 43 Stone,Vista,Forest
Alternatively, group by row number and do the paste within each group:
V6 <- dt1[, sprintf("%s %s, %s", VAR2, VAR1,
                    toString(unique(unlist(.SD)))),
          1:nrow(dt1), .SDcols = VAR3:VAR5]$V1
data.table(V6)
# V6
#1: 100 Brick, Place
#2: 23 Sand, Location, Tree
#3: 76 Concrete, Place, Wood
#4: 43 Stone, Vista, Forest
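The input dt1 is not shown in the answer, so here is a minimal sketch of the row-wise approach on made-up data. The values below are hypothetical, chosen only so the result resembles the output above. The column names VAR1 to VAR5 come from the answer.

```r
library(data.table)

# Hypothetical input (not from the original question)
dt1 <- data.table(
  VAR1 = c("Brick", "Sand"),
  VAR2 = c(100, 23),
  VAR3 = c("Place", "Location"),
  VAR4 = c("Place", "Tree"),
  VAR5 = c("Place", "Location")
)

# Group by row number so unique() only sees one row's values at a time
res <- dt1[, .(V6 = sprintf("%s %s, %s", VAR2, VAR1,
                            toString(unique(unlist(.SD))))),
           by = .(row = seq_len(nrow(dt1))),
           .SDcols = c("VAR3", "VAR4", "VAR5")]
res$V6
```

Grouping by row is convenient but can be slow on large tables, which is presumably why the answer lists the vectorised regex version first.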
How to combine two or more columns of data table to one column?
You may use tidyr::unite:
dt.test <- tidyr::unite(dt.test, date, year:hour, sep = '-')
dt.test
# date id type
#1: 2018-01-01-00 8750 ist
#2: 2018-01-02-01 3048 plan
#3: 2018-01-03-02 3593 ist
#4: 2018-01-04-03 8475 plan
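If adding a tidyr dependency is not desirable, a similar result can probably be reached with data.table alone. The sketch below assumes the columns to combine are named year, month, day and hour and are stored as character. The question's actual columns are not shown, so treat both the names and the values as hypothetical.

```r
library(data.table)

# Hypothetical input modelled on the output above
dt.test <- data.table(
  year = "2018", month = "01",
  day = c("01", "02"), hour = c("00", "01"),
  id = c(8750, 3048), type = c("ist", "plan")
)

cols <- c("year", "month", "day", "hour")

# Paste the selected columns row by row, then drop the originals
dt.test[, date := do.call(paste, c(.SD, sep = "-")), .SDcols = cols]
dt.test[, (cols) := NULL]

# Move the new column to the front, as unite() does
setcolorder(dt.test, "date")
dt.test$date
```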
Paste two character columns with `data.table`
Just use sep as the parameter to paste() instead of collapse:
dt[, new := paste(A, B, sep = ".")]
dt
# L A B new
#1: 1 g l g.l
#2: 2 h m h.m
#3: 3 i n i.n
#4: 4 j o j.o
#5: 5 k p k.p
paste0() has no sep parameter: anything passed as sep is treated as just another string to concatenate (see ?paste0).
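The difference between the three calls is easiest to see on plain vectors. This is a small base R sketch; the vectors A and B reuse the first two values from the table above.

```r
A <- c("g", "h")
B <- c("l", "m")

# sep goes between the elements of each row
with_sep <- paste(A, B, sep = ".")        # "g.l" "h.m"

# paste0() has no sep argument, so "." is appended as a third string
with_paste0 <- paste0(A, B, sep = ".")    # "gl." "hm."

# collapse joins the rows into one string, after pasting with the default sep " "
collapsed <- paste(A, B, collapse = ".")  # "g l.h m"
```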
Efficient way to paste multiple column pairs in R data.table
One option uses Map, after creating the column indices with seq:
i1 <- seq(1, length(dt)-1, 2)
i2 <- seq(2, length(dt)-1, 2)
dt[, Map(paste,
         .SD[, i1, with = FALSE], .SD[, i2, with = FALSE],
         MoreArgs = list(sep="-")),
   by = "ids"]
Another option is to split the columns by the prefix of their names and then paste each group:
data.frame(lapply(split.default(dt[, -1, with = FALSE],
                                sub("\\d+$", "", names(dt)[-1])),
                  function(x) do.call(paste, c(x, sep="-"))))
# x y z
#1 A-1 D-4 G-7
#2 B-2 E-5 H-8
#3 C-3 F-6 I-9
Yet another option uses melt/dcast:
dcast(melt(dt, id.var = 'ids')[, paste(value, collapse = "-"),
                               .(grp = sub("\\d+", "", variable), ids)],
      ids ~ grp, value.var = 'V1')
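The input dt is not shown in the answer. The sketch below reconstructs a plausible one from the printed output (so the values are an assumption) and pairs the columns by their shared name prefix. It is a simplified variant of the split-by-name idea, not the answer's own code, and it assumes every prefix has exactly two columns suffixed 1 and 2.

```r
library(data.table)

# Assumed input, reconstructed from the output shown above
dt <- data.table(ids = 1:3,
                 x1 = c("A", "B", "C"), x2 = 1:3,
                 y1 = c("D", "E", "F"), y2 = 4:6,
                 z1 = c("G", "H", "I"), z2 = 7:9)

# Strip the trailing digits to get one prefix per column pair
prefixes <- unique(sub("\\d+$", "", names(dt)[-1]))
names(prefixes) <- prefixes  # so the result columns are named x, y, z

# Paste the "<prefix>1" and "<prefix>2" columns for each prefix
res <- as.data.table(lapply(prefixes, function(p) {
  paste(dt[[paste0(p, 1)]], dt[[paste0(p, 2)]], sep = "-")
}))
res
```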
data.table merge by multiple columns
You can use the statement provided by David Arenburg in comment:
setkey(df1, lsr, ppr)
setkey(df2, li, pro)
df1[df2, alpha := i.alpha]
Since version 1.9.5 (the devel version at the time of writing), joins can be performed directly, without having to set keys, using the on argument:
df1[df2, alpha := i.alpha, on = c(lsr="li", ppr="pro")]
The on argument has been available on CRAN since v1.9.6, so a devel installation is not needed with that release or any later one.
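As a minimal sketch of what the update join does, here it is on two small made-up tables. The column names follow the answer, while the values are hypothetical.

```r
library(data.table)

# Hypothetical data; only the column names come from the answer
df1 <- data.table(lsr = c(1, 2, 3), ppr = c("a", "b", "c"))
df2 <- data.table(li = c(1, 3), pro = c("a", "c"), alpha = c(0.1, 0.3))

# Update join: rows of df1 that match df2 on both column pairs receive
# df2's alpha (referred to as i.alpha); unmatched rows get NA
df1[df2, alpha := i.alpha, on = c(lsr = "li", ppr = "pro")]
df1$alpha
```

In the named vector passed to on, the names are columns of df1 and the values are the matching columns of df2.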
Data Table R: Merge selected columns from multiple data.table
Just change by = "ID" to by = c("ID", "FDR", "LogFC"), and pass the allow.cartesian argument inside the merge call:
DT.comb <- Reduce(function(...) merge.data.table(...,
    by= c("ID", "FDR", "LogFC"), all = TRUE, allow.cartesian=TRUE), dt.list)
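To make the Reduce pattern concrete, the sketch below merges a short list of made-up tables. The contents of dt.list, including the extra columns A and B, are hypothetical. Only the key column names come from the answer.

```r
library(data.table)

# Hypothetical list of tables sharing the three key columns
dt.list <- list(
  data.table(ID = c("g1", "g2"), FDR = c(0.01, 0.20),
             LogFC = c(1.5, -0.3), A = 1:2),
  data.table(ID = c("g1", "g3"), FDR = c(0.01, 0.05),
             LogFC = c(1.5, 2.0), B = 3:4)
)

# Reduce folds the list pairwise: merge(merge(t1, t2), t3), and so on.
# merge() dispatches to the data.table method for data.table inputs.
DT.comb <- Reduce(function(x, y) merge(x, y,
                                       by = c("ID", "FDR", "LogFC"),
                                       all = TRUE, allow.cartesian = TRUE),
                  dt.list)
DT.comb
```

With all = TRUE this is a full outer join, so rows present in only one table are kept and their missing columns are filled with NA.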
Merge two columns of data table on condition
You can try fcoalesce if you are working with data.table:
setDT(df)[, lab3 := fcoalesce(lab2, lab1)][]
#    lab1 lab2 lab3
#1:    5    7    7
#2:    8   10   10
#3:   NA    3    3
#4:    9   NA    9
#5:   NA   NA   NA
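For a runnable version, the sketch below rebuilds df from the printed output. The construction of df is an assumption, since the question's data is not shown.

```r
library(data.table)

# Input reconstructed from the output above (assumed, not from the question)
df <- data.frame(lab1 = c(5, 8, NA, 9, NA),
                 lab2 = c(7, 10, 3, NA, NA))

# fcoalesce returns, element by element, the first non-NA value among its
# arguments, so lab2 wins where present and lab1 fills the gaps
setDT(df)[, lab3 := fcoalesce(lab2, lab1)]
df$lab3
```

Note that fcoalesce expects its arguments to have the same type, so an integer and a double column would need converting first.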