Multiply Many Columns by a Specific Other Column in R with Data.Table

Multiply many columns by a specific other column in R with data.table?

You could try lapply over .SD together with :=, which updates the columns by reference:

DT[, (inc_cols) := lapply(.SD, function(x) x * DT[['deflator']]),
   .SDcols = inc_cols]
head(DT, 2)
# id year inc1 inc2 inc3 deflator
#1: 1 3 0.614838304 0.009796974 0.3236051 0.7735552
#2: 2 2 -0.001583579 -0.082289606 -0.1365115 -0.6644330

Or, if you need a loop (start again from the original DT, because := has already modified the columns by reference):

for (inc in inc_cols) {
  nm1 <- as.symbol(inc)               # column name as a symbol
  DT[, (inc) := eval(nm1) * deflator] # overwrite the column in place
}

head(DT,2)
# id year inc1 inc2 inc3 deflator
#1: 1 3 0.614838304 0.009796974 0.3236051 0.7735552
#2: 2 2 -0.001583579 -0.082289606 -0.1365115 -0.6644330

Or a possible option using set(), which should be very fast because the overhead of [.data.table is avoided (suggested by @Arun):

indx <- grep('^inc', colnames(DT))  # positions of the income columns

for (j in indx) {
  set(DT, i = NULL, j = j, value = DT[[j]] * DT[['deflator']])
}
head(DT,2)
# id year inc1 inc2 inc3 deflator
#1: 1 3 0.614838304 0.009796974 0.3236051 0.7735552
#2: 2 2 -0.001583579 -0.082289606 -0.1365115 -0.6644330

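where inc_cols, which must be defined before running the first two snippets, holds the names of the income columns: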

inc_cols <-  grep('^inc', colnames(DT), value=TRUE)

data

set.seed(24)
DT <- data.table(id=1:1000,year=round(runif(1000)*10),
inc1 = runif(1000), inc2 = runif(1000), inc3 = runif(1000),
deflator = rnorm(1000))
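
Because := and set() modify the table by reference, running the snippets above one after another on the same DT would apply the deflator more than once. The following is a minimal sketch, assuming data.table is installed, that runs two of the approaches on separate copies and checks that they agree. The smaller data set and the names A and B are hypothetical and only serve the illustration.

```r
library(data.table)

set.seed(24)
DT <- data.table(id = 1:10, inc1 = runif(10), inc2 = runif(10),
                 deflator = rnorm(10))
inc_cols <- grep('^inc', colnames(DT), value = TRUE)

# Approach 1: lapply over .SD, on a copy so DT itself stays untouched
A <- copy(DT)
A[, (inc_cols) := lapply(.SD, function(x) x * A[['deflator']]),
  .SDcols = inc_cols]

# Approach 2: set() in a loop, on a second copy
B <- copy(DT)
for (col in inc_cols) {
  set(B, i = NULL, j = col, value = B[[col]] * B[['deflator']])
}

# Both copies should hold the same deflated values
same_result <- isTRUE(all.equal(A, B))
```

Without copy(), A and B would be additional names for the same table and the second approach would deflate already deflated values.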

How to multiply columns from two different data.table by a matching condition?

If you just want the multiplied vector, you can align the two tables with match():

require(data.table)

DT1 <- structure(list(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), ratio = c(0.3,
0.2, 0.4, 0.1, 0.7, 0.3, 0.5, 0.9, 0.1, 0.4)), class = "data.frame", row.names = c(NA,
-10L))

DT2 <- structure(list(ID = 1:10, number = c(NA, NA, 488L, NA, NA, 600L,
789L, 503L, NA, NA)), class = "data.frame", row.names = c(NA,
-10L))

setDT(DT1)
setDT(DT2)

DT1$ratio[match(DT2$ID, DT1$ID)] * DT2$number

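Note the order of the arguments to match(): match(DT2$ID, DT1$ID) returns, for each ID in DT2, the position of that ID in DT1, so the ratios are reordered to line up with the rows of DT2 before multiplying.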
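
To see why the argument order matters, here is a small base R sketch with hypothetical data in which the IDs of the first table are deliberately shuffled. The values are made up for illustration and are not the data from the answer.

```r
# Hypothetical tables: DT1 is not sorted by ID
DT1 <- data.frame(ID = c(3, 1, 2), ratio = c(0.4, 0.3, 0.2))
DT2 <- data.frame(ID = 1:3, number = c(10L, NA, 488L))

# For each ID in DT2, find its row position in DT1
idx <- match(DT2$ID, DT1$ID)   # 2 3 1

# Reorder the ratios to DT2's row order, then multiply element-wise
result <- DT1$ratio[idx] * DT2$number
```

Rows where number is NA stay NA in the result, and an ID in DT2 that has no partner in DT1 would also produce NA, because match() returns NA for it.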

Multiply column by every other column

We can use combn to generate every combination of two column names and pass it a custom function that subsets the data frame by each pair of names and multiplies the two columns.

combn(names(df1), 2, function(x) df1[x[1]] * df1[x[2]], simplify = FALSE)

For the given example (a data frame with columns a, b, c and d), this returns a list of 6 data frames: a*b, a*c, a*d, b*c, b*d and c*d.
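
If a single data frame of products is more convenient than a list, one possible variation is sketched below. The data frame df1 and the a_b style column names are assumptions made for illustration.

```r
# Hypothetical input with four numeric columns
df1 <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)

# All pairs of column names, kept as a list of length-2 character vectors
pairs <- combn(names(df1), 2, simplify = FALSE)

# Multiply the two columns of each pair as plain vectors
products <- lapply(pairs, function(x) df1[[x[1]]] * df1[[x[2]]])

# Label each product by the pair it came from, e.g. "a_b"
names(products) <- vapply(pairs, paste, character(1), collapse = "_")

out <- as.data.frame(products)
```

Using [[ ]] instead of [ ] extracts vectors rather than one-column data frames, which makes it straightforward to bind the results into one table.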

Multiplying all columns in dataframe by single column

Also try

df1 * t(C)
# F1 F2 F3
#1 2.0 2.0 2.0
#2 5.0 5.0 5.0
#3 16.0 16.0 16.0
#4 4.5 4.5 4.5

When we multiply two data frames directly, they must be the same size, so this fails:

df1 * C

Error in Ops.data.frame(df1, C) :
‘*’ only defined for equally-sized data frames

t() turns C into a matrix, i.e. a vector of length 4 with a dimension attribute. When we multiply it with df1, this vector is recycled down each column, so row i of df1 is multiplied by the i-th value of C.

What would also work in the same way (and might be faster than transposing C):

df1 * C$C

or

df1 * unlist(C)
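
The recycling can be checked with a small sketch. The data below are hypothetical, not the data from the question. The approach only lines up row by row because the vector has exactly as many elements as df1 has rows.

```r
# Hypothetical data: 4 rows, and a one-column data frame of multipliers
df1 <- data.frame(F1 = c(1, 2, 3, 4), F2 = c(10, 20, 30, 40))
C   <- data.frame(C = c(2, 2.5, 4, 1.5))

# The plain vector is recycled down each column of df1
by_vector <- df1 * C$C

# The same result written out column by column
by_column <- data.frame(F1 = df1$F1 * C$C, F2 = df1$F2 * C$C)

# sweep() along the rows (MARGIN = 1) states the intent explicitly
by_sweep <- sweep(df1, 1, C$C, `*`)
```

If the multiplier vector had a different length from nrow(df1), recycling would no longer match values to rows, so sweep() with MARGIN = 1 may be the safer choice when the sizes are not guaranteed.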

Bin multiple columns in a data.table with respect to values in another column

You can put functions/expressions in the by argument:

my_dt_melt[, list(maxcount = max(count), sumcount = sum(count)),
by = .(
range = cut(
experiment,
c(0,4,8,10),
labels = c('bin_1', 'bin_2', 'bin_3')),
sample
)]
# range sample maxcount sumcount
# 1: bin_1 obs_s1 3 8
# 2: bin_2 obs_s1 5 15
# 3: bin_3 obs_s1 4 7
# 4: bin_1 obs_s2 3 8
# 5: bin_2 obs_s2 4 16
# 6: bin_3 obs_s2 4 7
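
To check which values land in which bin before grouping, cut() can be tried on its own. This is a base R sketch with a hypothetical experiment vector of 1 to 10. With the default settings the intervals are closed on the right, i.e. (0,4], (4,8] and (8,10].

```r
# Hypothetical experiment numbers
experiment <- 1:10

# Same breaks and labels as in the by argument above
bins <- cut(experiment,
            breaks = c(0, 4, 8, 10),
            labels = c('bin_1', 'bin_2', 'bin_3'))

# Number of experiments per bin
counts <- table(bins)
```

A value equal to the lowest break (0 here) would become NA unless include.lowest = TRUE is passed to cut().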

Multiply multiple columns in a data frame with specific values from another data frame

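We could expand the weights to the same dimensions as df1 and multiply. Note that this version indexes the weights by column position, so it assumes the rows of df2 are in the same order as the columns of df1: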

out <- setNames(df2$weight, df2$company)[col(df1)] * df1
names(out) <- paste0(names(out), ".weighted")

Or another option is

df1 * split(df2$weight, df2$company)[names(df1)]

Or with match

df2$weight[match(names(df1), df2$company)][col(df1)] * df1

Or using sweep

sweep(df1[df2$company], 2, df2$weight, FUN = `*`)
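
A small sketch of the match-based variant, with hypothetical data in which the rows of df2 are deliberately in a different order from the columns of df1. The company names A, B and C and all values are assumptions for illustration.

```r
# Hypothetical data: one column per company, weights listed in another order
df1 <- data.frame(A = c(1, 2), B = c(3, 4), C = c(5, 6))
df2 <- data.frame(company = c("C", "A", "B"),
                  weight  = c(10, 2, 3),
                  stringsAsFactors = FALSE)

# Put the weights into the column order of df1
w <- df2$weight[match(names(df1), df2$company)]   # 2 3 10

# Multiply every column by its own weight
out <- sweep(df1, 2, w, `*`)
names(out) <- paste0(names(out), ".weighted")
```

Matching by name first means the result does not depend on how df2 happens to be sorted.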

