Subtract Every Column from Each Other Column in an R data.table

Looping over the combinations within data.table:

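# all unordered pairs of column names, leaving out the fifth column of ex
comblist <- combn(names(ex)[-5], 2, FUN = list)
# one difference per pair: first column of the pair minus the second
res2 <- ex[, lapply(comblist, function(x) get(x[1]) - get(x[2]))]

# name each result column after its pair, joined by an underscore
setnames(res2, names(res2), sapply(comblist, paste, collapse = "_"))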
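
To make the pairwise loop concrete, here is a minimal sketch on a made-up table. The table `ex`, its column names and the `id` column are assumptions for illustration only. It is a variant of the code above: it uses `simplify = FALSE` so that `combn` returns a plain list, and it indexes columns with `[[` instead of `get()`.

```r
library(data.table)

# Hypothetical example data; `id` stands in for the column that is left out
ex <- data.table(a = c(5, 7), b = c(1, 2), c = c(10, 20), id = c("x", "y"))
num_cols <- setdiff(names(ex), "id")

# Every unordered pair of numeric columns, as a plain list of length-2 vectors
pairs <- combn(num_cols, 2, simplify = FALSE)

# First column of each pair minus the second
diffs <- lapply(pairs, function(p) ex[[p[1]]] - ex[[p[2]]])
names(diffs) <- sapply(pairs, paste, collapse = "_")

res <- as.data.table(diffs)
print(res)
```

Each pair appears only once here; the reverse difference (for example b minus a) is just the negative of the column already computed.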

Subtract columns in the data table in pairs

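You can split the data frame's columns into two halves, subtract the first half from the second, and assign new column names to the result. This assumes an even number of columns, ordered so that the i-th column of each half belongs together.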

n <- ncol(df)
col1 <- 1:(n/2)
col2 <- (n/2 + 1):n
new_col_name <- paste(names(df)[col2], names(df)[col1], sep = '-')
df[new_col_name] <- df[col2] - df[col1]
head(df)

#      h1    w1    e1    h2    w2    e2    h2-h1     w2-w1    e2-e1
#1  49.43 149.6 150.2 49.39 149.4 150.1 -0.03665 -0.193458 -0.09741
#2  50.10 149.7 150.8 49.03 149.6 149.6 -1.07812 -0.053813 -1.25975
#3  50.05 149.8 150.7 48.42 149.8 151.0 -1.62448 -0.007319  0.32304
#4  49.77 149.7 148.8 49.92 148.7 149.1  0.15132 -1.005730  0.23139
#5  49.44 149.9 151.0 48.39 150.9 150.0 -1.04673  0.977863 -0.97748
#6  49.58 148.8 151.1 50.41 150.6 148.6  0.83088  1.800697 -2.52930
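
For a version you can run as is, here is a small sketch of the same split-in-half idea. The data frame and its values are invented for illustration, and the `stopifnot` guard on the column count is my addition rather than part of the original answer.

```r
# Hypothetical measurements: columns 1-2 are round 1, columns 3-4 are round 2
df <- data.frame(h1 = c(50, 51), w1 = c(150, 149),
                 h2 = c(49, 53), w2 = c(151, 149))

n <- ncol(df)
stopifnot(n %% 2 == 0)          # the split needs an even number of columns

half <- n / 2
col1 <- seq_len(half)           # positions of the first half
col2 <- half + seq_len(half)    # positions of the second half

# Second half minus first half, column by column
new_col_name <- paste(names(df)[col2], names(df)[col1], sep = "-")
df[new_col_name] <- df[col2] - df[col1]
print(df)
```

Because the new names contain a hyphen, they have to be accessed with backticks or `[[`, for example `df[["h2-h1"]]`.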

Subtract a column in a dataframe from many columns in R

To subtract the second column from each of the columns 3:ncol(df):

# df[[2]] extracts the column as a vector, which is recycled down every target column
df[3:ncol(df)] <- df[3:ncol(df)] - df[[2]]
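
As a quick check of what that line does, here is a sketch on a made-up data frame (the column names `id`, `base`, `x` and `y` are hypothetical). The second column is pulled out as a plain vector, so it is subtracted row by row from every target column.

```r
# Hypothetical data: column 2 is the baseline, columns 3 onward are the targets
df <- data.frame(id = c("a", "b", "c"),
                 base = c(1, 2, 3),
                 x = c(10, 20, 30),
                 y = c(5, 5, 5))

targets <- 3:ncol(df)

# Subtract the baseline from every target column; the baseline itself is untouched
df[targets] <- df[targets] - df[[2]]
print(df)
```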

Subtract values from columns, based on groupings

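ave() with FUN = function(x) x[1] returns the first value of each Sample/Treatment group, repeated for every row of that group, so subtracting it gives each row's difference from its group's first observation:

d$M1 - ave(d$M1, d$Sample, d$Treatment, FUN = function(x) x[1])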
#[1] 0.00 0.00 0.00 0.64 0.00 0.06 0.82 0.10 0.09

For more than one column, try:

nm <- c("M1")  # add column names here
sapply(nm, function(s) {
  d[[s]] - ave(d[[s]], d$Sample, d$Treatment, FUN = function(x) x[1])
})
# M1
# [1,] 0.00
# [2,] 0.00
# [3,] 0.00
# [4,] 0.64
# [5,] 0.00
# [6,] 0.06
# [7,] 0.82
# [8,] 0.10
# [9,] 0.09

The tidyverse equivalent will probably be

d %>% group_by(Sample, Treatment) %>% mutate_at(nm, function(x) x - x[1])
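
Since `d` is not shown here, the following base R sketch uses an invented data frame with the same kind of columns (`Sample`, `Treatment`, `M1`, plus a hypothetical second measure `M2`). It applies the grouped first-value subtraction to several columns and stores the results next to the originals; the `_diff` suffix is my own naming choice.

```r
# Hypothetical data with two grouping columns and two measurement columns
d <- data.frame(
  Sample    = c("s1", "s1", "s1", "s2", "s2", "s2"),
  Treatment = c("t1", "t1", "t2", "t1", "t2", "t2"),
  M1        = c(1.0, 1.5, 2.0, 3.0, 4.0, 4.5),
  M2        = c(10, 12, 20, 30, 29, 35)
)

nm <- c("M1", "M2")

# For each measurement column, subtract the first value of its Sample/Treatment group
d[paste0(nm, "_diff")] <- lapply(nm, function(s) {
  d[[s]] - ave(d[[s]], d$Sample, d$Treatment, FUN = function(x) x[1])
})
print(d)
```

The first row of every group comes out as 0, which is a handy sanity check that the grouping is what you expect.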

How to subtract one column from multiple columns in a dataframe in R using dplyr

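This is a consequence of how mutate_at works: it updates the columns one at a time, so once India has been replaced by zeros, every column processed after it has zero subtracted from it. You could switch to across (as suggested by @RonakShah), which computes all columns from the original India values: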

library(gapminder)
library(dplyr)
library(tidyr)

gapminder %>%
  select(country, year, gdpPercap) %>%
  pivot_wider(names_from = country, values_from = gdpPercap) %>%
  arrange(year) %>%
  mutate(across(-matches('year'), ~ . - India)) %>%
  select(year, India, Vietnam)

With mutate_at, you need to make sure that the column used in the calculation is the last one in your data, so that it is overwritten only after all the other columns have used it. You can use relocate to move it, as below:

gapminder %>%
  select(country, year, gdpPercap) %>%
  pivot_wider(names_from = country, values_from = gdpPercap) %>%
  arrange(year) %>%
  relocate(India, .after = last_col()) %>%
  mutate_at(vars(-matches('year')), ~ . - India) %>%
  select(year, India, Vietnam)

Output:

# A tibble: 12 x 3
    year India Vietnam
   <int> <dbl>   <dbl>
 1  1952     0    58.5
 2  1957     0    86.2
 3  1962     0   114.
 4  1967     0   -63.6
 5  1972     0   -24.5
 6  1977     0   -99.8
 7  1982     0  -148.
 8  1987     0  -156.
 9  1992     0  -175.
10  1997     0   -72.9
11  2002     0    17.7
12  2007     0   -10.6
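
If you want to see the difference between the two verbs without the gapminder data, here is a small sketch on a made-up tibble. The column names `a`, `b` and `z` are hypothetical, with `a` playing the role of India and deliberately placed first.

```r
library(dplyr)

# Hypothetical data: `a` is the reference column and comes before the others
df <- tibble(year = 1:3,
             a = c(10, 20, 30),
             b = c(1, 2, 3),
             z = c(5, 5, 5))

# across(): every column is computed from the original values of `a`
with_across <- df %>% mutate(across(-year, ~ .x - a))

# mutate_at(): columns are updated in order, so `a` is already zero
# by the time `b` and `z` are processed
with_mutate_at <- df %>% mutate_at(vars(-year), ~ . - a)

print(with_across)
print(with_mutate_at)
```

In the `mutate_at` result, `b` and `z` come back unchanged, which is exactly the symptom that moving the reference column to the end works around.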

R: How to repeatedly subtract specific columns from different series of columns, and output to a new dataframe?

Others probably have better ways, but here is one possibility.

  1. Load two libraries and convert dfOld to a data.table.
library(data.table)
library(magrittr)
setDT(dfOld)

  2. Collect the column names into a list: every fourth column after ID (D, H) is a reference column, and the three columns before it are the ones it will be subtracted from.
lv = names(dfOld)[-1][seq(1, ncol(dfOld) - 1) %% 4 > 0]
lv = split(lv, ceiling(seq_along(lv) / 3))
names(lv) = names(dfOld)[-1][seq(1, ncol(dfOld) - 1) %% 4 == 0]

lv looks like this:

> lv
$D
[1] "A" "B" "C"

$H
[1] "E" "F" "G"

  3. This is a bit convoluted, but basically I take each element of the lv list and melt the corresponding columns of dfOld to long format, so that all subtractions for that element happen at once. I then keep only the variables I need and bind the resulting list of data.tables into a single data.table using rbindlist.
res = rbindlist(lapply(names(lv), function(x) {
  melt(dfOld, id = c("ID", x), measure.vars = lv[[x]]) %>%
    .[, `:=`(nc = value - get(x), variable = paste0(variable, "-", x))] %>%
    .[, .(ID, variable, nc)]
}))

  4. The last step is simple: dcast back to wide format.
dcast(res, ID ~ variable, value.var = "nc")

Output

    ID A-D B-D C-D E-H F-H G-H
 1:  1 -66 -65 -63 -33   2 -30
 2:  2  -4  -3  -1  -4  -3  -1
 3:  3  -4  -3  -1  34  -3  -1
 4:  4   3   0   0   3   0   0
 5:  5   3   3   3   3  47   3
 6:  6   1   0  -4   1   0  -4
 7:  7   0  -6  -2   0  -6  -2
 8:  8  -8  -2  -5  -8  -2  -5
 9:  9 -69 -78 -72 -69 -18 -72
10: 10   5   1   6   5   1   6
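
If the reshaping feels like too much machinery, the same table can probably be built without melt and dcast. The sketch below is an alternative to the steps above, not the original method: it loops over the groups directly and adds one difference column at a time. The contents of `dfOld` are invented, and `lv` is written out by hand to match the structure printed earlier.

```r
library(data.table)

# Hypothetical input: ID, then blocks of three value columns followed by a reference
dfOld <- data.table(ID = 1:3,
                    A = c(1, 2, 3), B = c(4, 5, 6), C = c(7, 8, 9), D = c(10, 10, 10),
                    E = c(2, 4, 6), F = c(1, 1, 1), G = c(0, 5, 9), H = c(1, 2, 3))

# Reference column name -> the columns it is subtracted from
lv <- list(D = c("A", "B", "C"), H = c("E", "F", "G"))

# Start from the ID column and append one "<col>-<ref>" column per pair
res <- dfOld[, .(ID)]
for (ref in names(lv)) {
  for (col in lv[[ref]]) {
    res[, (paste0(col, "-", ref)) := dfOld[[col]] - dfOld[[ref]]]
  }
}
print(res)
```

This keeps the column order of `lv` and avoids the intermediate long table, at the cost of an explicit loop.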

Consecutively subtracting columns in data.table

This is how you can do this in a tidy way:

# make it tidy
df2 <- melt(df,
            id = "player_id",
            variable.name = "column_name",
            value.name = "prestige_score")
# extract numbers from column names
df2[, score_number := as.numeric(gsub("prestige_score_", "", column_name))]
# compute differences by player (each score minus the next one)
df2[, diff := prestige_score - shift(prestige_score, n = 1L, type = "lead"),
    by = player_id]

# if necessary, reshape back to original format
dcast(df2, player_id ~ score_number, value.var = c("prestige_score", "diff"))
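
If you would rather stay in wide format, a sketch along the following lines should give the same differences (each score minus the next one). The table contents and the `diff_` column names are assumptions for illustration, and the code relies on the score columns already being in the right order.

```r
library(data.table)

# Hypothetical wide table with three consecutive scores per player
df <- data.table(player_id = c(1, 2),
                 prestige_score_1 = c(10, 20),
                 prestige_score_2 = c(7, 25),
                 prestige_score_3 = c(4, 25))

score_cols <- grep("^prestige_score_", names(df), value = TRUE)
left  <- head(score_cols, -1)   # scores 1 .. k-1
right <- tail(score_cols, -1)   # scores 2 .. k

# Each score minus the one that follows it, matching the "lead" shift above
diffs <- Map(function(l, r) df[[l]] - df[[r]], left, right)
diff_names <- paste0("diff_", seq_along(left))
df[, (diff_names) := diffs]
print(df)
```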

