Subtracting Two Columns to Give a New Column in R

Subtracting two columns to give a new column in R

As @Bryan Hanson was saying in the above comment, your syntax and data organization relates more to a data frame. I would treat your data as a data frame and simply use the syntax you provided earlier:

> data <- data.frame(A = c(1,2,3,4), B = c(2,2,2,2))
> data$C <- (data$A - data$B)
> data
A B C
1 1 2 -1
2 2 2 0
3 3 2 1
4 4 2 2

How to subtract two columns using tidyverse mutate with columns named by external variables

We may use .data to subset the column as a vector. The all_of/any_of are used along with across to loop across the columns

library(dplyr)
iris %>%
mutate(calculation = .data[[var1]] - .data[[var2]])%>%
head

-output

Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1 5.1 3.5 1.4 0.2 setosa 1.6
2 4.9 3.0 1.4 0.2 setosa 1.9
3 4.7 3.2 1.3 0.2 setosa 1.5
4 4.6 3.1 1.5 0.2 setosa 1.5
5 5.0 3.6 1.4 0.2 setosa 1.4
6 5.4 3.9 1.7 0.4 setosa 1.5

Or may also use cur_data()

iris %>%
head %>%
mutate(calculation = cur_data()[[var1]] - cur_data()[[var2]])

-output

 Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1 5.1 3.5 1.4 0.2 setosa 1.6
2 4.9 3.0 1.4 0.2 setosa 1.9
3 4.7 3.2 1.3 0.2 setosa 1.5
4 4.6 3.1 1.5 0.2 setosa 1.5
5 5.0 3.6 1.4 0.2 setosa 1.4
6 5.4 3.9 1.7 0.4 setosa 1.5

Or another option is to pass both the variables in across, and then reduce with -

library(purrr)
iris %>%
head %>%
mutate(calculation = reduce(across(all_of(c(var1, var2))), `-`))

-output

Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1 5.1 3.5 1.4 0.2 setosa 1.6
2 4.9 3.0 1.4 0.2 setosa 1.9
3 4.7 3.2 1.3 0.2 setosa 1.5
4 4.6 3.1 1.5 0.2 setosa 1.5
5 5.0 3.6 1.4 0.2 setosa 1.4
6 5.4 3.9 1.7 0.4 setosa 1.5

Or could convert to symbol and evaluate (!!)

iris %>% 
head %>%
mutate(calculation = !! rlang::sym(var1) - !! rlang::sym(var2))
Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1 5.1 3.5 1.4 0.2 setosa 1.6
2 4.9 3.0 1.4 0.2 setosa 1.9
3 4.7 3.2 1.3 0.2 setosa 1.5
4 4.6 3.1 1.5 0.2 setosa 1.5
5 5.0 3.6 1.4 0.2 setosa 1.4
6 5.4 3.9 1.7 0.4 setosa 1.5

Or if we want to use all_of in across, just subset the column with [[

iris %>% 
head %>%
mutate(calculation = across(all_of(var1))[[1]] -
across(all_of(var2))[[1]])
Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1 5.1 3.5 1.4 0.2 setosa 1.6
2 4.9 3.0 1.4 0.2 setosa 1.9
3 4.7 3.2 1.3 0.2 setosa 1.5
4 4.6 3.1 1.5 0.2 setosa 1.5
5 5.0 3.6 1.4 0.2 setosa 1.4
6 5.4 3.9 1.7 0.4 setosa 1.5

The reason we need to subset is because, across by default will update the original column when the .names is not present. The calculation will be a data.frame with a single column

out <- iris %>%
head %>%
mutate(calculation = across(all_of(var1)) -
across(all_of(var2)))
out
Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length
1 5.1 3.5 1.4 0.2 setosa 1.6
2 4.9 3.0 1.4 0.2 setosa 1.9
3 4.7 3.2 1.3 0.2 setosa 1.5
4 4.6 3.1 1.5 0.2 setosa 1.5
5 5.0 3.6 1.4 0.2 setosa 1.4
6 5.4 3.9 1.7 0.4 setosa 1.5

str(out)
data.frame': 6 obs. of 6 variables:
$ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4
$ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9
$ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7
$ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4
$ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1
$ calculation :'data.frame': 6 obs. of 1 variable:
..$ Sepal.Length: num 1.6 1.9 1.5 1.5 1.4 1.5

R: How to repeatedly subtract specific columns from different series of columns, and output to a new dataframe?

Probably others have better ways - but here is one possibility.

  1. load two libraries and set dfOld to data.table
library(data.table)
library(magrittr)
setDT(dfOld)

  1. get information about the columns, and make into a list.
lv = names(dfOld)[-1][seq(1,ncol(dfOld)-1)%%4>0]
lv = split(lv, ceiling(seq_along(lv)/3))
names(lv) = names(dfOld)[-1][seq(1,ncol(dfOld)-1)%%4==0]

lv looks like this:

> lv
$D
[1] "A" "B" "C"

$H
[1] "E" "F" "G"

  1. This is a bit convoluted, but basically, I'm taking each of the elements of the lv list, and I'm reshaping columns from dfOld, so I can do all subtractions at once. Then I'm retaining only the variables I need, and binding each of the resulting list of data.tables into a single datatable using rbindlist
res =rbindlist(lapply(names(lv), function(x)  {
melt(dfOld,id=c("ID", x),measure.vars = lv[[x]]) %>%
.[,`:=`(nc=value-get(x),variable=paste0(variable,"-",x))] %>%
.[,.(ID,variable,nc)]
}))

  1. Last step is simple - just dcast back
dcast(res,ID~variable, value.var="nc")

Output

    ID A-D B-D C-D E-H F-H G-H
1: 1 -66 -65 -63 -33 2 -30
2: 2 -4 -3 -1 -4 -3 -1
3: 3 -4 -3 -1 34 -3 -1
4: 4 3 0 0 3 0 0
5: 5 3 3 3 3 47 3
6: 6 1 0 -4 1 0 -4
7: 7 0 -6 -2 0 -6 -2
8: 8 -8 -2 -5 -8 -2 -5
9: 9 -69 -78 -72 -69 -18 -72
10: 10 5 1 6 5 1 6

Iterative function to subtract columns from a specific column in a dataframe and have the values appear in a new column

This will do what you want. Notice that myfun treats the first column as special, as per your example.

# example data
df <- data.frame(
Sample = paste0("s00", 1:4),
g1 = 5:8,
g2 = 10:13,
g3 = 15:18,
g4 = 20:23,
g5 = 25:28,
stringsAsFactors = FALSE
)

# function to do what you want
myfun <- function(x, df) {
mat <- df[[x]] - as.matrix(df[ , names(df)[-1]]) #subtract all cols from x
colnames(mat) <- paste0(names(df)[-1], "dt") #give these new cols names
df <- cbind(df, mat) #add new cols to dataframe
df <- df[ , c(1, order(names(df)[-1])+1)] #reorder cols
return(df)
}

# test it
myfun("g3", df)

# result
Sample g1 g1dt g2 g2dt g3 g3dt g4 g4dt g5 g5dt
1 s001 5 10 10 5 15 0 20 -5 25 -10
2 s002 6 10 11 5 16 0 21 -5 26 -10
3 s003 7 10 12 5 17 0 22 -5 27 -10
4 s004 8 10 13 5 18 0 23 -5 28 -10

Subtract columns in R

You might want something like

# this is your data
df <- data.frame("2014"=c(100,54,200),
"2015"=c(200,2,221),
"2016"=c(342,523,225),
"2017"=c(532,121,229),
"2018"=c(65,200,250),
row.names=c("Brad","Tom","Uma"))

df
# X2014 X2015 X2016 X2017 X2018
# Brad 100 200 342 532 65
# Tom 54 2 523 121 200
# Uma 200 221 225 229 250

...and then you do this

df[,-1] - df[,-5]
# X2015 X2016 X2017 X2018
# Brad 100 142 190 -467
# Tom -52 521 -402 79
# Uma 21 4 4 21

The df[,-1] gives every column of your data.frame except the first one, and the df[,-5] gives every column except the fifth... from there, it's simple subtraction.

Subtracting values in one column based on two other columns in R

We can do a group by diff with a condition to check with an if/else

library(dplyr)
df %>%
group_by(site, ID) %>%
summarise(diff = if(length(cover) == 1) NA
else diff(cover), .groups = 'drop')

-output

# A tibble: 4 x 3
site ID diff
<chr> <chr> <dbl>
1 blue C NA
2 green B -5
3 red A -5
4 red B -5

OP's original code is not working because one of the grouping doesn't have both 'after' and 'before' in 'time'. We may need to specify an if/else condition for that

How do you subtract two columns in a data frame IF they exist in the data frame?

It's always easier with sample data, but let's make a little example here that matches the description of your data:

df <- data.frame(columnxbeg = 1:5, columnxend = 6:10,
columnybeg = 2:6, columnyend = 8:12)

df
#> columnxbeg columnxend columnybeg columnyend
#> 1 1 6 2 8
#> 2 2 7 3 9
#> 3 3 8 4 10
#> 4 4 9 5 11
#> 5 5 10 6 12

To do this in a single pipeline, we need to find the columns that have the suffix "beg" and the columns with the suffix "end", ensure they are in the correct order, subtract them, and bind them onto the existing data:

library(tidyverse)

df %>%
bind_cols(((df %>%
select(ends_with("beg")) %>%
select(order(names(.)))) -
(df %>%
select(ends_with("end")) %>%
select(order(names(.))))) %>%
rename_with(~str_replace(.x, "beg", "diff")))
#> columnxbeg columnxend columnybeg columnyend columnxdiff columnydiff
#> 1 1 6 2 8 -5 -6
#> 2 2 7 3 9 -5 -6
#> 3 3 8 4 10 -5 -6
#> 4 4 9 5 11 -5 -6
#> 5 5 10 6 12 -5 -6

This will work with any number of columns, as long as they have a consistent naming pattern.

Created on 2022-06-06 by the reprex package (v2.0.1)

subtract columns in the datatable in pairs

You can divide the dataframe in half and subtract the second part with the first one and assign new columns names.

n <- ncol(df)
col1 <- 1:(n/2)
col2 <- (n/2 + 1):n
new_col_name <- paste(names(df)[col2], names(df)[col1], sep = '-')
df[new_col_name] <- df[col2] - df[col1]
head(df)

# h1 w1 e1 h2 w2 e2 h2-h1 w2-w1 e2-e1
#1 49.43 149.6 150.2 49.39 149.4 150.1 -0.03665 -0.193458 -0.09741
#2 50.10 149.7 150.8 49.03 149.6 149.6 -1.07812 -0.053813 -1.25975
#3 50.05 149.8 150.7 48.42 149.8 151.0 -1.62448 -0.007319 0.32304
#4 49.77 149.7 148.8 49.92 148.7 149.1 0.15132 -1.005730 0.23139
#5 49.44 149.9 151.0 48.39 150.9 150.0 -1.04673 0.977863 -0.97748
#6 49.58 148.8 151.1 50.41 150.6 148.6 0.83088 1.800697 -2.52930


Related Topics



Leave a reply



Submit