Subtract a column in a dataframe from many columns in R
If you need to subtract the second column from columns 3:ncol(df):
df[3:ncol(df)] <- df[3:ncol(df)]-df[,2]
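A minimal sketch of that one-liner on made-up data (the data frame and its column names are assumptions for illustration). It relies on R recycling the second column, whose length equals the number of rows, across every selected column:

```r
# Hypothetical data: column 2 is the baseline, columns 3 onward get adjusted
df <- data.frame(
  id   = c("a", "b", "c"),
  base = c(10, 20, 30),
  v1   = c(11, 22, 33),
  v2   = c(15, 25, 35)
)

cols <- 3:ncol(df)               # assumes df has at least three columns
df[cols] <- df[cols] - df[, 2]   # subtract column 2 from each selected column

print(df)
```

Note that 3:ncol(df) counts downwards if the data frame has fewer than three columns, so it may be worth checking ncol(df) first.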
How to subtract one column from multiple columns in a dataframe in R using dplyr
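This is a consequence of how mutate_at works: it updates the columns one at a time, so once India has been replaced by zeros, every column processed after it has zero subtracted. You could switch to across (as suggested by @RonakShah) and do: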
library(gapminder)
library(dplyr)
library(tidyr)
gapminder %>%
  select(country, year, gdpPercap) %>%
  pivot_wider(names_from = country, values_from = gdpPercap) %>%
  arrange(year) %>%
  mutate(across(-matches('year'), ~ . - India)) %>%
  select(year, India, Vietnam)
With mutate_at, you would need to make sure that the column used for the calculation is the last one in your data. You could use relocate to move it, like below:
gapminder %>%
  select(country, year, gdpPercap) %>%
  pivot_wider(names_from = country, values_from = gdpPercap) %>%
  arrange(year) %>%
  relocate(India, .after = last_col()) %>%
  mutate_at(vars(-matches('year')), ~ . - India) %>%
  select(year, India, Vietnam)
Output:
# A tibble: 12 x 3
    year India Vietnam
   <int> <dbl>   <dbl>
 1  1952     0    58.5
 2  1957     0    86.2
 3  1962     0    114.
 4  1967     0   -63.6
 5  1972     0   -24.5
 6  1977     0   -99.8
 7  1982     0   -148.
 8  1987     0   -156.
 9  1992     0   -175.
10  1997     0   -72.9
11  2002     0    17.7
12  2007     0   -10.6
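If you would rather leave the reference column untouched, one option is to exclude it from the selection, which also sidesteps any question about the order in which columns are updated. A small sketch on a made-up tibble (the values and the Nepal column are assumptions, not gapminder data):

```r
library(dplyr)

# Hypothetical wide table: one row per year, one column per country
wide <- tibble(
  year    = c(2002L, 2007L),
  India   = c(100, 200),
  Vietnam = c(150, 180),
  Nepal   = c(90, 260)
)

# Subtract India from every column except year and India itself
result <- wide %>%
  mutate(across(-c(year, India), ~ .x - India))

print(result)
```

Here India keeps its original values instead of becoming a column of zeros.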
Subtract multiple columns in the same data frame in R
Match the prefixes for the data and the subtraction part, and then subtract:
subsel <- endsWith(names(mydata), "_c3")
prefix <- sub("_.+", "", names(mydata))
mydata - mydata[subsel][match(prefix, prefix[subsel])]
#  x1_c1 x2_c1 x3_c1 x4_c1 x1_c2 x2_c2 x3_c2 x4_c2 x1_c3 x2_c3 x3_c3 x4_c3
#1     0     0     0     0    -1    -2    -3    -4     0     0     0     0
#2     0     0     0     0    -2    -3    -4    -5     0     0     0     0
#3     0     0     0     0    -3    -4    -5    -6     0     0     0     0
#4     0     0     0     0    -4    -5    -6    -7     0     0     0     0
#5     0     0     0     0    -5    -6    -7    -8     0     0     0     0
Or if you want to live on the edge and you are sure your data is complete and sorted as expected:
mydata - as.matrix(mydata[,endsWith(names(mydata), "_c3")])
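To see what the prefix matching does, here is a sketch on a small made-up data frame (the original mydata is not shown, so these values are assumptions). It also checks that the shortcut gives the same answer when the columns are complete and in the expected order:

```r
# Hypothetical data: two variables (x1, x2) measured under c1, c2 and c3
mydata <- data.frame(
  x1_c1 = c(10, 20), x2_c1 = c(30, 40),
  x1_c2 = c(5, 6),   x2_c2 = c(7, 8),
  x1_c3 = c(1, 2),   x2_c3 = c(3, 4)
)

subsel <- endsWith(names(mydata), "_c3")   # TRUE for the columns to subtract
prefix <- sub("_.+", "", names(mydata))    # "x1", "x2", "x1", ...

# For each column, index of the "_c3" column that shares its prefix
idx <- match(prefix, prefix[subsel])

by_prefix <- mydata - mydata[subsel][idx]

# Shortcut: the "_c3" block is recycled column-wise across the data frame
by_recycling <- mydata - as.matrix(mydata[, subsel])

print(by_prefix)
```

The matched version does not depend on column order, whereas the recycling version silently gives wrong results if a column is missing or out of place.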
R: How to repeatedly subtract specific columns from different series of columns, and output to a new dataframe?
Probably others have better ways - but here is one possibility.
- load two libraries and set dfOld to data.table
library(data.table)
library(magrittr)
setDT(dfOld)
- get information about the columns, and make into a list.
lv = names(dfOld)[-1][seq(1,ncol(dfOld)-1)%%4>0]
lv = split(lv, ceiling(seq_along(lv)/3))
names(lv) = names(dfOld)[-1][seq(1,ncol(dfOld)-1)%%4==0]
lv looks like this:
> lv
$D
[1] "A" "B" "C"
$H
[1] "E" "F" "G"
- This is a bit convoluted, but basically, I'm taking each of the elements of the lv list and reshaping the columns from dfOld so I can do all the subtractions at once. Then I'm retaining only the variables I need, and binding the resulting list of data.tables into a single data.table using rbindlist
res = rbindlist(lapply(names(lv), function(x) {
  melt(dfOld, id = c("ID", x), measure.vars = lv[[x]]) %>%
    .[, `:=`(nc = value - get(x), variable = paste0(variable, "-", x))] %>%
    .[, .(ID, variable, nc)]
}))
- Last step is simple - just dcast back
dcast(res,ID~variable, value.var="nc")
Output
    ID A-D B-D C-D E-H F-H G-H
 1:  1 -66 -65 -63 -33   2 -30
 2:  2  -4  -3  -1  -4  -3  -1
 3:  3  -4  -3  -1  34  -3  -1
 4:  4   3   0   0   3   0   0
 5:  5   3   3   3   3  47   3
 6:  6   1   0  -4   1   0  -4
 7:  7   0  -6  -2   0  -6  -2
 8:  8  -8  -2  -5  -8  -2  -5
 9:  9 -69 -78 -72 -69 -18 -72
10: 10   5   1   6   5   1   6
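For comparison, the same grouping list can drive a plain base R version that skips the melt/dcast round trip. This is an alternative sketch, not the data.table method above, and the dfOld values below are made up; it assumes, as the answer does, that every fourth column after ID is the one to subtract from the three before it:

```r
# Hypothetical input: ID, then blocks of three columns followed by a reference
dfOld <- data.frame(
  ID = 1:3,
  A = c(1, 2, 3), B = c(4, 5, 6), C = c(7, 8, 9), D = c(10, 10, 10),
  E = c(2, 4, 6), F = c(1, 1, 1), G = c(0, 5, 9), H = c(1, 2, 3)
)

vars   <- names(dfOld)[-1]
is_ref <- seq_along(vars) %% 4 == 0          # every 4th column is a reference

# Named list: reference column -> the three columns it is subtracted from
lv <- split(vars[!is_ref], ceiling(seq_along(vars[!is_ref]) / 3))
names(lv) <- vars[is_ref]

# Subtract each reference column from its block and label the results
diffs <- lapply(names(lv), function(ref) {
  out <- dfOld[lv[[ref]]] - dfOld[[ref]]
  names(out) <- paste0(lv[[ref]], "-", ref)
  out
})

dfNew <- cbind(dfOld["ID"], do.call(cbind, diffs))
print(dfNew)
```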
R - Subtract the same value from multiple columns
Maybe try this using across():
library(dplyr)
#Data (random, so your values will differ unless you set a seed)
likert_data <- data.frame(id = 1:10,
                          a = sample(x = 1:5, size = 10, replace = TRUE),
                          b = sample(x = 1:5, size = 10, replace = TRUE),
                          c = sample(x = 1:5, size = 10, replace = TRUE)
)
#Code
new <- likert_data %>% mutate(across(a:c, ~ . - 3))
Output:
   id  a  b  c
1   1  2 -2  1
2   2  2 -2  0
3   3 -2 -1 -2
4   4  0  0 -1
5   5  2 -2  2
6   6  1  2  2
7   7  0  0  1
8   8  0  1 -2
9   9  0 -1 -1
10 10  2  0 -2
Subtracting columns of data frame by name
Try this base R solution without a loop. Just keep the column positions in mind:
#Data
df <- as.data.frame(matrix(seq(1,20,1),nrow=4)) #filled column-wise, so X1 is 1:4
colnames(df) <- c("X1","X2","X3","X4","X5")
rownames(df) <- as.Date(c("2020-01-02","2020-01-03","2020-01-04","2020-01-05"))
#Set columns for difference
df[,2:5] <- df[,2:5]-df[,1]
Output:
           X1 X2 X3 X4 X5
2020-01-02  1  4  8 12 16
2020-01-03  2  4  8 12 16
2020-01-04  3  4  8 12 16
2020-01-05  4  4  8 12 16
Or a more sophisticated way would be:
#Create index (run this on the original df, not the already-modified one)
#Var to subtract
i1 <- which(names(df)=='X1')
#Vars to have X1 subtracted from them
i2 <- which(names(df)!='X1')
#Compute
df[,i2]<-df[,i2]-df[,i1]
Output:
           X1 X2 X3 X4 X5
2020-01-02  1  4  8 12 16
2020-01-03  2  4  8 12 16
2020-01-04  3  4  8 12 16
2020-01-05  4  4  8 12 16
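The name-based variant can also be written with setdiff(), which avoids tracking positions altogether. A short sketch on a fresh, made-up data frame (the ref and others variable names are assumptions for illustration):

```r
# Hypothetical data with the same shape as the example: X1 is the reference
df <- data.frame(X1 = 1:4, X2 = 5:8, X3 = 9:12)

ref    <- "X1"                       # column to subtract
others <- setdiff(names(df), ref)    # every other column, by name

# Subtract the reference column from all the others in one step
df[others] <- df[others] - df[[ref]]

print(df)
```

Because the assignment overwrites df in place, running it a second time subtracts X1 again, so start from the original data each time.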
Subtract a column of dates with other columns in R
library(tidyverse)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#> date, intersect, setdiff, union
df <- read_table("PID Sub D1 D2 D3
123 2015-02-26 2018-04-26 2015-02-26 2014-05-29
345 2014-03-11 2019-05-18 NA 2012-08-11
678 2016-01-22 2017-11-20 2016-01-21 NA
987 2020-06-15 NA 2018-08-19 2019-01-15")
df %>%
  mutate(across(D1:D3, ~ .x - Sub))
#> # A tibble: 4 × 5
#>     PID Sub        D1        D2        D3
#>   <dbl> <date>     <drtn>    <drtn>    <drtn>
#> 1   123 2015-02-26 1155 days    0 days -273 days
#> 2   345 2014-03-11 1894 days   NA days -577 days
#> 3   678 2016-01-22  668 days   -1 days   NA days
#> 4   987 2020-06-15   NA days -666 days -517 days
Created on 2022-06-29 by the reprex package (v2.0.1)
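If you want plain numbers of days rather than difftime columns, the same subtraction can be done in base R. This sketch reuses the first two rows of the Sub, D1 and D2 columns from the example above; the date_cols name is an assumption for illustration:

```r
# First two rows of the example, built directly as Date columns
df <- data.frame(
  Sub = as.Date(c("2015-02-26", "2014-03-11")),
  D1  = as.Date(c("2018-04-26", "2019-05-18")),
  D2  = as.Date(c("2015-02-26", NA))
)

date_cols <- c("D1", "D2")

# Date - Date gives a difftime; convert it to a plain number of days
df[date_cols] <- lapply(df[date_cols], function(d) {
  as.numeric(d - df$Sub, units = "days")
})

print(df)
```

Missing dates stay NA, just as in the difftime output.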
Iterative function to subtract columns from a specific column in a dataframe and have the values appear in a new column
This will do what you want. Notice that myfun treats the first column as special, as per your example.
# example data
df <- data.frame(
  Sample = paste0("s00", 1:4),
  g1 = 5:8,
  g2 = 10:13,
  g3 = 15:18,
  g4 = 20:23,
  g5 = 25:28,
  stringsAsFactors = FALSE
)
# function to do what you want
myfun <- function(x, df) {
  mat <- df[[x]] - as.matrix(df[ , names(df)[-1]])  # subtract each column from x
  colnames(mat) <- paste0(names(df)[-1], "dt")      # give these new cols names
  df <- cbind(df, mat)                              # add new cols to dataframe
  df <- df[ , c(1, order(names(df)[-1]) + 1)]       # reorder cols
  return(df)
}
# test it
myfun("g3", df)
# result
  Sample g1 g1dt g2 g2dt g3 g3dt g4 g4dt g5 g5dt
1   s001  5   10 10    5 15    0 20   -5 25  -10
2   s002  6   10 11    5 16    0 21   -5 26  -10
3   s003  7   10 12    5 17    0 22   -5 27  -10
4   s004  8   10 13    5 18    0 23   -5 28  -10