Summing across rows of a data.table for specific columns
[Edited 2020-02-15 to reflect the current state of data.table.]
In recent versions of data.table,
rowSums(Abundance[ , 4:6])
works as OP originally expected. Here are some alternatives:
Abundance[, SumAbundance := rowSums(.SD), .SDcols = 4:6]
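I haven't benchmarked it, but I suspect the following is faster, because Reduce adds the columns directly instead of first converting .SD to a matrix, as rowSums does. Note that the `+` operator propagates NA, so any row with a missing value gets NA: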
Abundance[, SumAbundance := Reduce(`+`, .SD), .SDcols = 4:6]
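As a quick check of the two alternatives above, here is a minimal sketch on a made-up table. The column names and values are assumptions for illustration (only the positions 4:6 come from the answer). With no NA values present, both approaches should give the same sums.

```r
library(data.table)

# Hypothetical data: three descriptive columns followed by three numeric columns (positions 4:6)
Abundance <- data.table(
  site = c("s1", "s2", "s3"),
  year = c(2019L, 2019L, 2020L),
  plot = c("a", "b", "a"),
  sp1  = c(1, 4, 0),
  sp2  = c(2, 5, 3),
  sp3  = c(3, 6, 7)
)

# rowSums converts .SD to a matrix internally
Abundance[, sum_rowSums := rowSums(.SD), .SDcols = 4:6]

# Reduce adds the columns pairwise: (sp1 + sp2) + sp3
Abundance[, sum_reduce := Reduce(`+`, .SD), .SDcols = 4:6]

print(Abundance)
```

Which one is faster will depend on the number of rows and columns, so it is worth timing both on your own data.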
Summing across rows of a data.table for specific columns with NA
There are several options: either compute rowSums with na.rm = TRUE first and then replace the result with NA in the rows where every value is NA, or build an index in i so that the sum is computed only for rows with at least one non-NA value.
library(data.table)
TEST[, SumAbundance := replace(rowSums(.SD, na.rm = TRUE),
    Reduce(`&`, lapply(.SD, is.na)), NA), .SDcols = 4:6]
Or, a slightly more compact option:
TEST[, SumAbundance := (NA^!rowSums(!is.na(.SD))) *
    rowSums(.SD, na.rm = TRUE), .SDcols = 4:6]
Or wrap the logic in a function and reuse it:
rowSums_new <- function(dat) {
  # NA when every value in the row is NA, otherwise the sum of the non-NA values
  fifelse(rowSums(is.na(dat)) != ncol(dat), rowSums(dat, na.rm = TRUE), NA_real_)
}
TEST[, SumAbundance := rowSums_new(.SD), .SDcols = 4:6]
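To see how these options treat partially and fully missing rows, here is a small sketch. The TEST table below is invented for illustration (the original data is not shown), with the numeric columns in positions 4:6 as in the answer.

```r
library(data.table)

# Hypothetical data: row 1 is complete, row 2 is partly NA, row 3 is all NA
TEST <- data.table(
  id  = 1:3,
  grp = c("a", "b", "c"),
  yr  = c(2019L, 2020L, 2021L),
  x   = c(1, NA, NA),
  y   = c(2, 5, NA),
  z   = c(3, NA, NA)
)

# Option 1: sum with na.rm = TRUE, then blank out the all-NA rows
TEST[, s1 := replace(rowSums(.SD, na.rm = TRUE),
    Reduce(`&`, lapply(.SD, is.na)), NA), .SDcols = 4:6]

# Option 2: NA^TRUE is NA and NA^FALSE is 1, so all-NA rows are multiplied by NA
TEST[, s2 := (NA^!rowSums(!is.na(.SD))) *
    rowSums(.SD, na.rm = TRUE), .SDcols = 4:6]

print(TEST)
# s1 and s2 are both 6, 5, NA
```

A plain rowSums(.SD, na.rm = TRUE) would return 0 for the third row, which is what these variants avoid.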
Row sums over columns with a certain pattern in their name
You can also use Reduce:
DT[, Sum := Reduce(`+`, .SD), .SDcols=listCol][]
# ref nb i1 i2 i3 i4 Sum
#1: 3 12 0.000031 0.000183 0.000824 0.044495 0.045533
#2: 3 13 0.044495 0.155732 0.533939 0.822440 1.556606
#3: 3 14 0.822440 0.873416 0.838542 0.322291 2.856689
#4: 3 15 0.322291 0.648545 0.990648 0.393595 2.355079
NOTE: If there are NA values, replace them with 0 before calling Reduce, i.e.
DT[, Sum := Reduce(`+`, lapply(.SD, function(x) replace(x,
which(is.na(x)), 0))), .SDcols=listCol][]
Another solution, using rowSums:
DT[, Sum := rowSums(.SD, na.rm = TRUE), .SDcols = grep("i", names(DT))]
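One thing to watch with name patterns: an unanchored pattern such as "i" matches any column whose name contains that letter. The sketch below uses an invented table (the id column and all values are assumptions) to show an anchored pattern, and how the rowSums and Reduce variants differ when an NA is present.

```r
library(data.table)

# Hypothetical table: "id" also contains the letter i
DT <- data.table(id = 1:2, i1 = c(1, 2), i2 = c(10, NA), i3 = c(100, 200))

loose   <- grep("i", names(DT), value = TRUE)          # id, i1, i2, i3
listCol <- grep("^i[0-9]+$", names(DT), value = TRUE)  # i1, i2, i3 only

# rowSums with na.rm = TRUE skips the NA in row 2
DT[, Sum := rowSums(.SD, na.rm = TRUE), .SDcols = listCol]

# Reduce with `+` propagates the NA in row 2
DT[, SumReduce := Reduce(`+`, .SD), .SDcols = listCol]

print(DT)
```

Anchoring the pattern with ^ and $ keeps unrelated columns out of the sum if the table later gains columns with similar names.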
data.table sum of all columns by group
I think the code you're looking for is likely:
TestData[, .(a = sum(.SD)), by = .(id, year), .SDcols = Kattegori_Henter("Medicine")]
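Here is a minimal sketch of what that call does. The table and the cols vector are made up, with cols standing in for whatever Kattegori_Henter("Medicine") returns (assumed here to be a character vector of column names).

```r
library(data.table)

# Hypothetical data: two groups, two "medicine" columns
TestData <- data.table(
  id    = c(1, 1, 2),
  year  = c(2000, 2000, 2001),
  med_a = c(1, 2, 3),
  med_b = c(10, 20, 40)
)

# Stand-in for Kattegori_Henter("Medicine"); assumed to return column names
cols <- c("med_a", "med_b")

# sum(.SD) adds up every cell of the selected columns within each group
res <- TestData[, .(a = sum(.SD)), by = .(id, year), .SDcols = cols]
print(res)
```

This gives one total per group across all selected columns. If you want one sum per column instead, lapply(.SD, sum) in j is the usual form.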
R data.table calculate sum of other rows
cols = c('high', 'low')
lapply(
  seq_len(nrow(df)),
  \(i) matrix(c(unlist(df[i, cols]), colSums(df[-i, cols])), nrow = 2, byrow = TRUE)
)
[[1]]
     [,1] [,2]
[1,]   73   77
[2,]  200  218

[[2]]
     [,1] [,2]
[1,]  113  155
[2,]  160  140

[[3]]
     [,1] [,2]
[1,]   87   63
[2,]  186  232
Data
df = data.frame(genotypes = c('A|A', 'A|G', 'G|G'), high = c(73, 113, 87), low = c(77, 155, 63))
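If you only need the "sum of the other rows" part, a vectorized alternative (a different technique from the lapply above, not the original answer's method) is to subtract each row from the column totals. A minimal sketch using the same data:

```r
df <- data.frame(genotypes = c('A|A', 'A|G', 'G|G'),
                 high = c(73, 113, 87), low = c(77, 155, 63))
cols <- c('high', 'low')

# Column totals over all rows
totals <- colSums(df[cols])

# For each row, total minus that row's own value = sum of the other rows
others <- sweep(-as.matrix(df[cols]), 2, totals, `+`)
print(others)
```

Row i of others matches the second row of the i-th matrix in the lapply output, and this avoids recomputing colSums once per row.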
Sum rows in columns with column names ending with specific character string (R)
You can use select to pick the columns whose names end with "zscore" and pass them to rowSums:
library(dplyr)
df1 %>%
  group_by(a) %>%
  mutate(across(b:d, list(zscore = ~as.numeric(scale(.))))) %>%
  ungroup %>%
  mutate(total = rowSums(select(., ends_with('zscore'))))
# A tibble: 30 x 8
# a b c d b_zscore c_zscore d_zscore total
# <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 a 7.17 14.8 8.45 0.697 0.101 0.0179 0.816
# 2 a 7.42 19.7 3.97 0.841 1.17 -1.14 0.865
# 3 a 5.78 19.2 9.66 -0.108 1.05 0.332 1.28
# 4 a 5.09 17.7 12.8 -0.508 0.732 1.14 1.36
# 5 a 7.21 12.9 6.24 0.721 -0.329 -0.555 -0.163
# 6 a 2.36 13.7 2.50 -2.09 -0.146 -1.52 -3.76
# 7 a 7.26 10.9 10.7 0.749 -0.774 0.593 0.567
# 8 a 5.45 6.18 12.8 -0.302 -1.80 1.14 -0.965
# 9 b 5.43 18.2 9.55 -0.445 1.12 1.34 2.02
#10 b 4.16 12.1 4.11 -1.06 0.0776 -1.02 -2.01
# … with 20 more rows
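The row-sum step can also be written with across() instead of select(., ...). Here is a minimal sketch on a made-up tibble (the column values and the other column are assumptions, and the zscore columns are typed in directly rather than computed with scale):

```r
library(dplyr)

# Hypothetical data with two columns ending in "zscore" and one unrelated column
df1 <- tibble(
  a        = c("x", "x", "y"),
  b_zscore = c(1, 2, 3),
  c_zscore = c(0.5, -0.5, 1),
  other    = c(9, 9, 9)
)

# across() returns the selected columns as a tibble, which rowSums accepts
res <- df1 %>%
  mutate(total = rowSums(across(ends_with("zscore"))))

print(res)
```

Only the columns ending in "zscore" are summed, so other is left out of total.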