Sum Columns by Group (Row Names) in a Matrix

Sum columns by group (row names) in a matrix

Here's a vectorized base R solution using rowsum:

rowsum(x, row.names(x))
#      Mon Tue Wed Thurs
# Cake   2   1   1     2
# Pie    0   0   3     3
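For a reproducible run, the sketch below builds a small matrix with duplicated row names and applies rowsum to it. The values in x are hypothetical, chosen only so that the totals match the output shown above; the original data is not given.

```r
# Hypothetical input: the original x is not shown, so these values are
# made up to reproduce the totals printed above.
x <- matrix(
  c(1, 0, 1, 1,
    1, 1, 0, 1,
    0, 0, 2, 1,
    0, 0, 1, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("Cake", "Cake", "Pie", "Pie"),
                  c("Mon", "Tue", "Wed", "Thurs"))
)

# Sum the rows that share a row name; groups come back in sorted order.
res <- rowsum(x, row.names(x))
print(res)
```

By default rowsum sorts the groups; passing reorder = FALSE keeps them in order of first appearance.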

Or a data.table version, using keep.rownames = TRUE to convert the row names to a column (named rn):

library(data.table)
as.data.table(x, keep.rownames = TRUE)[, lapply(.SD, sum), by = rn]
#      rn Mon Tue Wed Thurs
# 1: Cake   2   1   1     2
# 2:  Pie   0   0   3     3

row and column matrix sum in R by group

We can compute the sums with xtabs after truncating the dimnames to their first 4 characters with substr:

dimnames(m1) <- lapply(dimnames(m1), substr, 1, 4)
xtabs(Freq~ Var1 + Var2, as.data.frame.table(m1))
#      Var2
#Var1   UKC1 UKC2
#  UKC1   14   22
#  UKC2   46   54
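As a cross-check that avoids the long-format round trip, the same totals can be obtained by applying rowsum twice, once along each dimension. This is a sketch of an alternative, not the method used in the answer; row_grp, col_grp and by_row are illustrative names, and m1 is the matrix from this answer's data block written more compactly.

```r
# Same matrix as in the answer's data block, written compactly.
m1 <- matrix(1:16, nrow = 4, byrow = TRUE,
             dimnames = list(c("UKC1_SS1", "UKC1_SS2", "UKC2_SS1", "UKC2_SS2"),
                             c("UKC1_SS1", "UKC1_SS2", "UKC2_SS1", "UKC2_SS1.1")))

# Group labels: first 4 characters of the row and column names.
row_grp <- substr(rownames(m1), 1, 4)
col_grp <- substr(colnames(m1), 1, 4)

# Collapse rows by group, then transpose and collapse the original columns.
by_row <- rowsum(m1, row_grp)
res <- t(rowsum(t(by_row), col_grp))
print(res)
```

Unlike xtabs, this leaves the original dimnames of m1 untouched.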

data

m1 <- structure(c(1L, 5L, 9L, 13L, 2L, 6L, 10L, 14L, 3L, 7L, 11L, 15L, 
4L, 8L, 12L, 16L), .Dim = c(4L, 4L), .Dimnames = list(c("UKC1_SS1",
"UKC1_SS2", "UKC2_SS1", "UKC2_SS2"), c("UKC1_SS1", "UKC1_SS2",
"UKC2_SS1", "UKC2_SS1.1")))

Sum row-wise values that are grouped by column name but keep all columns in R?

You can try ave, using col and row to build the grouping variables:

> ave(myMat,colnames(myMat)[col(myMat)], row(myMat), FUN = sum)
     x  y x  y
[1,] 1  3 1  3
[2,] 5  9 5  9
[3,] 4 13 4 13
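To make the call above runnable, here is a minimal sketch with a made-up input. The values in myMat are hypothetical (the original matrix is not shown) and were picked so that the result matches the printed output.

```r
# Hypothetical input: columns named x, y, x, y (the original myMat is not shown).
myMat <- matrix(
  c(0, 1, 1, 2,
    2, 4, 3, 5,
    1, 6, 3, 7),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("x", "y", "x", "y"))
)

# One group per (column name, row) pair; every cell is replaced by its
# group sum, so the result keeps the shape and column names of myMat.
res <- ave(myMat, colnames(myMat)[col(myMat)], row(myMat), FUN = sum)
print(res)
```

Because ave returns an object of the same shape as its input, columns with the same name end up holding the same row-wise totals instead of being collapsed into one.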

Sum values in rows with same names in R

We can use rowsum. This assumes that the dataset shown is a matrix and not a data.frame, since a data.frame cannot have duplicated row names.

rowsum(df1, row.names(df1))

Or using aggregate

aggregate(df1, list(row.names(df1)), sum)
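The two calls return different shapes, which the following sketch makes visible using the df1 from this answer's data block. The names by_rowsum and by_aggregate are illustrative only.

```r
# df1 as defined in the answer's data block.
df1 <- structure(c(5L, 3L, 7L, 1L, 3L, 6L, 6L, 4L, 2L, 7L), .Dim = c(5L, 2L),
                 .Dimnames = list(c("bacteria", "bacteria", "bacteria",
                                    "archaea", "archaea"),
                                  c("category1", "category2")))

# rowsum returns a matrix with one row per distinct row name (sorted).
by_rowsum <- rowsum(df1, row.names(df1))

# aggregate returns a data.frame with the group label in column Group.1.
by_aggregate <- aggregate(df1, list(row.names(df1)), sum)

print(by_rowsum)
print(by_aggregate)
```

With this data, archaea sums to 4 and 9 and bacteria to 15 and 16 in both results; rowsum keeps the labels as row names, while aggregate moves them into a column.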

data

df1 <- structure(c(5L, 3L, 7L, 1L, 3L, 6L, 6L, 4L, 2L, 7L), .Dim = c(5L, 
2L), .Dimnames = list(c("bacteria", "bacteria", "bacteria", "archaea",
"archaea"), c("category1", "category2")))

How to calculate sum of values in each column based on row names in R?

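We can use colSums with startsWith to sum only the rows whose names begin with a given prefix (here "A"). Adding drop = FALSE keeps the subset a matrix even when only one row matches:

colSums(mat[startsWith(row.names(mat), "A"), , drop = FALSE])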
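A minimal sketch of this approach on a made-up matrix follows. The contents of mat are hypothetical, since the original is not shown. The second call is a possible extension rather than part of the answer: it groups by the first character of each row name with rowsum to get every prefix at once.

```r
# Hypothetical input: the original mat is not shown.
mat <- matrix(1:6, nrow = 3, byrow = TRUE,
              dimnames = list(c("A1", "A2", "B1"), c("c1", "c2")))

# Column sums over the rows whose names start with "A".
a_sums <- colSums(mat[startsWith(row.names(mat), "A"), , drop = FALSE])
print(a_sums)

# All prefixes at once: group rows by the first character of their name.
all_sums <- rowsum(mat, substr(row.names(mat), 1, 1))
print(all_sums)
```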

Calculate summary statistics for each row of a matrix based on columns grouped by column names

This is a natural use case for tapply:

tapply(t(data), list(col(data), array(colnames(data), dim(t(data)))), mean)
    A  B
1   3  8
2  13 18
3  23 28
4  33 38
5  43 48
6  53 58
7  63 68
8  73 78
9  83 88
10 93 98

tapply(data, list(t(colnames(data))[rep(1,nrow(data)), ], row(data)), mean)
  1  2  3  4  5  6  7  8  9 10
A 3 13 23 33 43 53 63 73 83 93
B 8 18 28 38 48 58 68 78 88 98

tapply(t(data), interaction(colnames(data), col(data)), mean)
 A.1  B.1  A.2  B.2  A.3  B.3  A.4  B.4  A.5  B.5  A.6  B.6  A.7  B.7  A.8  B.8  A.9  B.9 A.10 B.10
   3    8   13   18   23   28   33   38   43   48   53   58   63   68   73   78   83   88   93   98

More base R solutions:

sapply(split.default(data.frame(data), colnames(data)), rowMeans)
       A  B
 [1,]  3  8
 [2,] 13 18
 [3,] 23 28
 [4,] 33 38
 [5,] 43 48
 [6,] 53 58
 [7,] 63 68
 [8,] 73 78
 [9,] 83 88
[10,] 93 98

data.frame(data) |>
reshape(split(1:ncol(data), colnames(data)), dir = 'long') |>
(\(x)aggregate(.~id, x, mean))()

   id time  A  B
1   1    3  3  8
2   2    3 13 18
3   3    3 23 28
4   4    3 33 38
5   5    3 43 48
6   6    3 53 58
7   7    3 63 68
8   8    3 73 78
9   9    3 83 88
10 10    3 93 98
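The input for this answer is not shown. The printed results are consistent with a 10 x 10 matrix holding 1 to 100 row by row, with five columns named A followed by five named B, so the sketch below assumes that reconstruction. It groups the column indices by name and takes row means within each group, a variant of the split-based solution that works on the matrix directly; idx and res are illustrative names.

```r
# Assumed reconstruction of the input, consistent with the printed output:
# 1..100 filled row by row, five columns named "A" then five named "B".
data <- matrix(1:100, nrow = 10, byrow = TRUE,
               dimnames = list(NULL, rep(c("A", "B"), each = 5)))

# Column indices grouped by column name, then row means within each group.
idx <- split(seq_len(ncol(data)), colnames(data))
res <- sapply(idx, function(j) rowMeans(data[, j, drop = FALSE]))
print(head(res, 3))
```

Swapping rowMeans for another row-wise function, such as an apply call with median or sd, gives other summary statistics with the same grouping.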

R sum rows of matrix by column name

You can use rowsum with the column names as the grouping variable. Since rowsum groups rows, transpose first and transpose the result back:

t(rowsum(t(z), colnames(z)))

#      a  b  c
#[1,]  8 20  9
#[2,] 11  7  3
#[3,]  8 18  8
#[4,]  8 11 10
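A small runnable sketch of the transpose, rowsum, transpose pattern is given below. The matrix z here is hypothetical and smaller than the one behind the output above, which is not shown.

```r
# Hypothetical input: the original z is not shown.
z <- matrix(c(1, 2, 3, 4,
              5, 6, 7, 8),
            nrow = 2, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "a", "c")))

# rowsum groups rows, so transpose first, group by the column names,
# then transpose back to get one column per distinct name.
res <- t(rowsum(t(z), colnames(z)))
print(res)
```

The two columns named a are added together row by row, while b and c pass through unchanged.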

Row-wise sum of values grouped by columns with same name

We can transpose dat, calculate rowsum per group (the column names of the original dat), then transpose the result back to the original structure.

t(rowsum(t(dat), group = colnames(dat), na.rm = TRUE))
#  A C G T
#1 1 0 1 0
#2 4 0 6 0
#3 0 1 0 1
#4 2 0 1 0
#5 1 0 1 0
#6 0 1 0 1
#7 0 1 0 1
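The effect of na.rm is easiest to see on an input that contains missing values. The sketch below uses a hypothetical dat (the original is not shown) and compares the result with and without na.rm = TRUE.

```r
# Hypothetical input with missing values; the original dat is not shown.
dat <- matrix(c(1, NA, 2, 3,
                0, 1, 4, NA),
              nrow = 2, byrow = TRUE,
              dimnames = list(NULL, c("A", "G", "A", "G")))

# With na.rm = TRUE, missing values are dropped before summing.
res <- t(rowsum(t(dat), group = colnames(dat), na.rm = TRUE))

# Without it, any group containing an NA sums to NA.
res_na <- t(rowsum(t(dat), group = colnames(dat)))

print(res)
print(res_na)
```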

