Sum Rows in Data.Frame or Matrix

How to sum rows by rows?

This is pretty similar to Jaap's comment, but a little more spelled out and uses the row names explicitly:

mat = as.matrix(dat[, 2:5])
row.names(mat) = dat$MUN
mat = rbind(mat, colSums(mat[c("Angra dos Reis (RJ)", "Areal (RJ)"), ], na.rm = TRUE))
row.names(mat)[nrow(mat)] = "X"
mat
#                         X1990 X1991 X1992 X1993
# Angra dos Reis (RJ)        11    10    10    10
# Aperibé (RJ)               NA    NA    NA    NA
# Araruama (RJ)           12040 14589 14231 14231
# Areal (RJ)                 NA    NA    NA     3
# Armação dos Búzios (RJ)    NA    NA    NA    NA
# X                          11    10    10    13

The result is a matrix; you can convert it back to a data frame if needed:

dat_result = data.frame(MUN = row.names(mat), mat, row.names = NULL)

I dislike the format of your data as a data frame. I would either convert it to a matrix (as above) or convert it to long format with, e.g., tidyr::gather(dat, key = year, value = value, -MUN) (superseded by tidyr::pivot_longer in current tidyr) and work with it "by group" using data.table or dplyr.
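The "by group" idea can also be sketched in base R with rowsum(), which sums the rows of a matrix within each level of a grouping vector. This is a swapped-in technique, not the tidyr/dplyr route named above. The grouping vector `group` and the label "X" are assumptions made for illustration, and the matrix repeats three rows of the example data.

```r
# Three rows of the example data as a numeric matrix with row names
mat <- matrix(
  c(   11,    10,    10,    10,
    12040, 14589, 14231, 14231,
       NA,    NA,    NA,     3),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("Angra dos Reis (RJ)", "Araruama (RJ)", "Areal (RJ)"),
    c("X1990", "X1991", "X1992", "X1993")
  )
)

# Hypothetical grouping: the two municipalities to merge get the label "X",
# every other row keeps its own name as its group
to_merge <- c("Angra dos Reis (RJ)", "Areal (RJ)")
group <- ifelse(rownames(mat) %in% to_merge, "X", rownames(mat))

# Sum rows within each group, ignoring NAs
by_group <- rowsum(mat, group, na.rm = TRUE)
by_group
```

Unlike the rbind() approach, this replaces the merged rows with a single "X" row instead of appending one, so it fits best when the original rows are no longer needed.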


Using this data:

dat = read.table(text = "             MUN          X1990  X1991  X1992 X1993
1 'Angra dos Reis (RJ)' 11 10 10 10
2 'Aperibé (RJ)' NA NA NA NA
3 'Araruama (RJ)' 12040 14589 14231 14231
4 'Areal (RJ)' NA NA NA 3
5 'Armação dos Búzios (RJ)' NA NA NA NA", header = TRUE)

Summing Entries in Multiple Unequally-Sized Data Frames With Some (but not All) Rows and Columns the Same

I think this should work. With row AND column names and one data type, I prefer matrices to data frames, but you can convert the final matrix back to a data frame if you need.

# put things in a list
df_list = list(df1, df2, df3)

# get the complete set of row and column names
all_rows = unique(unlist(lapply(df_list, rownames)))
all_cols = unique(unlist(lapply(df_list, colnames)))

# initialize a final matrix to NA
final_mat = matrix(NA, nrow = length(all_rows), ncol = length(all_cols))
rownames(final_mat) = all_rows
colnames(final_mat) = all_cols

# go through each df in the list
for (i in seq_along(df_list)) {
  rows = rownames(df_list[[i]])
  cols = colnames(df_list[[i]])
  # set any NAs in the selection to 0
  sel = final_mat[rows, cols, drop = FALSE]
  sel[is.na(sel)] = 0
  # add the data frame to the selection
  final_mat[rows, cols] = sel + as.matrix(df_list[[i]])
}

final_mat
#        A  B  D  C  E  F
# row1   1  7  4  1  2 NA
# row2   2  4  5 NA NA NA
# row3  15 28  6  2  3  2
# row4   4  6  7 NA NA NA
# row5   5 13  8  3  4 NA
# row6   6  8  9 NA NA NA
# row7   7 16 10  4  5 NA
# row8   8 10 11 NA NA NA
# row9  19 27 12 NA NA  4
# row10 21 29 13 NA NA  3
# row11 NA  8 NA  5  6 NA
# row12 13 19 NA NA NA  1
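Since df1, df2 and df3 are not shown here, the following is a minimal runnable sketch of the same idea on two small made-up data frames. The helper name `sum_by_names` and the data frames `a` and `b` are hypothetical; the logic follows the loop above.

```r
# Hypothetical helper: sum a list of numeric data frames, matching cells
# by row name and column name
sum_by_names <- function(df_list) {
  all_rows <- unique(unlist(lapply(df_list, rownames)))
  all_cols <- unique(unlist(lapply(df_list, colnames)))
  # start with NA everywhere so cells no data frame covers stay NA
  out <- matrix(NA_real_, nrow = length(all_rows), ncol = length(all_cols),
                dimnames = list(all_rows, all_cols))
  for (d in df_list) {
    rows <- rownames(d)
    cols <- colnames(d)
    sel <- out[rows, cols, drop = FALSE]
    sel[is.na(sel)] <- 0            # first contribution to a cell starts from 0
    out[rows, cols] <- sel + as.matrix(d)
  }
  out
}

# Made-up inputs that overlap only in row2 / column B
a <- data.frame(A = c(1, 2), B = c(3, 4), row.names = c("row1", "row2"))
b <- data.frame(B = c(10, 20), C = c(5, 6), row.names = c("row2", "row3"))

res <- sum_by_names(list(a, b))
res
```

Only the overlapping cell (row2, B) is actually summed; cells present in one input are copied, and cells present in neither remain NA.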

Row Wise Addition in R

If the sum is across all columns, you can just use rowSums.

library(dplyr)

df %>%
  mutate(score = rowSums(df))

Or in base R

df$score <- rowSums(df)

Output

  all_d tit_for_tat perm_retal random joss tester tit_for_two score
1   200         199        199    102  199    198         198  1295
2   204         600        600    483  215    599         600  3301
3   204         600        600    113  208    202         600  2527
4   579         440        608    445  481    429         381  3363
5   204         250        245    417  201    251         538  2106
6   208         599        207    453  225    598         303  2593
7   208         600        600    610  639    798         600  4055
8  1807        3288       3059   2623 2168   3075        3220 19240
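rowSums needs numeric input, so when the data frame also holds non-numeric columns or missing values, one possible variation is to pick the numeric columns first and pass na.rm = TRUE. The data frame below and its column names are made up for illustration.

```r
# Made-up data: one character column, two numeric columns, one NA
df <- data.frame(id = c("a", "b"), x = c(1, NA), y = c(2, 3))

# Logical vector marking the numeric columns
num_cols <- vapply(df, is.numeric, logical(1))

# Row sums over the numeric columns only, skipping NAs
df$score <- rowSums(df[num_cols], na.rm = TRUE)
df
```

With na.rm = TRUE a missing value contributes nothing to its row's total; without it, that row's score would be NA.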

sum all rows pairwise in two data frames and save to matrix

Convert both data frames to matrices and add them.

Adding the two data frames directly also works.

df1 = data.frame(colA = c(30, 3, 15), colB = c(2, 100, 9))
df2 = data.frame(colA = c(10, 0, 55), colB = c(200, 10, 1))
as.matrix(df1)+ as.matrix(df2)
df1+df2

> as.matrix(df1)+ as.matrix(df2)
     colA colB
[1,]   40  202
[2,]    3  110
[3,]   70   10

> df1+df2
  colA colB
1   40  202
2    3  110
3   70   10
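For more than two data frames of the same shape, the same element-wise addition can be folded over a list with Reduce. This is a sketch that goes slightly beyond the two-frame case above; `df3` is a made-up third data frame.

```r
df1 <- data.frame(colA = c(30, 3, 15), colB = c(2, 100, 9))
df2 <- data.frame(colA = c(10, 0, 55), colB = c(200, 10, 1))
df3 <- data.frame(colA = c(1, 1, 1), colB = c(1, 1, 1))  # hypothetical extra frame

# Fold `+` over the list: ((df1 + df2) + df3)
total <- Reduce(`+`, list(df1, df2, df3))

# Convert to a matrix if that is the form needed downstream
total_mat <- as.matrix(total)
total_mat
```

The addition is positional, so all data frames must have the same dimensions and the same row and column order.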

calculate row sum and product in data.frame

Try

transform(df, sum=rowSums(df), prod=x*y*z)
#  x y z sum prod
#1 1 2 3   6    6
#2 2 3 4   9   24
#3 5 1 2   8   10

Or

transform(df, sum=rowSums(df), prod=Reduce(`*`, df))
#  x y z sum prod
#1 1 2 3   6    6
#2 2 3 4   9   24
#3 5 1 2   8   10

Another option is rowProds from the matrixStats package:

library(matrixStats)
transform(df, sum=rowSums(df), prod=rowProds(as.matrix(df)))

Or, using apply:

df[, c('sum', 'prod')] <- t(apply(df, 1, FUN=function(x) c(sum(x), prod(x))))
df
#  x y z sum prod
#1 1 2 3   6    6
#2 2 3 4   9   24
#3 5 1 2   8   10
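The input df is not shown above. The sketch below reconstructs it from the printed output (columns x, y and z) so that the Reduce variant can be run end to end; the reconstruction is an assumption.

```r
# Input reconstructed from the printed output above (an assumption)
df <- data.frame(x = c(1, 2, 5), y = c(2, 3, 1), z = c(3, 4, 2))

# Reduce multiplies the columns element-wise, giving one product per row
res <- transform(df, sum = rowSums(df), prod = Reduce(`*`, df))
res
```

The Reduce form does not hard-code the column names, so it keeps working if columns are added, as long as every column is numeric.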

Filtering out columns from an R data.frame based on the sum of its values for all rows

We can use select with a condition on the column sum, i.e. a column is selected only if its sum is greater than or equal to the threshold:

library(dplyr)
subDf <- df %>%
  select(where( ~ sum(.) >= pestCutoff))

NOTE: Here we assume that the condition applies to all columns and that all of them are numeric. If there are columns of other types that should not be selected, short-circuit the condition with is.numeric:

subDf <- df %>%
  select(where(~ is.numeric(.) && sum(.) >= pestCutoff))

Or in base R with Filter

Filter(function(x) sum(x) >= pestCutoff, df)

Or with colSums

df[colSums(df) >= pestCutoff]
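The two base R variants above, like the first dplyr one, assume every column is numeric. A minimal base R sketch of the short-circuit version, for a data frame that also has a non-numeric column, might look like this; the data, the column names and the cutoff value are made up for illustration.

```r
# Made-up data: two numeric pest counts and one character column
df <- data.frame(aphid = c(5, 10, 0), mite = c(0, 1, 0), site = c("a", "b", "c"))
pestCutoff <- 5

# TRUE only for numeric columns whose sum reaches the cutoff;
# && stops at is.numeric(), so sum() is never called on text
keep <- vapply(df, function(x) is.numeric(x) && sum(x) >= pestCutoff, logical(1))

subDf <- df[keep]
subDf
```

Here only the aphid column is kept: mite sums to 1, and site is skipped before sum() is ever called on it.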

