Sum Across Multiple Columns With Dplyr

Sum across multiple columns with dplyr

dplyr >= 1.0.0 using across

sum up each row using rowSums (rowwise works for any aggreation, but is slower)

df %>%
   replace(is.na(.), 0) %>%
   mutate(sum = rowSums(across(where(is.numeric))))

sum down each column

df %>%
   summarise(across(everything(), ~ sum(., is.na(.), 0)))

dplyr < 1.0.0

sum up each row

df %>%
   replace(is.na(.), 0) %>%
   mutate(sum = rowSums(.[1:5]))

sum down each column using superseeded summarise_all:

df %>%
   replace(is.na(.), 0) %>%
   summarise_all(funs(sum))

Conditional sum across multiple columns using dplyr?

Get the data in long format, for each location, season and Species sum the values and remove rows which have 0 values.

library(dplyr)

df %>%
  tidyr::pivot_longer(cols = cat:dog, names_to = 'Species') %>%
  group_by(location, season, Species) %>%
  summarise(value = sum(value)) %>%
  ungroup %>%
  filter(value > 0)

#  location season Species value
#   <chr>    <chr>  <chr>   <dbl>
# 1 A        2      cat         2
# 2 A        3      cat         1
# 3 A        3      dog         1
# 4 A        4      cat         1
# 5 A        4      dog         1
# 6 B        2      dog         1
# 7 B        3      cat         1
# 8 B        3      dog         1
# 9 C        1      cat         1
#10 C        2      cat         1
#11 C        2      dog         1
#12 D        4      cat         1
#13 D        4      dog         2

R: Summing a sequence of columns row-wise with dplyr

You can use rowSums to do that:

# option 1
df_abc %>% mutate(sum_1 = rowSums(.[3:6]))
# option 2
df_abc %>% mutate(sum_1 = rowSums(select(.,orfOiRFj:DJHhhjhF)))

The result:

# A tibble: 100 x 9
   FJDFjdfF FfdfFxfj orfOiRFj xDGHdj jfdIDFF DJHhhjhF KhjhjFlFLF IgiGJIJFG sum_1
      <int>    <int>    <int>  <int>   <int>    <int>      <int>     <int> <dbl>
 1        1        1        1      1       1        1          1         1     4
 2        2        2        2      2       2        2          2         2     8
 3        3        3        3      3       3        3          3         3    12
 4        4        4        4      4       4        4          4         4    16
 5        5        5        5      5       5        5          5         5    20
 6        6        6        6      6       6        6          6         6    24
 7        7        7        7      7       7        7          7         7    28
 8        8        8        8      8       8        8          8         8    32
 9        9        9        9      9       9        9          9         9    36
10       10       10       10     10      10       10         10        10    40
# ... with 90 more rows

sum of multiple columns using group_by function

You can use across() inside summarise instead of summarise_at.

From ?summarize_at we can see that:

Scoped verbs (_if, _at, _all) have been superseded by the use of across() in an existing verb. See vignette("colwise") for details.

library(dplyr)

CPUE_deelgebied %>% 
  group_by(Gebied, Datum) %>%
  summarise(across(c(`Som vangtuigen`, `Som van Aantal`), sum))

# A tibble: 24 x 4
# Groups:   Gebied [2]
   Gebied Datum      `Som vangtuigen` `Som van Aantal`
   <chr>  <date>                <dbl>            <dbl>
 1 Oost   2021-02-25              100             321.
 2 Oost   2021-03-25              100             267 
 3 Oost   2021-05-07               92             139 
 4 Oost   2021-06-02              100             245 
 5 Oost   2021-06-25               96             256.
 6 Oost   2021-07-23              100             213 
 7 Oost   2021-08-27              100             295.
 8 Oost   2021-09-24               96             335.
 9 Oost   2021-10-29               98             264 
10 Oost   2021-11-23              100             151.
# ... with 14 more rows

Add multiple columns with the same group and sum

If I have understood you well, this will solve your problem:

narc_auth_total <- 
  narc_auth %>% 
  group_by(Full.Name) %>% 
  summarise(
    `2019_words` = sum(`2019`),
    `2020_words` = sum(`2020`)
  ) %>% 
  left_join(totaltweetsyear, ., by = "Full.Name")

Mutate across multiple columns using dplyr

Two possibilities using dplyr:

library(dplyr)

mtcars %>% 
  rowwise() %>% 
  mutate(varmean = mean(c_across(mpg:vs)))

This returns

# A tibble: 32 x 12
# Rowwise: 
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb varmean
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4    40.0
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4    40.1
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1    31.7
 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1    52.8
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2    73.2
 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1    47.7
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4    81.2
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2    33.1
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2    36.7
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4    42.8
# ... with 22 more rows

and without rowwise() and using base Rs rowMeans():

mtcars %>% 
  mutate(varmean = rowMeans(across(mpg:vs)))

returns

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb  varmean
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 39.99750
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 40.09938
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 31.69750
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 52.76687
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 73.16375
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 47.69250
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 81.24000
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 33.12250
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2 36.69625
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 42.80750

summing multiple columns in an R data-frame quickly

Here's an alternative approach using tidyverse:

library(tidyverse)

# input columns of interest
cols = c("mpg", "cyl", "disp", "hp", "drat")

mtcars %>% 
  group_by(id = row_number()) %>%  # for each row
  nest(cols) %>%                   # nest selected columns
  mutate(SUM = map_dbl(data, sum)) # calculate the sum of those columns

# # A tibble: 32 x 3
#      id data               SUM
#   <int> <list>           <dbl>
# 1     1 <tibble [1 x 5]>  301.
# 2     2 <tibble [1 x 5]>  301.
# 3     3 <tibble [1 x 5]>  232.
# 4     4 <tibble [1 x 5]>  398.
# 5     5 <tibble [1 x 5]>  565.
# 6     6 <tibble [1 x 5]>  357.
# 7     7 <tibble [1 x 5]>  631.
# 8     8 <tibble [1 x 5]>  241.
# 9     9 <tibble [1 x 5]>  267.
# 10    10 <tibble [1 x 5]>  320.
# # ... with 22 more rows

The output here is a data frame containing the row id (id), the data used at each row (data) and the calculated sum (SUM).

You can get a vector of the calculated SUM if you add ... %>% pull(SUM).

Sum Across Multiple Columns With Dplyr