Mutate Multiple/Consecutive Columns (With Dplyr or Base R)

Here is one way with the package zoo:

library(zoo)
t(rollapply(t(df), width = 10, by = 10, function(x) sum(x)/10))

Here is one way to do it with base R:

splits <- 1:100
dim(splits) <- c(10, 10)
splits <- split(splits, col(splits))
results <- do.call("cbind", lapply(splits, function(x) data.frame(rowSums(df[,x] / 10))))
names(results) <- paste0("wave_", 1:10)
results

Another very succinct way with base R (courtesy of G.Grothendieck):

t(apply(df, 1, tapply, gl(10, 10), mean))
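To see what the tapply one-liner is doing, here is a small self-contained check. The toy data (3 rows, 100 columns named X1..X100) is my assumption about the layout; the real df is not shown in the question.

```r
# Toy data: 3 rows, 100 numeric columns named X1..X100 (assumed layout)
set.seed(1)
df <- as.data.frame(matrix(rnorm(300), nrow = 3))
names(df) <- paste0("X", 1:100)

# gl(10, 10) is a factor with ten levels, each repeated ten times,
# so columns 1-10 fall in block 1, columns 11-20 in block 2, and so on
block_means <- t(apply(df, 1, tapply, gl(10, 10), mean))

# Cross-check the first block against a direct rowMeans() call
direct_first <- rowMeans(df[, 1:10])
all.equal(unname(block_means[, 1]), unname(direct_first))
```

The result is a matrix with one row per original row and one column per block of ten.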

And here is a solution with dplyr and tidyr:

library(dplyr)
library(tidyr)
df$row <- 1:nrow(df)
df2 <- df %>% gather(column, value, -row)
df2$column <- cut(as.numeric(gsub("X", "", df2$column)),breaks = c(0:10*10))
df2 <- df2 %>% group_by(row, column) %>% summarise(value = sum(value)/10)
df2 %>% spread(column, value) %>% select(-row)
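Since gather() and spread() are superseded in current tidyr, the same idea could probably be written with pivot_longer() and pivot_wider(). This is only a sketch: the toy data, the wave_ prefix, and the integer-division grouping are my choices, not part of the original answer.

```r
library(dplyr)
library(tidyr)

# Toy data: 3 rows, 100 numeric columns named X1..X100 (assumed layout)
set.seed(1)
df <- as.data.frame(matrix(rnorm(300), nrow = 3))
names(df) <- paste0("X", 1:100)

wave_means <- df %>%
  mutate(row = row_number()) %>%
  pivot_longer(-row, names_to = "column", values_to = "value") %>%
  # map column index 1..100 to wave 1..10 (ten columns per wave)
  mutate(wave = (as.integer(sub("X", "", column)) - 1L) %/% 10L + 1L) %>%
  group_by(row, wave) %>%
  summarise(value = mean(value), .groups = "drop") %>%
  pivot_wider(names_from = wave, values_from = value, names_prefix = "wave_") %>%
  select(-row)

wave_means
```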

Mutate multiple / consecutive columns (with dplyr)

If we are using rowSums, it can be used directly within mutate. Also, because this computes the sum on each row, no group_by is needed for it. The distinct part without .keep_all = TRUE returns only the distinct rows of the 'DESCRIPTION' column.

library(dplyr)
df1 %>%
  mutate(Total = rowSums(.[4:17], na.rm = TRUE)) %>%
  group_by(`ITEM#`) %>%
  mutate(Total = sum(Total, na.rm = TRUE))

NOTE: Judging by the 'DESCRIPTION' column in the image, all the elements are unique, so distinct is not needed.
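As the original data is only available as an image, here is a minimal sketch on made-up data showing the two steps (row totals first, then the per-item total). The column names q1 and q2 and all values are hypothetical; across(q1:q2) stands in for the positional .[4:17] selection.

```r
library(dplyr)

# Hypothetical stand-in for df1: an item id plus two numeric columns
df1 <- tibble(
  `ITEM#` = c("A", "A", "B"),
  q1 = c(1, NA, 3),
  q2 = c(2, 5, NA)
)

out <- df1 %>%
  # row-wise total, ignoring missing values; no grouping needed here
  mutate(Total = rowSums(across(q1:q2), na.rm = TRUE)) %>%
  # per-item total of those row totals
  group_by(`ITEM#`) %>%
  mutate(Total = sum(Total, na.rm = TRUE)) %>%
  ungroup()

out
```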

Mutate multiple columns with conditions using dplyr

Although I prefer a solution with all variables in one column, as suggested by @Patrick (I would use something like %>% mutate(new_col = case_when(etc...))), here is a way with a for loop:

library(dplyr)

# I changed your data a tiny bit
df <- tibble("a" = sample(1990:2000, size = 10),  # better to use 'sample' than 'runif'!
             "event" = 1995) %>% mutate("relative_event" = a - event)

Now the actual work

# The index is the difference in years, so run it from the lowest
# difference to the highest.
for (i in min(df$relative_event):max(df$relative_event)) {
  if (i < 0) {
    df[[paste0('event_b', abs(i))]] <- ifelse(i == df$relative_event, 1, 0)
  } else {
    df[[paste0('event_f', abs(i))]] <- ifelse(i == df$relative_event, 1, 0)
  }
}
df

# A tibble: 10 x 14
       a event relative_event event_b5 event_b4 event_b3 event_b2 event_b1
   <int> <dbl>          <dbl>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
 1  1990  1995             -5        1        0        0        0        0
 2  1992  1995             -3        0        0        1        0        0
 3  1991  1995             -4        0        1        0        0        0
 4  2000  1995              5        0        0        0        0        0
 5  1998  1995              3        0        0        0        0        0
 6  1993  1995             -2        0        0        0        1        0
 7  1996  1995              1        0        0        0        0        0
 8  1997  1995              2        0        0        0        0        0
 9  1994  1995             -1        0        0        0        0        1
10  1999  1995              4        0        0        0        0        0
# ... with 6 more variables: event_f0 <dbl>, event_f1 <dbl>, event_f2 <dbl>,
#   event_f3 <dbl>, event_f4 <dbl>, event_f5 <dbl>

If you don't want to run through every possible difference in years (which creates 'empty' columns for differences that never occur), you could simply create a vector with unique(df$relative_event) and run i through this vector.
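A minimal sketch of that unique() variant, using a small fixed data frame instead of sample() so the result is reproducible. The values are made up, and I sort the differences only so the columns come out in order.

```r
# Small fixed example (made-up years) so the result is deterministic
df <- data.frame(a = c(1993, 1995, 1998, 1993), event = 1995)
df$relative_event <- df$a - df$event

# Loop only over the differences that actually occur
for (i in sort(unique(df$relative_event))) {
  prefix <- if (i < 0) "event_b" else "event_f"
  df[[paste0(prefix, abs(i))]] <- as.numeric(df$relative_event == i)
}

names(df)
```

Here only event_b2, event_f0 and event_f3 are created, because -2, 0 and 3 are the only differences present.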

Mutating multiple columns in a data frame using dplyr

You are really close.

df2 <- 
    df %>% 
    mutate(v1v3 = v1 * v3,
           v2v4 = v2 * v4)

Such a beautifully simple language, right?

For more great tricks please see here.

EDIT:
Thanks to @Facottons' pointer to this answer: https://stackoverflow.com/a/34377242/5088194, here is a tidy approach to resolving this issue. It saves you from having to hard-code a line for each new column desired. While it is a bit more verbose than the base R approach, the logic is at least more immediately transparent/readable. It is also worth noting that there must be at least half as many rows as there are columns for this approach to work.

library(dplyr)
library(tidyr)

# prep the product column names (also acting as row numbers)
df <- 
    df %>%
    mutate(prod_grp = paste0("v", row_number(), "v", row_number() + 2)) 

# converting data to tidy format and pairing columns to be multiplied together.
tidy_df <- 
    df %>%
    gather(column, value, -prod_grp) %>% 
    mutate(column = as.numeric(sub("v", "", column)),
           pair = column - 2) %>% 
    mutate(pair = if_else(pair < 1, pair + 2, pair))

# summarize the products for each column
prod_df <- 
    tidy_df %>% 
    group_by(prod_grp, pair) %>% 
    summarize(val = prod(value)) %>% 
    spread(prod_grp, val) %>% 
    mutate(pair = paste0("v", pair, "v", pair + 2)) %>% 
    rename(prod_grp = pair)

# put the original frame and summary frames together
final_df <- 
    df %>% 
    left_join(prod_df) %>% 
    select(-prod_grp)
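For comparison, a base R sketch of the same pairing idea (v1 with v3, v2 with v4) might look like the following. The toy data frame and its values are made up for illustration; only the column-naming pattern comes from the code above.

```r
# Made-up data with four numeric columns v1..v4
df <- data.frame(v1 = c(1, 2, 3), v2 = c(4, 5, 6),
                 v3 = c(7, 8, 9), v4 = c(10, 11, 12))

left  <- c("v1", "v2")   # first factor of each product
right <- c("v3", "v4")   # second factor, paired by position

# Element-wise product of the two column blocks, assigned to new columns
df[paste0(left, right)] <- df[left] * df[right]

df
```

Extending this to more pairs only means lengthening left and right, so no line has to be written per new column.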

How to select consecutive columns with across function dplyr

In across(), there are two basic arguments. The first argument is the set of columns to be modified, while the second argument is the function to apply to those columns. In addition, vars() is no longer needed to select the variables. Thus, the correct form is:

d %>%
 mutate(across(V1:V4, ~ replace(., is.na(.), 0)))

   V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1   2  6  0  6  5  6 10  5  3   1
2   2  9  2  4 10  6  9  4 NA  NA
3   5  5  3  0  3  7  1  5  9   5
4   7  1  1  6  2  1  8 NA  8   4
5   3  5  3  0  2  3  4  2  3  NA
6   0 10  0  2  5 10  1 10  4   3
7   4  3 10  6 NA  5  9  3  3   9
8   9  9  8  5  8  1  3  1 NA  10
9   6  3  0  1  1  9  3  5  8   4
10  3  2  9  1  5  2  4 NA  6   1
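A reproducible sketch of the same call on a small made-up data frame (the original d is random, so I am not trying to recreate it). It shows that only the columns in the V1:V4 range are touched.

```r
library(dplyr)

# Made-up data with NAs inside and outside the V1:V4 range
d <- data.frame(
  V1 = c(2, NA, 5),
  V2 = c(NA, 9, 5),
  V3 = c(1, 2, NA),
  V4 = c(6, NA, 0),
  V5 = c(NA, 10, 3)
)

out <- d %>%
  mutate(across(V1:V4, ~ replace(.x, is.na(.x), 0)))

out
```

V5 keeps its NA because it is outside the selected range.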

Mutate across multiple columns using dplyr

Two possibilities using dplyr:

library(dplyr)

mtcars %>% 
  rowwise() %>% 
  mutate(varmean = mean(c_across(mpg:vs)))

This returns

# A tibble: 32 x 12
# Rowwise: 
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb varmean
   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>   <dbl>
 1  21       6  160    110  3.9   2.62  16.5     0     1     4     4    40.0
 2  21       6  160    110  3.9   2.88  17.0     0     1     4     4    40.1
 3  22.8     4  108     93  3.85  2.32  18.6     1     1     4     1    31.7
 4  21.4     6  258    110  3.08  3.22  19.4     1     0     3     1    52.8
 5  18.7     8  360    175  3.15  3.44  17.0     0     0     3     2    73.2
 6  18.1     6  225    105  2.76  3.46  20.2     1     0     3     1    47.7
 7  14.3     8  360    245  3.21  3.57  15.8     0     0     3     4    81.2
 8  24.4     4  147.    62  3.69  3.19  20       1     0     4     2    33.1
 9  22.8     4  141.    95  3.92  3.15  22.9     1     0     4     2    36.7
10  19.2     6  168.   123  3.92  3.44  18.3     1     0     4     4    42.8
# ... with 22 more rows

and without rowwise(), using base R's rowMeans():

mtcars %>% 
  mutate(varmean = rowMeans(across(mpg:vs)))

returns

                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb  varmean
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 39.99750
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 40.09938
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 31.69750
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 52.76687
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 73.16375
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 47.69250
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 81.24000
Merc 240D           24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2 33.12250
Merc 230            22.8   4 140.8  95 3.92 3.150 22.90  1  0    4    2 36.69625
Merc 280            19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4 42.80750
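A quick way to convince yourself that both variants agree is to compare the resulting columns directly. This is just a sanity check on mtcars; the names a and b are arbitrary.

```r
library(dplyr)

# Variant 1: rowwise() + c_across()
a <- mtcars %>%
  rowwise() %>%
  mutate(varmean = mean(c_across(mpg:vs))) %>%
  ungroup()

# Variant 2: vectorised rowMeans() over the same columns
b <- mtcars %>%
  mutate(varmean = rowMeans(across(mpg:vs)))

all.equal(unname(a$varmean), unname(b$varmean))
```

Note that rowwise() returns a tibble and so drops the row names, while the rowMeans() version keeps the original data frame.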

R: mutate over multiple columns to create a new column

library(dplyr)

Test2 <- Test %>%
  dplyr::select(starts_with("Test")) %>%
  mutate_all(function(x) x %in% c("DF60", "DF61", "DF62", "DF63")) %>%
  mutate(out = ifelse(rowSums(.) < 1, 0, 1))

Adjustment after comment

If you want to keep other columns, mutate_at, as proposed by yutannihilation, is far better. The problem then becomes doing the rowSums in mutate on a selection of the columns. No idea if the following is best practice, but it works (reworked from an answer to a previous question of mine: dplyr mutate on column subset (one function on all these columns combined)).

library(tidyverse)
library(anomalyDetection)

Test1<-c("DF64", "DF63", "DF89", "DF30", "DF70")
Test2<-c("DF61", "DF25", "DF00", "DF30", "DF99")
Test3<-c("DF80", "DF63", "DF60", "DF63", "DF70")
Test<-data.frame(Test1, Test2, Test3)

Test$ExtraCol<-LETTERS[1:5]

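# funs() is deprecated in current dplyr; a named list with a formula
# gives the same Test1_bin, Test2_bin, Test3_bin columns
Test2 <- Test %>%
  mutate_at(vars(starts_with("Test")), list(bin = ~ . %in% c("DF60", "DF61", "DF62", "DF63"))) %>%
  split(., 1 < 10) %>%
  map_df(~ mutate(., out = rowSums(.[paste0("Test", 1:3, "_bin")]) > 0))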

  Test1 Test2 Test3 ExtraCol Test1_bin Test2_bin Test3_bin   out
   DF64  DF61  DF80        A     FALSE      TRUE     FALSE  TRUE
   DF63  DF25  DF63        B      TRUE     FALSE      TRUE  TRUE
   DF89  DF00  DF60        C     FALSE     FALSE      TRUE  TRUE
   DF30  DF30  DF63        D     FALSE     FALSE      TRUE  TRUE
   DF70  DF99  DF70        E     FALSE     FALSE     FALSE FALSE
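With dplyr >= 1.0.0 the same flag can probably be computed in one mutate(), without the intermediate _bin columns or the split/map_df step. This is a sketch using across() inside rowSums(); the helper name codes is mine.

```r
library(dplyr)

Test <- data.frame(
  Test1 = c("DF64", "DF63", "DF89", "DF30", "DF70"),
  Test2 = c("DF61", "DF25", "DF00", "DF30", "DF99"),
  Test3 = c("DF80", "DF63", "DF60", "DF63", "DF70"),
  ExtraCol = LETTERS[1:5]
)

# Codes to look for (same set as above)
codes <- c("DF60", "DF61", "DF62", "DF63")

result <- Test %>%
  # TRUE when at least one Test* column holds one of the codes
  mutate(out = rowSums(across(starts_with("Test"), ~ .x %in% codes)) > 0)

result
```

ExtraCol is kept untouched because starts_with("Test") only selects the three Test columns.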

Sum across multiple columns with dplyr

dplyr >= 1.0.0 using across

sum up each row using rowSums (rowwise works for any aggregation, but is slower)

df %>%
   replace(is.na(.), 0) %>%
   mutate(sum = rowSums(across(where(is.numeric))))

sum down each column

df %>%
   summarise(across(everything(), ~ sum(., na.rm = TRUE)))
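A small worked example of both directions, on made-up data with missing values. Here I pass na.rm = TRUE instead of replacing NAs first, and restrict to numeric columns with where(is.numeric) so the character column g does not break sum().

```r
library(dplyr)

# Made-up data: two numeric columns with NAs and one character column
df <- tibble(x = c(1, NA, 3), y = c(NA, 2, 4), g = c("a", "b", "c"))

# Sum across each row
row_sums <- df %>%
  mutate(sum = rowSums(across(where(is.numeric)), na.rm = TRUE))

# Sum down each numeric column
col_sums <- df %>%
  summarise(across(where(is.numeric), ~ sum(.x, na.rm = TRUE)))

row_sums
col_sums
```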

dplyr < 1.0.0

sum up each row

df %>%
   replace(is.na(.), 0) %>%
   mutate(sum = rowSums(.[1:5]))

sum down each column using superseded summarise_all:

df %>%
   replace(is.na(.), 0) %>%
   summarise_all(sum)