Rowsums But Keeping Na Values

rowSums but keeping NA values

If you have a variable number of columns you could try this approach:

mm <- merge(dd1,dd2)
mm$m <- rowSums(mm, na.rm=TRUE) * ifelse(rowSums(is.na(mm)) == ncol(mm), NA, 1)
# or, as @JoshuaUlrich commented:
#mm$m <- ifelse(apply(is.na(mm),1,all),NA,rowSums(mm,na.rm=TRUE))
tail(mm, 10)
#                  dd1        dd2        m
#2013-08-02        NA         NA       NA
#2013-08-03        NA         NA       NA
#2013-08-04        NA         NA       NA
#2013-08-05 1.2542692 -1.2542692 0.000000
#2013-08-06        NA  1.3325804 1.332580
#2013-08-07        NA  0.7726740 0.772674
#2013-08-08 0.8158402 -0.8158402 0.000000
#2013-08-09        NA  1.2292919 1.229292
#2013-08-10        NA         NA       NA
#2013-08-11        NA  0.9334900 0.933490
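
For readers without the original dd1 and dd2 objects, the same idea can be sketched on a small made-up data frame. The name toy, its columns x and y, and the values are assumptions for illustration only:

```r
# Hypothetical data: row 2 is partly NA, row 3 is entirely NA
toy <- data.frame(x = c(1, NA, NA), y = c(2, 3, NA))

# Plain rowSums with na.rm = TRUE turns the all-NA row into 0
plain <- rowSums(toy, na.rm = TRUE)

# Multiply by NA for all-NA rows (and by 1 otherwise) to restore the NA
all_na <- rowSums(is.na(toy)) == ncol(toy)
kept <- plain * ifelse(all_na, NA, 1)

print(plain)  # values 3 3 0
print(kept)   # values 3 3 NA
```

Because the check compares against ncol(toy), it should not need changing when columns are added or removed.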

rowSums with all NA

Here is one option:

rowSums(df, na.rm = TRUE) * NA ^ (rowSums(!is.na(df)) == 0)
# [1]  2  2 NA  1  3  1

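This relies on the fact that anything raised to the power 0 is 1 in R, including NA: NA ^ FALSE is 1, while NA ^ TRUE is NA. Rows with no non-NA values are therefore multiplied by NA, and all other rows by 1.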
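
The behaviour of NA under exponentiation can be checked directly. The snippet below is a small sketch on made-up data (toy and its columns a and b are assumptions, not the data from the question):

```r
# An exponent of 0 (or FALSE) gives 1; any other exponent leaves NA as NA
stopifnot(NA^0 == 1, NA^FALSE == 1, is.na(NA^1), is.na(NA^TRUE))

# Hypothetical data: only row 2 has no non-NA values
toy <- data.frame(a = c(1, NA, 2), b = c(1, NA, NA))
no_values <- rowSums(!is.na(toy)) == 0

res <- rowSums(toy, na.rm = TRUE) * NA^no_values
print(res)  # values 2 NA 2
```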

error in calculating rowsum of column having NA values

To restrict the sum to specific columns, use select inside rowSums:

library(dplyr)

df %>% mutate(x1 = ifelse(is.na(T_1_1) & is.na(S_2_1),NA,
                       rowSums(select(., c(T_1_1,S_2_1)),na.rm = TRUE)))

#  T_1_1 T_1_2 T_1_3 S_2_1 S_2_2 S_2_3 T_1_0  x1
#1    68    26    93    69    87   150    79 137
#2    NA    NA    32    67    67     0     0  67
#3     0     0    NA    94    NA    NA     0  94
#4   105    73   103     0   120   121    NA 105
#5    NA    NA    NA    NA    NA    NA    98  NA
#6     0    97     0   136   122    78    NA 136
#7   135    46   147    NA     0   109    15 135
#8    NA    NA    NA    92    NA    NA    NA  92
#9    24     0   139    73    79     0     2  97

Mutate row sum but only if NA count is 2 or less

In base R, we can use rowSums twice: once to sum the values in each row, and once to count the number of NAs in each row.

ifelse(rowSums(is.na(df[-1])) <= 2, rowSums(df[-1], na.rm = TRUE), NA)
#[1] NA 17 29 NA  3 NA
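
If the threshold needs to vary, the same logic could be wrapped in a small function. This is only a sketch: row_sum_limit_na and its max_na argument are hypothetical names, and toy is made-up data shaped like the example above.

```r
# Hypothetical helper: sum each row, but return NA when the row has
# more than `max_na` missing values
row_sum_limit_na <- function(x, max_na = 2) {
  n_missing <- rowSums(is.na(x))
  totals <- rowSums(x, na.rm = TRUE)
  totals[n_missing > max_na] <- NA
  totals
}

# Made-up data: v1 is a label column, so it is dropped with [-1]
toy <- data.frame(v1 = c("A", "B"),
                  v2 = c(4, NA), v3 = c(7, 8), v4 = c(NA, 3),
                  v5 = c(NA, 3), v6 = c(NA, 3))

res <- row_sum_limit_na(toy[-1])
print(res)  # values NA 17
```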

With dplyr, the same can be done row-wise:

library(dplyr)
df %>%
  rowwise() %>%
  mutate(col = ifelse(sum(is.na(c_across(v2:v6))) <= 2, 
                      sum(c_across(v2:v6), na.rm = TRUE), NA))

# A tibble: 6 x 7
#  v1       v2    v3    v4    v5    v6   col
#  <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 A         4     7    NA    NA    NA    NA
#2 B        NA     8     3     3     3    17
#3 C         5     9     5     5     5    29
#4 D         6    NA    NA    NA    NA    NA
#5 E        NA    NA     1     1     1     3
#6 F        NA    NA     4    NA     4    NA

Shortened the code using ifelse suggestion from @rpolicastro.

summing across rows, leaving NAs in R

Use this to get the row totals, then cbind the result to your data frame.

apply(df, 1, function(x) {
  # Return NA only when every value in the row is missing
  if (all(is.na(x))) {
    NA
  } else {
    sum(x, na.rm = TRUE)
  }
})
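
The cbind step is not shown above, so here is one way it might look. The data frame toy and the column name total are assumptions chosen for illustration:

```r
# Hypothetical data: row 2 is entirely NA
toy <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 4))

# Row totals that stay NA only when the whole row is NA
total <- apply(toy, 1, function(x) {
  if (all(is.na(x))) NA else sum(x, na.rm = TRUE)
})

# Attach the totals as a new column
result <- cbind(toy, total = total)
print(result)
```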

ignore NA in dplyr row sum

You could use this:

library(dplyr)
data %>% 
  #rowwise will make sure the sum operation will occur on each row
  rowwise() %>% 
  #then a simple sum(..., na.rm=TRUE) is enough to result in what you need
  mutate(sum = sum(a,b,c, na.rm=TRUE))

Output:

Source: local data frame [4 x 4]
Groups: <by row>

      a     b     c   sum
  (dbl) (dbl) (dbl) (dbl)
1     1     4     7    12
2     2    NA     8    10
3     3     5     9    17
4     4     6    NA    10

RowSums NA + NA gives 0

One option is to compute rowSums with na.rm = TRUE and multiply the result by NA^!rowSums(!is.na(df)). Here rowSums(!is.na(df)) counts the non-NA values in each row, and negating that count with ! gives TRUE only for rows that are entirely NA. Since NA^TRUE is NA and NA^FALSE is 1, only the all-NA rows are turned into NA.

rowSums(df, na.rm=TRUE) *NA^!rowSums(!is.na(df))
#[1]  2 NA 10

Sum of two Columns of Data Frame with NA Values

dat$e <- rowSums(dat[,c("b", "c")], na.rm=TRUE)
dat
#   a  b c d e
# 1 1  2 3 4 5
# 2 5 NA 7 8 7
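
To see why rowSums is used here instead of the + operator, the two can be compared on data rebuilt from the output above (the values of dat are read off that printout):

```r
# Data reconstructed from the printed output above
dat <- data.frame(a = c(1, 5), b = c(2, NA), c = c(3, 7), d = c(4, 8))

# With +, a single NA makes the whole sum NA
plus_sum <- dat$b + dat$c

# With rowSums and na.rm = TRUE, the NA is skipped
row_sum <- rowSums(dat[, c("b", "c")], na.rm = TRUE)

print(plus_sum)  # values 5 NA
print(row_sum)   # values 5 7
```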

Filter data.frame with all columns NA but keep when some are NA

We can use base R

teste[rowSums(!is.na(teste)) >0,]
#   a  b c
#1  1 NA 1
#3  3  3 3
#4 NA  4 4

Or using apply and any

teste[apply(!is.na(teste), 1, any),]

The same rowSums condition can also be used within dplyr's filter (after loading dplyr):

teste %>%
      filter(rowSums(!is.na(.)) >0)

Or using c_across from dplyr, we can directly remove the rows with all NA

library(dplyr)
teste %>% 
    rowwise %>% 
    filter(!all(is.na(c_across(everything()))))
# A tibble: 3 x 3
# Rowwise: 
#      a     b     c
#  <dbl> <dbl> <dbl>
#1     1    NA     1
#2     3     3     3
#3    NA     4     4

NOTE: filter_all() and the other scoped verbs are superseded in current dplyr, so the approaches above are preferable for new code.
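
As one possible replacement for filter_all(), dplyr 1.0.4 and later provide if_any(). A minimal sketch, assuming data reconstructed from the output above in which row 2 is entirely NA:

```r
library(dplyr)

# Data rebuilt to match the printed output; row 2 (all NA) is an assumption
teste <- data.frame(a = c(1, NA, 3, NA),
                    b = c(NA, NA, 3, 4),
                    c = c(1, NA, 3, 4))

# Keep rows where at least one column is not NA
kept <- teste %>%
  filter(if_any(everything(), ~ !is.na(.x)))

print(kept)
```

Unlike the rowwise version, this keeps the result an ordinary (non-rowwise) data frame.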

How to keep only max value of row and convert other value to NA?

We can use apply to loop over the rows (MARGIN = 1), replace the values that are not equal to the row maximum with NA, and assign the transposed result back to the original object.

df[] <- t(apply(df, 1, function(x) replace(x, x != max(x, na.rm = TRUE), NA)))
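
A small worked example of the apply approach, on made-up data (toy and its column names are assumptions):

```r
# Hypothetical data: row 1 has its maximum in col2, row 2 in col1
toy <- data.frame(col1 = c(1, 5), col2 = c(4, 2), col3 = c(3, NA))

# apply returns one column per input row, so transpose before assigning back
toy[] <- t(apply(toy, 1, function(x) {
  replace(x, x != max(x, na.rm = TRUE), NA)
}))

print(toy)
# Row 1 keeps only 4 (col2); row 2 keeps only 5 (col1)
```

Note that tied maxima within a row would all be kept, since every value equal to the maximum passes the comparison.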

Or with rowMaxs

library(matrixStats)
i1 <- !!rowSums(!is.na(df))
df[i1,] <-  replace(df[i1,], df[i1,] != rowMaxs(as.matrix(df[i1,]), 
                na.rm = TRUE)[col(df[i1,])], NA)

Or using dplyr

library(dplyr)
library(purrr)
df %>% 
  mutate(new = reduce(., pmax, na.rm = TRUE)) %>% 
  transmute_at(vars(starts_with('col')), ~ replace(., .!= new, NA))
