Sum non-NA elements only, but if all NA then return NA
Following the suggestions from other users, I will post the answer to my question. The solution was provided by @sandipan in the comments above:
As noted in the question, if you need to sum the values of a column that contains NAs, there are two good approaches:
1) using ifelse:
A[, (ifelse(all(is.na(col2)), col2[NA_integer_], sum(col2, na.rm = TRUE))),
  by = .(col1)]
2) define a function as suggested by @Frank:
suma = function(x) if (all(is.na(x))) x[NA_integer_] else sum(x, na.rm = TRUE)
A[, suma(col2), by = .(col1)]
Note that I added NA_integer_ as @Frank pointed out because I kept getting errors about the types.
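To see why the helper is needed, here is a minimal sketch in base R. The vectors col1 and col2 are made-up example data, and tapply stands in for the data.table grouping used above, so this only illustrates the behaviour and is not the original poster's table.

```r
# Helper from the answer: return a typed NA when every value is NA
suma <- function(x) if (all(is.na(x))) x[NA_integer_] else sum(x, na.rm = TRUE)

# Hypothetical example data: group "b" contains only NAs
col1 <- c("a", "a", "b", "b")
col2 <- c(1L, NA, NA, NA)

# Plain sum() with na.rm = TRUE turns an all-NA group into 0
plain <- tapply(col2, col1, sum, na.rm = TRUE)
print(plain)  # a = 1, b = 0

# The helper keeps NA for the all-NA group
res <- tapply(col2, col1, suma)
print(res)    # a = 1, b = NA
```

Indexing with x[NA_integer_] returns an NA of the same type as x, which is what avoids the type errors mentioned above.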
Sum/return NA when all values are NA
The function was checking for NA across all columns of the dataset at once; instead, the check should be done for each column. Here is an option with across:
library(dplyr)
names(y_true_test) <- grep("species", names(df), value = TRUE)
df %>%
group_by(group) %>%
summarise(across(everything(), ~ if(all(is.na(.x))) NA_real_ else
sqrt(sum((.x - y_true_test)^2, na.rm = TRUE)/n())/
(y_true_test[cur_column()]) * 100), .groups = 'drop')
-output
# A tibble: 1 × 4
group species_1 species_2 species_3
<dbl> <dbl> <dbl> <dbl>
1 1 43.0 28.9 NA
If we want to modify the OP's function
estimate <- function(df, y_true, narm = TRUE) {
  # flag the columns that are entirely NA
  i1 <- colSums(is.na(df)) == nrow(df)
  # use the y_true argument rather than the global y_true_test
  out <- sqrt(colSums((t(t(df) - y_true))^2,
                      na.rm = narm) / 3) / y_true * 100
  out[i1] <- NA
  out
}
-testing
> df %>%
+ group_by(group) %>%
+ group_modify( ~ as.data.frame.list(estimate(.,
y_true_test)))
# A tibble: 1 × 4
# Groups: group [1]
group species_1 species_2 species_3
<dbl> <dbl> <dbl> <dbl>
1 1 43.0 28.9 NA
Sum the last n non NA values in each column of a matrix in R
You can use apply with tail to sum the last non-NA values of each column (here, the last 3):
apply(x, 2, function(x) sum(tail(x[!is.na(x)], 3)))
#x1 x2 x3 x4 x5
#15 11 9 6 3
Classic case of `sum` returning NA because it doesn't sum NAs
Following Joshua Ulrich's comment, before saying that you have some overflow problem, you should answer these questions:
- How many elements are you summing? R can handle a BIG number of entries
- How big are the values in your vectors? Again, R can handle quite big numbers
- Are you summing integers or floats? If you are summing floating-point numbers, you can't have an integer overflow (floats are not integers)
- Do you have NAs in your data? If you sum anything with NAs present, the result will be NA, unless you handle it properly.
That said, some solutions:
- Use sum(..., na.rm = TRUE) to ignore NAs from your object (this is the simple solution)
- Sum only non-NA entries: sum(yourVector[!is.na(yourVector)]) (the not so simple one)
- If you are summing a column from a data frame, subset the data frame before summing: sum(subset(yourDataFrame, !is.na(columnToSum))$columnToSum) (this is like using a cannon to kill a mosquito)
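The points above can be checked quickly in a console. This is a minimal sketch with made-up vectors x and big; it contrasts an NA caused by missing data with an NA caused by a genuine integer overflow.

```r
# Made-up numeric vector with a missing value
x <- c(1.5, NA, 2.5)

na_sum    <- sum(x)                # NA, because an NA is present
clean_sum <- sum(x, na.rm = TRUE)  # 4
index_sum <- sum(x[!is.na(x)])     # 4, same result by subsetting

# Integer overflow is a different cause of NA: it only affects integer vectors
big <- c(.Machine$integer.max, 1L)
overflow <- suppressWarnings(sum(big))  # NA, normally with an overflow warning
as_double <- sum(as.numeric(big))       # 2147483648, no overflow for doubles

print(c(na_sum, clean_sum, index_sum))
print(overflow)
print(as_double)
```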
R: apply statement to take the sum of the number of non-NA values across multiple columns
Just use is.na and rowSums:
z <- rowSums(!is.na(y[,paste("diag", 1:11, sep="")]))
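For a concrete picture, here is a small sketch with a made-up data frame y that has only three diag columns instead of eleven. !is.na() yields a logical matrix, and rowSums counts the TRUE values in each row.

```r
# Hypothetical data: an id column plus three diagnosis columns
y <- data.frame(
  id    = 1:2,
  diag1 = c("a", NA),
  diag2 = c(NA, NA),
  diag3 = c("b", "c")
)

# Count the non-NA diagnosis values in each row
z <- rowSums(!is.na(y[, paste("diag", 1:3, sep = "")]))
print(z)  # row 1 has 2 non-NA values, row 2 has 1
```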
Count non-NA values by group
You can use this:
mydf %>% group_by(col_1) %>% summarise(non_na_count = sum(!is.na(col_2)))
# A tibble: 2 x 2
col_1 non_na_count
<fctr> <int>
1 A 1
2 B 2
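The same count can be sketched in base R without dplyr, in case the package is not available. The data frame below is made up to match the column names in the answer and is not the original poster's data.

```r
# Hypothetical data with the same column names as above
mydf <- data.frame(
  col_1 = c("A", "A", "B", "B"),
  col_2 = c(1, NA, 2, 3)
)

# Summing a logical vector counts its TRUE values, here once per group
counts <- tapply(!is.na(mydf$col_2), mydf$col_1, sum)
print(counts)  # A = 1, B = 2
```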
Create all possible combinations of non-NA values for each group ID
Grouped by 'ID', fill the other columns, ungroup to remove the group attribute, and keep the distinct rows:
library(dplyr)
library(tidyr)
DF %>%
group_by(ID) %>%
fill(everything(), .direction = 'updown') %>%
ungroup %>%
distinct(.keep_all = TRUE)
Or, alternatively:
DF %>%
group_by(ID) %>%
mutate(across(everything(), ~ replace(., is.na(.),
rep(.[!is.na(.)], length.out = sum(is.na(.))))))
Or based on the comments
DF %>%
group_by(ID) %>%
mutate(across(where(~ any(is.na(.))), ~ {
i1 <- is.na(.)
ind <- which(i1)
i2 <- !i1
if(i1[1] == 1) rep(.[i2], each = n()/sum(i2)) else
rep(.[i2], length.out = n())
})) %>%
ungroup %>%
distinct(.keep_all = TRUE)
-output
# A tibble: 6 x 5
ID Col1 Col2 Col3 Col4
<int> <int> <int> <int> <int>
1 1 6 10 15 20
2 1 5 10 15 20
3 2 17 25 21 34
4 2 13 25 21 34
5 2 17 25 35 40
6 2 13 25 35 40