How to Include NA in ifelse

How to include NA in ifelse?

You can't really compare NA with another value, so using == would not work. Consider the following:

NA == NA
# [1] NA

You can just change your comparison from == to %in%:

ifelse(is.na(test$time) | test$type %in% "A", NA, "1")
# [1] NA "1" NA "1"
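The question's test data frame is not shown, so the following is only a sketch with a hypothetical test, chosen to reproduce the output above. It contrasts == with %in% when type itself contains an NA.

```r
# Hypothetical data: the real `test` is not shown in the question.
test <- data.frame(
  time = c(10, 20, NA, 40),
  type = c("A", NA, "B", "B"),
  stringsAsFactors = FALSE
)

# With ==, the NA in `type` (row 2) makes the condition NA, so ifelse returns NA.
with_eq <- ifelse(is.na(test$time) | test$type == "A", NA, "1")

# With %in%, NA never matches "A", so the condition is FALSE and "1" is returned.
with_in <- ifelse(is.na(test$time) | test$type %in% "A", NA, "1")

print(with_eq)  # NA NA NA "1"
print(with_in)  # NA "1" NA "1"
```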

Regarding your other question,

"I could get this to work with my existing code if I could somehow change the result of is.na(test$type) to return FALSE instead of TRUE, but I'm not sure how to do that."

just use ! to negate the results:

!is.na(test$time)
# [1] TRUE TRUE FALSE TRUE

Ifelse statement order with is.na using R Dplyr Mutate

Comparing with == returns NA wherever the value is NA. When the test in the first ifelse is NA, ifelse returns NA for that element and the result of the nested ifelse is never used. For the nested ifelse to apply, the test must be FALSE.

p1$Value == 1
#[1] TRUE TRUE FALSE NA NA

A workaround would be to use %in% instead of == which returns FALSE for NA values.

p1$Value %in% 1
#[1] TRUE TRUE FALSE FALSE FALSE

library(dplyr)

p1 %>% mutate(NewCol = ifelse(Value %in% 1, "Test1Yes",
                       ifelse(is.na(Value), "TestYes",
                       ifelse(Value %in% 0, "Test0Yes", "No"))))

#  col1 Value   NewCol
#1 var1     1 Test1Yes
#2 var2     1 Test1Yes
#3 var3     0 Test0Yes
#4 var4    NA  TestYes
#5 var5    NA  TestYes

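You can also get the desired behaviour using case_when instead of nested ifelse. Here == is safe because case_when treats a condition that evaluates to NA as not matching and moves on to the next one.

p1 %>%
  mutate(NewCol = case_when(Value == 1 ~ "Test1Yes",
                            is.na(Value) ~ "TestYes",
                            Value == 0 ~ "Test0Yes",
                            TRUE ~ "No"))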

#  col1 Value   NewCol
#1 var1     1 Test1Yes
#2 var2     1 Test1Yes
#3 var3     0 Test0Yes
#4 var4    NA  TestYes
#5 var5    NA  TestYes
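To see the difference without dplyr, here is a minimal base R sketch. It rebuilds p1 from the printed output above (the original construction is not shown) and runs the same nesting once with == and once with %in%.

```r
# p1 rebuilt from the printed output above; the original construction is not shown.
p1 <- data.frame(
  col1 = paste0("var", 1:5),
  Value = c(1, 1, 0, NA, NA),
  stringsAsFactors = FALSE
)

# Nested ifelse with ==: the first test is NA for the NA rows, so they stay NA.
with_eq <- ifelse(p1$Value == 1, "Test1Yes",
           ifelse(is.na(p1$Value), "TestYes",
           ifelse(p1$Value == 0, "Test0Yes", "No")))

# Same nesting with %in%: the first test is FALSE for NA, so is.na() decides.
with_in <- ifelse(p1$Value %in% 1, "Test1Yes",
           ifelse(is.na(p1$Value), "TestYes",
           ifelse(p1$Value %in% 0, "Test0Yes", "No")))

print(with_eq)  # "Test1Yes" "Test1Yes" "Test0Yes" NA NA
print(with_in)  # "Test1Yes" "Test1Yes" "Test0Yes" "TestYes" "TestYes"
```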

How to handle NAs in ifelse when creating new column

Just to explain why your version does not work: NA == NA is not TRUE, it's NA. Conceptually this makes sense: usually we want to know if two values are the same, and if we don't know one or both of them, we don't know if they are the same or not. To test if a value is NA you need to use the function is.na(). Here's a simple version:

df_addvar3 <- df %>%
  mutate(var3 = ifelse(is.na(var1), var2, var1))

Your question was not quite clear what you want to happen if the values are different from -1:1, or if var1 and var2 are both not NA, but different from one another. All of these should be relatively simple to add if necessary.
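As a rough illustration of the fallback rule, here is a base R sketch on a hypothetical df (the question's data is not shown). Where both columns hold a value, var1 wins. Where both are NA, the result stays NA.

```r
# Hypothetical data; the question's `df` is not shown.
df <- data.frame(
  var1 = c(1, NA, -1, NA),
  var2 = c(NA, 0, 1, NA)
)

# Take var1 where it is present, otherwise fall back to var2.
df$var3 <- ifelse(is.na(df$var1), df$var2, df$var1)

print(df$var3)  # 1 0 -1 NA
```

dplyr's coalesce(var1, var2) expresses the same first-non-missing rule, if you prefer to avoid ifelse here.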

R handling NA values while doing a comparison ifelse

Please see the following SO post: How to ignore NA in ifelse statement

With respect to your question:

df$counting <- ifelse(df$age > 5 & df$age < 8 & !is.na(df$age), 1, 0) + ifelse(df$marks > 60 & df$marks < 70, 1, 0)
> df
  sex occupation age marks counting
1   M    Student  NA    34        0
2   F    Analyst   6    65        2
3   M    Analyst   9    21        0
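For comparison, this sketch rebuilds df from the printed output above and shows what the !is.na() guard changes. The names unguarded and guarded are introduced only for this illustration.

```r
# df rebuilt from the printed output above.
df <- data.frame(
  sex = c("M", "F", "M"),
  occupation = c("Student", "Analyst", "Analyst"),
  age = c(NA, 6, 9),
  marks = c(34, 65, 21),
  stringsAsFactors = FALSE
)

# Without the guard, the NA age propagates through ifelse and the sum.
unguarded <- ifelse(df$age > 5 & df$age < 8, 1, 0) +
  ifelse(df$marks > 60 & df$marks < 70, 1, 0)

# With the guard, NA & FALSE evaluates to FALSE, so the row scores 0.
guarded <- ifelse(df$age > 5 & df$age < 8 & !is.na(df$age), 1, 0) +
  ifelse(df$marks > 60 & df$marks < 70, 1, 0)

print(unguarded)  # NA 2 0
print(guarded)    # 0 2 0
```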

How to ignore NA in ifelse statement

This syntax is easier to read:

x <- c(NA, 1, 0, -1)

(x > 0) & (!is.na(x))
# [1] FALSE TRUE FALSE FALSE

(The outer parentheses aren't necessary, but will make the statement easier to read for almost anyone other than the machine.)
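The guard works because of how R combines NA with logical values: when the other operand already decides the result, the NA does not matter. A short sketch:

```r
# NA combined with a value that already decides the result is not NA.
print(NA & FALSE)  # FALSE
print(NA & TRUE)   # NA
print(NA | TRUE)   # TRUE
print(NA | FALSE)  # NA

x <- c(NA, 1, 0, -1)
print(x > 0)              # NA TRUE FALSE FALSE
print(x > 0 & !is.na(x))  # FALSE TRUE FALSE FALSE
```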


Edit:

## If you want 0s and 1s
((x > 0) & (!is.na(x))) * 1
# [1] 0 1 0 0

Finally, you can make the whole thing into a function:

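isPos <- function(x) {
  # Parenthesize the whole condition: * binds tighter than &,
  # so without the outer parentheses the result stays logical.
  ((x > 0) & (!is.na(x))) * 1
}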

isPos(x)
# [1] 0 1 0 0

Direct way of telling ifelse to ignore NA

You can use %in% instead of == to sort-of ignore NAs.

ifelse(df$a %in% 1, "a==1",
ifelse(df$b %in% 1, "b==1",
ifelse(df$c %in% 1, "c==1", NA)))

Unfortunately, this gives no performance gain over the original, while @akrun's solution is about 3 times faster.

solution_original <- function() {
  ifelse(df$a == 1 & !is.na(df$a), "a==1",
  ifelse(df$b == 1 & !is.na(df$b), "b==1",
  ifelse(df$c == 1 & !is.na(df$c), "c==1", NA)))
}

solution_akrun <- function() {
  v1 <- names(df)[max.col(!is.na(df)) * NA^!rowSums(!is.na(df))]
  i1 <- !is.na(v1)
  v1[i1] <- paste0(v1[i1], "==1")
  v1  # return the labelled vector, not just the value of the assignment
}

solution_mine <- function() {
  ifelse(df$a %in% 1, "a==1",
  ifelse(df$b %in% 1, "b==1",
  ifelse(df$c %in% 1, "c==1", NA)))
}

set.seed(1)
df <- data.frame(a = sample(c(1, rep(NA, 4)), 1e6, TRUE),
                 b = sample(c(1, rep(NA, 4)), 1e6, TRUE),
                 c = sample(c(1, rep(NA, 4)), 1e6, TRUE))
microbenchmark::microbenchmark(
  solution_original(),
  solution_akrun(),
  solution_mine()
)
## Unit: milliseconds
##                 expr      min       lq     mean   median       uq       max neval
##  solution_original() 701.9413 839.3715 845.0720 853.1960 875.6151 1051.6659   100
##     solution_akrun() 217.4129 242.5113 293.2987 253.2144 387.1598  564.3981   100
##      solution_mine() 698.7628 845.0822 848.6717 858.7892 877.9676 1006.2872   100

Was inspired by this: R: Dealing with TRUE, FALSE, NA and NaN

Edit

Following the comment by @akrun, I redid the benchmark and revised the statement.
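One caveat worth checking before relying on the max.col approach: max.col breaks ties at random by default, so for a row with more than one non-NA column it may not pick the first one, unlike the nested ifelse. It also only tests for non-missing values, which fits the benchmark data (only 1 or NA) but not arbitrary values. The sketch below uses a small hypothetical data frame, passes ties.method = "first", and replaces the NA^ trick with an explicit assignment for readability. It is an illustration under those assumptions, not the benchmarked code.

```r
# Small hypothetical data frame containing only 1 and NA, like the benchmark data.
df_small <- data.frame(
  a = c(1, NA, NA, NA),
  b = c(1, 1, NA, NA),
  c = c(NA, 1, 1, NA)
)

nested <- ifelse(df_small$a %in% 1, "a==1",
          ifelse(df_small$b %in% 1, "b==1",
          ifelse(df_small$c %in% 1, "c==1", NA)))

# Logical matrix, TRUE where a value is present.
present <- !is.na(df_small)

# Index of the first non-NA column per row; ties resolved deterministically.
first_col <- max.col(present, ties.method = "first")

# Rows that are entirely NA get no label.
first_col[rowSums(present) == 0] <- NA

labels <- names(df_small)[first_col]
labels[!is.na(labels)] <- paste0(labels[!is.na(labels)], "==1")

print(identical(nested, labels))  # TRUE
```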

apply with ifelse statement and is.na does not 'sum' but outputs matrix - where is my logical mistake?

Here's a working version:

apply(dat[, 2:3], MARGIN = 1, function(x) {
  if (all(is.na(x))) {
    NA
  } else {
    sum(x == 1, na.rm = TRUE)
  }
})
#[1] 1 NA 0 2

Issues with yours:

  • Inside your function(x), x is the var1 and var2 values for a particular row. You don't want to go back and reference dat$var1 and dat$var2, which is the whole column! Just use x.
  • x== is.na(dat$var1) & is.na(dat$var2) is strange. It's trying to check whether x is the same as is.na(dat$var1)?
  • For a given row, we want to check whether all the values are NA. ifelse is vectorized and will return a vector - but we don't want a vector, we want a single TRUE or FALSE indicating whether all values are NA. So we use all(is.na()). And if() instead of ifelse.
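The points above can be tried on a small example. The real dat is not shown, so the data frame below is hypothetical, chosen to reproduce the output above. The rowSums variant is an alternative sketch, not part of the original answer: there the test is a whole vector, so ifelse is the right tool.

```r
# Hypothetical data chosen to reproduce the output above; the real `dat` is not shown.
dat <- data.frame(
  id = 1:4,
  var1 = c(1, NA, 0, 1),
  var2 = c(0, NA, NA, 1)
)
m <- as.matrix(dat[, c("var1", "var2")])

# Row-wise version: if() receives a single TRUE/FALSE from all().
by_apply <- apply(m, 1, function(x) {
  if (all(is.na(x))) NA else sum(x == 1, na.rm = TRUE)
})

# Vectorized alternative: one test value per row, so ifelse() fits.
all_missing <- rowSums(!is.na(m)) == 0
by_rowsums <- ifelse(all_missing, NA, rowSums(m == 1, na.rm = TRUE))

print(unname(by_apply))    # 1 NA 0 2
print(unname(by_rowsums))  # 1 NA 0 2
```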

