Logical Operators (AND, OR) with NA, TRUE and FALSE

Logical operators (AND, OR) with NA, TRUE and FALSE

To quote from ?Logic:

NA is a valid logical object. Where a component of x or y is NA, the
result will be NA if the outcome is ambiguous. In other words NA &
TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the
examples below.

The key there is the word "ambiguous". NA represents something that is "unknown". So NA & TRUE could be either true or false, but we don't know. Whereas NA & FALSE will be false no matter what the missing value is.
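One way to see the "ambiguous" rule in action is to evaluate each combination directly. This is only an illustrative sketch; the variable names are made up for the example.

```r
# NA stands for an unknown logical value.
and_true  <- NA & TRUE    # unknown: depends on the missing value
and_false <- NA & FALSE   # FALSE whatever the missing value is
or_true   <- NA | TRUE    # TRUE whatever the missing value is
or_false  <- NA | FALSE   # unknown: depends on the missing value

print(c(and_true = and_true, and_false = and_false,
        or_true = or_true, or_false = or_false))
```

Only the two cases where the known operand cannot settle the outcome come back as NA.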

Logical operations involving NA

According to ?"&"

NA is a valid logical object. Where a component of x or y is NA, the result will be NA if the outcome is ambiguous. In other words NA & TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the examples below.

In the OP's condition, the first one evaluates to

TRUE & NA   # is.na(NA) is TRUE; NA > 0 is NA

and the second to

FALSE & NA  # !is.na(NA) is FALSE

Why is NA | FALSE = NA?

Essentially, | asks whether at least one side is TRUE. In TRUE | FALSE there is one TRUE value, so the result is also TRUE. In NA | FALSE the only side that could be TRUE is unknown, so the result is NA.

It is the same as with:

1 > 0 | 0 > 2
[1] TRUE

Conversely, when it asks whether all sides are TRUE:

TRUE & FALSE
[1] FALSE

As with the numerical example:

1 > 0 & 0 > 2
[1] FALSE
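To see every combination at once, including NA, the full truth tables can be built with outer(). This is a sketch along the lines of the examples in ?Logic; the names vals, or_table and and_table are just chosen for illustration.

```r
vals <- c(NA, FALSE, TRUE)
names(vals) <- c("NA", "FALSE", "TRUE")   # label the rows and columns

or_table  <- outer(vals, vals, "|")   # at least one side TRUE?
and_table <- outer(vals, vals, "&")   # all sides TRUE?

print(or_table)
print(and_table)
```

In the OR table the entry for NA and FALSE is NA, while the entry for NA and TRUE is TRUE, because a single TRUE is enough to decide the result.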

Dealing with TRUE, FALSE, NA and NaN

To answer your questions in order:

1) The == operator does indeed not treat NA's as you would expect it to. A very useful function is this compareNA function from r-cookbook.com:

compareNA <- function(v1, v2) {
  # This function returns TRUE wherever elements are the same, including NA's,
  # and FALSE everywhere else.
  same <- (v1 == v2) | (is.na(v1) & is.na(v2))
  same[is.na(same)] <- FALSE
  return(same)
}

2) NA stands for "Not available", and is not the same as the general NaN ("not a number"). NA is generally used as a placeholder for missing data; NaNs are normally generated by a numerical issue (taking the log of -1 or similar).

3) I'm not really sure what you mean by "logical things"--many different data types, including numeric vectors, can be used as input to logical operators. You might want to try reading the R logical operators page: http://stat.ethz.ch/R-manual/R-patched/library/base/html/Logic.html.
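As a small illustration of points 1) and 2), the sketch below compares == with compareNA on two made-up vectors and then checks how is.na() and is.nan() treat NA and NaN. The function is repeated so the snippet runs on its own; the vectors a and b are assumptions for the example.

```r
compareNA <- function(v1, v2) {
  same <- (v1 == v2) | (is.na(v1) & is.na(v2))
  same[is.na(same)] <- FALSE
  same
}

a <- c(1, NA, 3, NA)
b <- c(1, NA, 4, 2)

plain  <- a == b            # NA wherever either side is missing
strict <- compareNA(a, b)   # NA matches NA; anything else unknown becomes FALSE

# NaN comes from undefined arithmetic; NA marks missing data.
not_a_number <- 0 / 0
na_flags  <- is.na(c(NA, not_a_number))    # is.na() is TRUE for both
nan_flags <- is.nan(c(NA, not_a_number))   # is.nan() is TRUE only for NaN

print(plain)
print(strict)
```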

Hope this helps!

logical operator TRUE/FALSE in R

You can pass the x and y vectors separately to the function. Use expand.grid to create all combinations of the vectors, then get the max of x and the min of y from each row.

intervals <- function(x, y) {
  tmp <- do.call(expand.grid, rbind.data.frame(x, y))
  names(tmp) <- paste0('col', seq_along(tmp))
  result <- t(apply(tmp, 1, function(p) {
    suppressWarnings(c(max(p[p %in% x]), min(p[p %in% y])))
  }))
  result[is.infinite(result)] <- NA
  result <- as.data.frame(result)
  names(result) <- c('max_x', 'min_x')
  result
}

intervals(c(2,3,6,82), c(10, 90, 50, 7))

# max_x min_x
#1 82 NA
#2 82 10
#3 82 90
#4 82 10
#5 82 50
#6 82 10
#7 82 50
#8 82 10
#9 6 7
#10 6 7
#11 6 7
#12 6 7
#13 3 7
#14 3 7
#15 2 7
#16 NA 7
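Two base R behaviours explain the shape of this output, and a short sketch may make them easier to see: expand.grid() produces one row per combination, and max() or min() of an empty vector is infinite, which is why the function replaces infinite values with NA. The small vectors here are made up for the illustration.

```r
# expand.grid() returns one row per combination of its arguments.
combos <- expand.grid(first = c(2, 10), second = c(3, 90))
print(combos)

# max() of an empty vector is -Inf and min() is Inf (both with a warning),
# so a row with no values from x or from y yields an infinite result.
empty_max <- suppressWarnings(max(numeric(0)))
empty_min <- suppressWarnings(min(numeric(0)))
print(c(empty_max, empty_min))
```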

In R, (F & NA) is F but (T & NA) is NA -- why?

If you have an AND (&) statement and one of the values is false, then it doesn't matter what the other value is, the answer is going to be false. The NA value means that a value is missing, but the unobserved value must be either true or false, and either way you're going to get false back.

But if one of the values is true, then the AND will only be true if the second value is also true. However, in this case the missing value (NA) could be true or false, so it's impossible to say what the expression will be. Thus R has to propagate the NA value.
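The reasoning above can be checked by substituting both possible values for the missing one. This is a minimal sketch; resolve() is a hypothetical helper written for this illustration, not something R provides.

```r
candidates <- c(TRUE, FALSE)      # the values the NA could stand for

false_and <- FALSE & candidates   # FALSE FALSE: the outcome is settled
true_and  <- TRUE & candidates    # TRUE FALSE: the outcome depends on the NA

# Hypothetical helper: a result is known only when every candidate agrees.
resolve <- function(outcomes) {
  if (length(unique(outcomes)) == 1) outcomes[1] else NA
}

print(resolve(false_and))   # agrees with FALSE & NA
print(resolve(true_and))    # agrees with TRUE & NA
```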

How do and and or act with non-boolean values?

TL;DR

We start by summarising the behaviour of the two logical operators and and or. These idioms will form the basis of our discussion below.

and

Return the first Falsy value if there are any, else return the last
value in the expression.

or

Return the first Truthy value if there are any, else return the last
value in the expression.

The behaviour is also summarised in the docs, especially in this table:

Operation    Result
x or y       if x is false, then y, else x
x and y      if x is false, then x, else y
not x        if x is false, then True, else False

How to process NA as False in R

For me, I'd think the most beneficial way would be to use dplyr's case_when function and explicitly state how the NA cases you mention should be handled.

Replicating your example (notice that I'm explicitly setting the NAs here; your NAs were the result of R not being able to handle a character string ("NA") within a numeric vector):

col1 = as.numeric(c(10, 2, 15, 2, NA_real_, 15))
col2 = as.numeric(c(15, 15, 2, 2, 15, NA_real_))
test <- data.frame(col1, col2)

I'm loading dplyr for both the mutate function and the case_when function. If you're not familiar with case_when, it's like an ifelse with multiple conditionals. Each conditional is followed by a tilde ("~"). What comes after the tilde is what gets assigned if the conditional is met. To set "everything else" to some value X you type TRUE ~ "x", as that gets evaluated as true for all the cases that have not been met by the previous conditionals.

This should do what you want:

library(dplyr)

test <- mutate(.data = test,
               G5 = case_when(
                 col1 > 5 & col2 > 5 ~ "Yes",  # Original
                 (is.na(col1) & col2 > 5) | (col1 > 5 & is.na(col2)) ~ "Yes",
                 TRUE ~ "No"))  # Everything else gets the value "No"


test
#> col1 col2 G5
#> 1 10 15 Yes
#> 2 2 15 No
#> 3 15 2 No
#> 4 2 2 No
#> 5 NA 15 Yes
#> 6 15 NA Yes
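If you'd rather not depend on dplyr, the same rule can be sketched in base R. This is an alternative under the same assumptions as the example above, not the approach used in the answer; it relies on x %in% TRUE, which is FALSE for both FALSE and NA.

```r
col1 <- c(10, 2, 15, 2, NA, 15)
col2 <- c(15, 15, 2, 2, 15, NA)

both        <- col1 > 5 & col2 > 5   # NA where a value is missing
one_missing <- (is.na(col1) & col2 > 5) | (col1 > 5 & is.na(col2))

# %in% TRUE turns any remaining NA into FALSE before labelling.
na_as_false <- both %in% TRUE
G5 <- ifelse((both | one_missing) %in% TRUE, "Yes", "No")

print(data.frame(col1, col2, G5))
```

The na_as_false vector shows the plain "treat NA as FALSE" reading of the original condition, while G5 reproduces the labelling from the case_when version.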

