Logical operators (AND, OR) with NA, TRUE and FALSE
To quote from ?Logic:
NA is a valid logical object. Where a component of x or y is NA, the
result will be NA if the outcome is ambiguous. In other words NA &
TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the
examples below.
The key there is the word "ambiguous". NA represents something that is "unknown". So NA & TRUE could be either true or false, but we don't know. Whereas NA & FALSE will be false no matter what the missing value is.
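To see every case at once, here is a small sketch that tabulates `&` over all pairings of TRUE, FALSE and NA. The names `vals` and `and_table` are just illustrative.

```r
# Enumerate every pairing of TRUE, FALSE and NA under `&`.
vals <- c(TRUE, FALSE, NA)
names(vals) <- c("TRUE", "FALSE", "NA")

# outer() applies `&` to each (row, column) pair and keeps the names as labels.
and_table <- outer(vals, vals, "&")
print(and_table)
```

The only NA cells should be the ones where no FALSE is present to settle the answer.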
Logical operations involving NA
According to ?"&"
NA is a valid logical object. Where a component of x or y is NA, the result will be NA if the outcome is ambiguous. In other words NA & TRUE evaluates to NA, but NA & FALSE evaluates to FALSE. See the examples below.
In the OP's condition, the first one evaluates to
TRUE & NA   # is.na(NA) is TRUE; NA > 0 is NA
and the second is
FALSE & NA  # !is.na(NA) is FALSE
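The OP's exact code isn't shown here, so the following is only a sketch that assumes the two conditions were built from `is.na(x)`, `!is.na(x)` and `x > 0`, with a hypothetical `x` set to NA.

```r
x <- NA  # hypothetical stand-in for the OP's value

first  <- is.na(x) & x > 0    # TRUE & NA, so the result stays unknown
second <- !is.na(x) & x > 0   # FALSE & NA, so the FALSE decides it

print(first)   # NA
print(second)  # FALSE
```

Checking `!is.na(x)` first is what keeps the second condition from turning into NA.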
Why is NA | FALSE = NA?
Essentially, the | operator asks whether at least one side is TRUE. As there is one TRUE value, the result is also TRUE.
It is the same as with:
1 > 0 | 0 > 2
[1] TRUE
Conversely, the & operator asks whether all sides are TRUE:
TRUE & FALSE
[1] FALSE
As with the numerical example:
1 > 0 & 0 > 2
[1] FALSE
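The same "at least one side is TRUE" reasoning covers the NA cases. A small sketch (the variable names are just for illustration): a TRUE on one side settles the answer, but with FALSE on one side the result depends entirely on the unknown value, so it stays NA.

```r
na_or_true  <- NA | TRUE    # one side is TRUE, so the NA cannot change the answer
na_or_false <- NA | FALSE   # the answer depends on the unknown value

print(na_or_true)   # TRUE
print(na_or_false)  # NA
```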
Dealing with TRUE, FALSE, NA and NaN
To answer your questions in order:
1) The == operator does indeed not treat NAs as you would expect it to: any comparison with NA returns NA. A very useful function is this compareNA function from r-cookbook.com:
compareNA <- function(v1, v2) {
  # This function returns TRUE wherever elements are the same, including NA's,
  # and FALSE everywhere else.
  same <- (v1 == v2) | (is.na(v1) & is.na(v2))
  same[is.na(same)] <- FALSE
  return(same)
}
2) NA stands for "Not Available", and is not the same as NaN ("Not a Number"). NA is generally used as a placeholder that stands in for missing data; NaNs are normally generated by a numerical issue (taking the log of -1 or similar).
3) I'm not really sure what you mean by "logical things"--many different data types, including numeric vectors, can be used as input to logical operators. You might want to try reading the R logical operators page: http://stat.ethz.ch/R-manual/R-patched/library/base/html/Logic.html.
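To make points 1) and 2) concrete, here is a short sketch. It repeats compareNA so it runs on its own; the vectors `a` and `b` and the other variable names are made up for illustration.

```r
compareNA <- function(v1, v2) {
  # TRUE where elements match (two NAs count as a match), FALSE elsewhere.
  same <- (v1 == v2) | (is.na(v1) & is.na(v2))
  same[is.na(same)] <- FALSE
  same
}

a <- c(1, NA, 3, NA)
b <- c(1, NA, 4, 5)

plain_eq <- a == b           # TRUE NA FALSE NA
na_aware <- compareNA(a, b)  # TRUE TRUE FALSE FALSE

# NA versus NaN: 0 / 0 produces NaN without a warning.
not_a_number <- 0 / 0
nan_is_nan <- is.nan(not_a_number)  # TRUE
nan_is_na  <- is.na(not_a_number)   # TRUE, is.na() also flags NaN
na_is_nan  <- is.nan(NA)            # FALSE, NA is not NaN
```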
Hope this helps!
logical operator TRUE/FALSE in R
You can pass the x and y vectors separately to the function. Use expand.grid to create all combinations of the vectors and get the max of x and the min of y from each row.
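intervals <- function(x, y) {
  # Pair up x[i] with y[i], then build every combination that picks one from each pair
  tmp <- do.call(expand.grid, rbind.data.frame(x, y))
  names(tmp) <- paste0('col', seq_along(tmp))
  # For each combination, take the largest value drawn from x and the smallest drawn from y
  result <- t(apply(tmp, 1, function(p) {
    suppressWarnings(c(max(p[p %in% x]), min(p[p %in% y])))
  }))
  # max()/min() of an empty selection give -Inf/Inf; report those as NA
  result[is.infinite(result)] <- NA
  result <- as.data.frame(result)
  names(result) <- c('max_x', 'min_y')
  result
}
intervals(c(2,3,6,82), c(10, 90, 50, 7))
# max_x min_y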
#1 82 NA
#2 82 10
#3 82 90
#4 82 10
#5 82 50
#6 82 10
#7 82 50
#8 82 10
#9 6 7
#10 6 7
#11 6 7
#12 6 7
#13 3 7
#14 3 7
#15 2 7
#16 NA 7
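If the expand.grid step is unclear, this tiny sketch shows what it does on its own with two made-up pairs (the column names `first` and `second` are arbitrary): every value in the first argument is combined with every value in the second.

```r
# Each argument contributes one column; rows cover all combinations.
combos <- expand.grid(first = c(2, 10), second = c(3, 90))
print(combos)
#   first second
# 1     2      3
# 2    10      3
# 3     2     90
# 4    10     90
```

With four pairs, as in the call above, the same idea gives 2^4 = 16 rows.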
In R, (F & NA) is F but (T & NA) is NA -- why?
If you have an AND (&) statement and one of the values is false, then it doesn't matter what the other value is; the answer is going to be false. The NA value means that a value is missing, but the unobserved value must be either true or false, and either way you're going to get false back.
But if one of the values is true, then the AND will only be true if the second value is also true. However, in this case the missing value (NA) could be true or false, so it's impossible to say whether the expression will be true. Thus R has to propagate the NA value.
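One way to check this reasoning is to substitute both possible values for the missing one and see whether the answers agree. This is only an illustrative sketch, not how R implements it internally; the variable names are made up.

```r
candidates <- c(TRUE, FALSE)  # the two values the NA could stand for

false_side <- FALSE & candidates  # FALSE FALSE: same answer either way
true_side  <- TRUE & candidates   # TRUE FALSE: the answer depends on the NA

false_is_determined <- length(unique(false_side)) == 1  # TRUE
true_is_determined  <- length(unique(true_side)) == 1   # FALSE
```

When both substitutions agree, R can return that answer; when they disagree, NA is the only honest result.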
How do and and or act with non-boolean values?
TL;DR
We start by summarising the behaviour of the two logical operators, and and or. These idioms will form the basis of our discussion below.
and
Return the first Falsy value if there are any, else return the last
value in the expression.
or
Return the first Truthy value if there are any, else return the last
value in the expression.
The behaviour is also summarised in the docs, especially in this table:
Operation | Result |
---|---|
x or y | if x is false, then y, else x |
x and y | if x is false, then x, else y |
not x | if x is false, then True, else False |
How to process NA as False in R
For me, the most beneficial way would be to use dplyr's case_when function and explicitly state how the NA cases you mention should be handled.
Replicating your example (notice that I'm explicitly setting the NAs here; your NAs were the result of R not being able to handle a character string ("NA") within a numeric vector):
col1 = as.numeric(c(10, 2, 15, 2, NA_real_, 15))
col2 = as.numeric(c(15, 15, 2, 2, 15, NA_real_))
test <- data.frame(col1, col2)
For both the mutate function and the case_when function I'm loading dplyr. If you're not familiar with case_when, it's like an ifelse with multiple conditionals. Each conditional is followed by a tilde (~). What comes after the tilde is what gets assigned if the conditional is met. To set "everything else" to some value X, you type TRUE ~ "x", as that evaluates to TRUE for all the cases that have not been matched by the previous conditionals.
This should do what you want:
library(dplyr)
test <- mutate(.data = test,
               G5 = case_when(col1 > 5 & col2 > 5 ~ "Yes", # Original
                              (is.na(col1) & col2 > 5) | (col1 > 5 & is.na(col2)) ~ "Yes",
                              TRUE ~ "No")) # Everything else gets the value "No"
test
#> col1 col2 G5
#> 1 10 15 Yes
#> 2 2 15 No
#> 3 15 2 No
#> 4 2 2 No
#> 5 NA 15 Yes
#> 6 15 NA Yes
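To see why the extra is.na() condition is needed, here is a base R sketch of the same logic without dplyr. It is an illustration of the conditions, not a replacement for the case_when call; `both_big`, `one_missing` and `g5` are made-up names.

```r
col1 <- c(10, 2, 15, 2, NA, 15)
col2 <- c(15, 15, 2, 2, 15, NA)

# The original condition is NA for rows 5 and 6, because TRUE & NA is NA.
both_big <- col1 > 5 & col2 > 5

# The extra condition is never NA here: is.na() supplies a definite TRUE or FALSE
# that settles each `&` before the unknown comparison matters.
one_missing <- (is.na(col1) & col2 > 5) | (col1 > 5 & is.na(col2))

# NA | TRUE is TRUE, so the combined condition has no NAs left.
g5 <- ifelse(both_big | one_missing, "Yes", "No")
print(g5)  # "Yes" "No" "No" "No" "Yes" "Yes"
```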