R gotcha: logical-and operator for combining conditions is & not &&
From the help page for Logical Operators, accessible by ?"&&":
& and && indicate logical AND and | and || indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in if clauses.
(R version 2.13-0)
In other words, when using subset, use the single &.
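Here is an illustration of the difference (output as produced by the R version quoted above; since R 4.3.0, calling && with a vector longer than one is an error instead of silently using the first elements):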
c(1,1,0,0) & c(1,0,1,0)
[1] TRUE FALSE FALSE FALSE
c(1,1,0,0) && c(1,0,1,0)
[1] TRUE
If this looks quirky compared to other programming paradigms, remember that R needs to provide a vectorised form of the operator.
Is there a reason to prefer '&&' over '&' in 'if' statements, other than short-circuiting?
Short answer: Yes, the different symbol makes the meaning clearer to the reader.
Thanks for this interesting question! If I can summarize, it seems to be a follow-up specifically about this section of my answer to the question you linked,
... you want to use the long forms only when you are certain the
vectors are length one. You should be absolutely certain your vectors
are only length 1, such as in cases where they are functions that
return only length 1 booleans. You want to use the short forms if the
vectors are length possibly >1. So if you're not absolutely sure, you
should either check first, or use the short form and then use all and
any to reduce it to length one for use in control flow statements,
like if.
I hear your question (given comments) this way: But & and && will do the same thing if the inputs are length one, so other than short-circuiting, why prefer &&? Perhaps & should be preferred because if they're not length one, if will give me a warning, helping me be even more certain that the inputs are length one.
First, I agree with the comment by @James that you may be "overstating the value of getting a warning"; if it's not length one, the safer thing will be to handle this appropriately, not to just plow ahead. You could make a case that && should throw an error if they're not length one, and perhaps a good case; I don't know the reason why it does what it does. But without going back in time, the best we can do now is to check that the inputs are indeed appropriate for your use.
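As a minimal sketch of "checking that the inputs are appropriate", the snippet below shows the two routes mentioned in the quoted passage: reducing an element-wise result with all or any, or insisting on length-one inputs before using &&. The helper names is_flag and safe_and are hypothetical and not part of base R.

```r
# Hypothetical helper: TRUE only for a single, non-missing logical value.
is_flag <- function(x) is.logical(x) && length(x) == 1L && !is.na(x)

a <- c(1, 1, 0, 0) > 0
b <- c(1, 0, 1, 0) > 0

# Route 1: combine element-wise with &, then reduce to length one.
every_pair_true <- all(a & b)   # are all pairs TRUE?
some_pair_true  <- any(a & b)   # is at least one pair TRUE?

# Route 2: refuse anything that is not a scalar before using &&.
safe_and <- function(x, y) {
  stopifnot(is_flag(x), is_flag(y))
  x && y
}

first_pair_true <- safe_and(a[1], b[1])
```

Either way, the value that reaches if is guaranteed to be a single TRUE or FALSE, so the choice of && then documents intent rather than hiding a length problem.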
Given, then, that you have checked to make sure your inputs are appropriate, I would still recommend && because it semantically reminds me as the reader that I should be making sure the inputs are scalars (length one). I'm so used to thinking vector-ally that this reminder is helpful to me. It follows the principle that different operations should have different symbols, and for me, an operation that is meant for use on scalars is different enough from a vectorized operation that it warrants a different symbol.
(Not to start a flame war (I hope), but this is also why I prefer <- to =; one for assigning variables, one for setting parameters to functions. Although deep down this is the same thing, it's different enough in practice to make the different symbols helpful to me as a reader.)
What is the difference between short (&,|) and long (&&, ||) forms of AND, OR logical operators in R?
& and | are element-wise and can be used with vector operations, whereas || and && always generate a single TRUE or FALSE.
Check the difference:
> x <- 1:5
> y <- 5:1
> (x > 2) & (y < 3)
[1] FALSE FALSE FALSE TRUE TRUE
> (x > 2) && (y < 3) # here the && operator takes only the first elements of the logical
# vectors (x > 2) and (y < 3)
[1] FALSE
So, && and || are commonly used in if (condition) state_1 else state_2 statements, as they deal with vectors of length 1.
Correctly Specifying Logical Conditions (in R)
UPDATE:
I think I was able to resolve this problem - now the "logical conditions" are respected in the final output:
#load libraries
library(dplyr)
library(mco)
#define function
funct_set <- function (x) {
x1 <- x[1]; x2 <- x[2]; x3 <- x[3]; x4 <- x[4]; x5 <- x[5]; x6 <- x[6]; x7 <- x[7]
f <- numeric(4)
#bin data according to random criteria
train_data <- train_data %>%
mutate(cat = ifelse(a1 <= x1 & b1 <= x3, "a",
ifelse(a1 <= x2 & b1 <= x4, "b", "c")))
train_data$cat = as.factor(train_data$cat)
#new splits
a_table = train_data %>%
filter(cat == "a") %>%
select(a1, b1, c1, cat)
b_table = train_data %>%
filter(cat == "b") %>%
select(a1, b1, c1, cat)
c_table = train_data %>%
filter(cat == "c") %>%
select(a1, b1, c1, cat)
#calculate quantile ("quant") for each bin
table_a = data.frame(a_table%>% group_by(cat) %>%
mutate(quant = ifelse(c1 > x[5],1,0 )))
table_b = data.frame(b_table%>% group_by(cat) %>%
mutate(quant = ifelse(c1 > x[6],1,0 )))
table_c = data.frame(c_table%>% group_by(cat) %>%
mutate(quant = ifelse(c1 > x[7],1,0 )))
f[1] = mean(table_a$quant)
f[2] = mean(table_b$quant)
f[3] = mean(table_c$quant)
#group all tables
final_table = rbind(table_a, table_b, table_c)
# calculate the total mean : this is what needs to be optimized
f[4] = mean(final_table$quant)
return (f);
}
gn <- function(x) {
g1 <- x[3] - x[1]
g2<- x[4] - x[2]
g3 <- x[7] - x[6]
g4 <- x[6] - x[5]
return(c(g1,g2,g3,g4))
}
optimization <- nsga2(funct_set, idim = 7, odim = 4 , constraints = gn, cdim = 4,
generations=150,
popsize=100,
cprob=0.7,
cdist=20,
mprob=0.2,
mdist=20,
lower.bounds=c(80,80,80,80,100,200,300),
upper.bounds=c(120,120,120,120,200,300,400)
)
Now, if we take a look at the output:
#view output
optimization
All the logical conditions (i.e. the "constraints") are now respected!
Note: if possible, I would still be interested in seeing alternate ways to solve this problem
Thanks everyone!
Understanding when the && operator short circuits
This doesn't make sense to me, because && should evaluate left to
right, and stop as soon as one of its conditions is true.
This is wrong. You are mixing up && with ||:
TRUE && FALSE gives FALSE
- && requires both conditions to be TRUE
- && will short-circuit on FALSE
TRUE || FALSE gives TRUE
- || requires a single condition to be TRUE
- || will short-circuit on TRUE
Also, TRUE || NA gives TRUE.
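The short-circuit rules can be observed directly by putting an expression that would fail on the right-hand side. This is only an illustrative sketch; the variable names r1 to r6 are made up for the example.

```r
# && stops at the first FALSE, so stop() on the right is never evaluated.
r1 <- FALSE && stop("never reached")

# || stops at the first TRUE, so stop() on the right is never evaluated.
r2 <- TRUE || stop("never reached")

# The element-wise & evaluates both sides, so here the error is raised.
r3 <- tryCatch(FALSE & stop("evaluated"),
               error = function(e) "error raised")

# NA only propagates when the other operand cannot decide the result.
r4 <- TRUE || NA    # TRUE: already decided by the left side
r5 <- FALSE && NA   # FALSE: already decided by the left side
r6 <- TRUE && NA    # NA: the result depends on the unknown value
```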
R - Unexpected output when running a function on a dataframe
You need vectorised ifelse with a single & (instead of &&) if you want to test a condition on every element of a vector.
From ?ifelse
‘ifelse’ returns a value with the same shape as ‘test’ which is
filled with elements selected from either ‘yes’ or ‘no’ depending
on whether the element of ‘test’ is ‘TRUE’ or ‘FALSE’.
From ?`&&`
‘&’ and ‘&&’ indicate logical AND and ‘|’ and ‘||’ indicate
logical OR. The shorter form performs elementwise comparisons in
much the same way as arithmetic operators. The longer form
evaluates left to right examining only the first element of each
vector. Evaluation proceeds only until the result is determined.
The longer form is appropriate for programming control-flow and
typically preferred in ‘if’ clauses.
The short form & performs an element-wise comparison, while && evaluates only the first element of the vector.
Here is an example based on your df
f1 <- function(x) if (x < 32 && x > 0) x + 100 else x - 100;
f2 <- function(x) ifelse(x < 32 & x > 0, x + 100, x - 100);
f1(df$A)
#[1] 110 120 130 140
f2(df$A)
#[1] 110 120 130 -60
R multiple conditions in row selection of matrix
You need to use a single '&':
dataOnBoth = data[data$value_1 > 0 & data$value_2 > 0,]
See this question for more details.
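A small self-contained sketch of the same row selection follows. The data frame contents are invented for illustration; only the column names value_1 and value_2 come from the answer above.

```r
# Hypothetical data; only the column names are taken from the answer.
data <- data.frame(value_1 = c(1, -2, 3, 0),
                   value_2 = c(5, 6, -7, 8))

# & compares row by row and yields one logical value per row.
keep <- data$value_1 > 0 & data$value_2 > 0

# The logical vector is then used as a row index.
dataOnBoth <- data[keep, ]
```

Because keep has one element per row, it can be inspected on its own (for example with sum(keep)) before it is used for indexing.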
Two conditions in one if statement does the second matter if the first is false?
It is common for languages (Java and Python are among them) to evaluate the first argument of a logical AND and finish evaluation of the statement if the first argument is false. This is because:
From The Order of Evaluation of Logic Operators,
When Java evaluates the expression d = b && c;, it first checks whether b is true. Here b is false, so b && c must be false regardless of whether c is or is not true, so Java doesn't bother checking the value of c.
This is known as short-circuit evaluation, and is also referred to in the Java docs.
It is common to see list.count > 0 && list[0] == "Something" to check a list element, if it exists.
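The same guard idiom carries over to R's &&. The sketch below is an R rendering of the pattern, not part of the original Java answer, and the helper name first_is is hypothetical.

```r
# Hypothetical helper: is the first element of `items` equal to `target`?
# The length check runs first; items[[1]] is only evaluated when the
# vector is non-empty, so an empty input does not raise an error.
first_is <- function(items, target) {
  length(items) > 0 && items[[1]] == target
}

empty_result <- first_is(character(0), "Something")
match_result <- first_is(c("Something", "else"), "Something")
```

Without the short circuit, character(0)[[1]] would fail with a subscript error, which is exactly what the guard on the left is there to prevent.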
It is also worth mentioning that if (list.length>2 && list[3] == 2) is not equivalent to the second case
if (list.length>2){
if (list[3] == 2){
...
}
}
if there is an else afterwards. The else will apply only to the if statement to which it is attached.
To demonstrate this gotcha:
if (x.kind.equals("Human")) {
if (x.name.equals("Jordan")) {
System.out.println("Hello Jordan!");
}
} else {
System.out.println("You are not a human!");
}
will work as expected, but
if (x.kind.equals("Human") && x.name.equals("Jordan")) {
System.out.println("Hello Jordan!");
} else {
System.out.println("You are not a human!");
}
will also tell any Human who isn't Jordan that they are not human.