Finding which element of a vector is between two values in R
You are looking for &, not &&:
x = c( .2, .4, 2.1, 5.3, 6.7, 10.5)
y = c( 1, 7)
x = x[ x >= y[1] & x <= y[2]]
x
# [1] 2.1 5.3 6.7
Edited to explain. Here's the text from ?'&':
& and && indicate logical AND and | and || indicate logical OR.
The shorter form performs elementwise comparisons in much the same way as arithmetic operators.
The longer form evaluates left to right examining only the first element of each vector.
Evaluation proceeds only until the result is determined.
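So when you used &&, it returned FALSE for the first element of your x and terminated. (Since R 4.3.0, calling && with a vector longer than one element is an error rather than a silent first-element comparison.)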
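A minimal sketch of the difference, reusing the x from above (lo and hi are illustrative names, not part of the original answer). & yields one logical value per element, which is what subsetting needs, while && is meant for a single TRUE/FALSE, for example inside if():

```r
x <- c(0.2, 0.4, 2.1, 5.3, 6.7, 10.5)
lo <- 1   # illustrative bounds (assumed names)
hi <- 7

# Elementwise AND: one TRUE/FALSE per element, suitable for subsetting
in_range <- x >= lo & x <= hi
x[in_range]
# [1] 2.1 5.3 6.7

# Scalar AND: compares single values, as expected inside if()
first_in_range <- x[1] >= lo && x[1] <= hi
if (!first_in_range) message("x[1] is outside the range")
```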
How to find common elements from multiple vectors?
There might be a cleverer way to go about this, but
intersect(intersect(a,b),c)
will do the job.
EDIT: More cleverly, and more conveniently if you have a lot of arguments:
Reduce(intersect, list(a,b,c))
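For illustration, here is a small sketch with made-up vectors (the values are assumptions, and the third vector is called c3 only to avoid masking c()). It shows that the nested call and the Reduce() form give the same result:

```r
a  <- c(1, 3, 5, 7, 9)   # made-up example data
b  <- c(3, 5, 6, 7)
c3 <- c(5, 7, 8)

# Nested form: fine for three vectors
nested <- intersect(intersect(a, b), c3)

# Reduce form: folds intersect() over a list of any length
common <- Reduce(intersect, list(a, b, c3))

common
# [1] 5 7
```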
Find the difference between all values of two vectors
sapply(a, "-", b)
# [,1] [,2] [,3] [,4] [,5]
# [1,] 0 1 2 3 4
# [2,] -1 0 1 2 3
# [3,] -2 -1 0 1 2
# [4,] -3 -2 -1 0 1
# [5,] -4 -3 -2 -1 0
# [6,] -5 -4 -3 -2 -1
# [7,] -6 -5 -4 -3 -2
# [8,] -7 -6 -5 -4 -3
# [9,] -8 -7 -6 -5 -4
#[10,] -9 -8 -7 -6 -5
Explanation
Taking advantage of the fact that a scalar minus a vector in R is an element-wise subtraction between that scalar and each element of the vector, we can simply apply the minus operator (-) to each value in a against the whole vector b.
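As a sketch, the output above is consistent with a <- 1:5 and b <- 1:10; those values are an assumption, since the question's data is not shown here. The same matrix can also be built with outer(), which may be a convenient cross-check:

```r
a <- 1:5    # assumed values; they reproduce the matrix shown above
b <- 1:10

# One column per element of a, one row per element of b
d1 <- sapply(a, "-", b)

# outer(a, b, "-") has a along the rows, so transpose to match d1
d2 <- t(outer(a, b, "-"))

dim(d1)
# [1] 10  5
all(d1 == d2)
# [1] TRUE
```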
Finding elements that do not overlap between two vectors
Yes, there is a way:
setdiff(list.a, list.b)
# [1] "Mary" "Jack" "Michelle"
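Note that setdiff() is one-directional: it returns what is in the first vector but not the second. A small sketch with hypothetical data (the names other than the three in the output above are made up) that also collects the non-overlapping elements from both sides:

```r
# Hypothetical vectors chosen to reproduce the output above
list.a <- c("James", "Mary", "Jack", "Sonia", "Michelle", "Vincent")
list.b <- c("James", "Sonia", "Vincent", "Iris")

only_a <- setdiff(list.a, list.b)   # in list.a but not in list.b
only_b <- setdiff(list.b, list.a)   # in list.b but not in list.a

# Elements that appear in exactly one of the two vectors
non_overlap <- union(only_a, only_b)
non_overlap
# [1] "Mary"     "Jack"     "Michelle" "Iris"
```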
Check if column value is in between (range) of two other column values
We can loop over each x$number using sapply, check whether it lies in the range of any pair of y$number1 and y$number2, and assign the value accordingly.
x$found <- ifelse(sapply(x$number, function(p)
  any(y$number1 <= p & y$number2 >= p)), "YES", NA)
x
# id number found
#1 1 5225 YES
#2 2 2222 <NA>
#3 3 3121 YES
Using the same logic but with replace:
x$found <- replace(x$found,
  sapply(x$number, function(p) any(y$number1 <= p & y$number2 >= p)), "YES")
EDIT
If we also want to compare the id value, we could do:
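x$found <- ifelse(sapply(seq_along(x$number), function(i) {
  # which ranges in y contain the i-th number?
  inds <- y$number1 <= x$number[i] & y$number2 >= x$number[i]
  # at least one match, and the first matching range has the same id
  any(inds) && (x$id[i] == y$id[which.max(inds)])
}), "YES", NA)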
x$found
#[1] "YES" NA "YES"
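For a runnable version, here is a sketch with a hypothetical y (only the x values come from the output above; the ranges in y are assumptions). It also shows a loop-free alternative built on outer(), which is a different technique from the sapply answer and may be easier to read for small tables:

```r
x <- data.frame(id = 1:3, number = c(5225, 2222, 3121))
# Hypothetical ranges, chosen so that rows 1 and 3 of x fall inside one
y <- data.frame(id = 1:3,
                number1 = c(5000, 4000, 3000),
                number2 = c(6000, 5000, 4000))

# Row i, column j: does x$number[i] satisfy the bound of range j?
lower_ok <- outer(x$number, y$number1, ">=")
upper_ok <- outer(x$number, y$number2, "<=")

# A number is found if it lies inside at least one range
in_any_range <- rowSums(lower_ok & upper_ok) > 0
x$found <- ifelse(in_any_range, "YES", NA)
x$found
# [1] "YES" NA    "YES"
```

The outer() matrices have length(x$number) * nrow(y) cells, so for large inputs the sapply version may use less memory.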
apply() to find closest value in 2 vectors
With sapply, the option is:
sapply(v1, function(x) which.min(abs(v2 - x)))
#[1] 4 7 3 9 2 10 4 9 5 1
Or with outer
max.col(-abs(outer(v1, v2, `-`)), 'first')
#[1] 4 7 3 9 2 10 4 9 5 1
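Or using findInterval (this locates each element of v2 within the sorted v1, so it is not guaranteed to reproduce the nearest-value indices above):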
i1 <- order(v1)
findInterval(v2, v1[i1])[i1]
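A small self-contained sketch of the first two approaches, using made-up v1 and v2 (the question's data is not shown here). For each element of v1 it returns the position of the closest element in v2:

```r
v1 <- c(2.3, 7.8, 4.1)   # made-up values to look up
v2 <- c(1, 4, 8, 10)     # made-up candidates

# For each v1 element, index of the v2 element with the smallest distance
idx_sapply <- sapply(v1, function(x) which.min(abs(v2 - x)))

# Same idea without an explicit loop: rows are v1, columns are v2
idx_outer <- max.col(-abs(outer(v1, v2, `-`)), "first")

idx_sapply
# [1] 1 3 2
v2[idx_sapply]   # the closest values themselves
# [1] 1 8 4
```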
R: compare the next two values in a vector with each other (without looping if possible)
Based on the edited version of the question, it's now clear that you need some sort of a looping function, because your decisions on previous indices affect your decisions on subsequent indices. The most efficient way I can think to do this would be to populate a logical vector indicating whether each index should be kept in the vector. Afterward you can use the logical vector to get both the remaining values and the indices that were removed.
x <- c(10, 7, 7, 10, 7, 10, 7, 10, 10, 7, 10, 10, 7, 7, 10, 10, 7, 10, 7, 7, 10, 7, 10)
keep <- rep(TRUE, length(x))
even <- TRUE
for (pos in 2:length(x)) {
  if (even && x[pos] == x[pos-1]) {
    keep[pos-1] <- FALSE
  } else {
    even <- !even
  }
}
x[keep]
# [1] 10 7 7 10 7 10 7 10 10 7 10 7 7 10 10 7 10 7 7 10 7 10
which(!keep)
# [1] 11
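One thing to watch: 2:length(x) counts down (2, 1) when x has a single element, so the loop above assumes length(x) >= 2. A sketch of the same rule wrapped in a function that also handles short inputs (keep_mask is a hypothetical name, not from the original answer):

```r
# Hypothetical helper; applies the same rule as the loop above
keep_mask <- function(x) {
  keep <- rep(TRUE, length(x))
  even <- TRUE
  # seq_along(x)[-1] is empty for vectors shorter than 2,
  # so the loop body is simply skipped in that case
  for (pos in seq_along(x)[-1]) {
    if (even && x[pos] == x[pos - 1]) {
      keep[pos - 1] <- FALSE
    } else {
      even <- !even
    }
  }
  keep
}

x <- c(10, 7, 7, 10, 7, 10, 7, 10, 10, 7, 10, 10, 7, 7, 10, 10, 7, 10, 7, 7, 10, 7, 10)
which(!keep_mask(x))
# [1] 11
```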
As with any looping function, Rcpp can be used to get a speedup:
library(Rcpp)
cppFunction(
"LogicalVector getBin(NumericVector x) {
  const int n = x.size();
  LogicalVector keep(n, true);
  bool even = true;
  for (int pos=1; pos < n; ++pos) {
    if (even && x[pos] == x[pos-1]) {
      keep[pos-1] = false;
    } else {
      even = !even;
    }
  }
  return keep;
}")
Benchmarking of the pure-R and Rcpp approaches:
# Slightly larger dataset
set.seed(144)
x <- sample(1:10, 1000, replace=TRUE)
# Functions to compare
pureR <- function(x) {
  keep <- rep(TRUE, length(x))
  even <- TRUE
  for (pos in 2:length(x)) {
    if (even && x[pos] == x[pos-1]) {
      keep[pos-1] <- FALSE
    } else {
      even <- !even
    }
  }
  list(x[keep], which(!keep))
}
with.Rcpp <- function(x) {
  keep <- getBin(x)
  list(x[keep], which(!keep))
}
all.equal(pureR(x), with.Rcpp(x))
# [1] TRUE
library(microbenchmark)
microbenchmark(pureR(x), with.Rcpp(x))
# Unit: microseconds
# expr min lq mean median uq max neval
# pureR(x) 855.318 1066.177 1806.67855 1140.656 1442.869 35379.369 100
# with.Rcpp(x) 30.137 62.304 86.80656 78.132 94.771 348.598 100
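With a vector of length 1000 we see a speedup of more than 10x from using Rcpp (median of about 78 versus 1141 microseconds). Since the pure-R version already finishes in roughly a millisecond here, the speedup only matters in practice for much larger vectors.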
How to tell what is in one vector and not another?
You can use the setdiff() (set difference) function:
setdiff(x, y)
# [1] 1
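One detail worth knowing: setdiff() treats its arguments as sets, so duplicates are dropped. A sketch with made-up x and y (the question's vectors are not shown here) comparing it with %in%, which keeps every occurrence:

```r
x <- c(1, 1, 2, 3)   # made-up data with a repeated value
y <- c(2, 3, 4)

# Set difference: unique values of x that are not in y
in_x_only <- setdiff(x, y)
in_x_only
# [1] 1

# Logical filter: keeps duplicates and the original order
in_x_only_dups <- x[!(x %in% y)]
in_x_only_dups
# [1] 1 1
```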