R: Check If Value from Dataframe Is Within Range Other Dataframe

Here's a simple base method that uses the OP's logic:

f <- function(vec, id) {
  # rows of x whose [from, to] range contains vec and whose number equals id
  .x <- which(vec >= x$from & vec <= x$to & id == x$number)
  # use the first match, or NA when nothing matches
  if (length(.x)) .x[1] else NA
}
y$name <- x$name[mapply(f, y$location, y$id_number)]
y
#   location id_number     name
# 1      1.5        30 region 1
# 2      2.8        30 region 2
# 3     10.0        38     <NA>
# 4      3.5        40     <NA>
# 5      2.0        36 region 7

Check if column value from one dataframe is in between (range) of two other columns of second dataframe

I believe this gives you your desired result:


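library(dplyr)

df1 %>%
  # suffix the df2 columns so they do not clash with df1 after the join
  left_join(df2 %>% rename_at(vars(Start, End, sample_id), paste0, "_2")) %>%
  # keep df2's sample_id only when Start lies strictly inside (Start_2, End_2)
  mutate(sample_id_new = case_when(Start < End_2 & Start > Start_2 ~ sample_id_2)) %>%
  select(Chr, Start, End, Gene, sample_id, sample_id_new)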

Output:

  Chr Start End  Gene sample_id sample_id_new
1   1    15  15 gene1       ss6           ss1
2   1   120 130 gene2       ss7           ss1
3   2   210 210 gene3       ss9          <NA>
4   3   210 210 gene3       ss9           ss1
5   4   450 450 gene3      ss10          <NA>

Check if a value in a dataframe is conditionally between a range of values specified by two columns of another dataframe

Use a non-equi join with data.table. Convert the first data frame to a data.table (setDT) and create the filter column filled with logical FALSE values. Then do a non-equi join and assign (:=) the result of the condition, so that filter changes from FALSE to TRUE only for rows where a th_weight falls between low and high and abs(weight - th_weight) < 2 holds:

library(data.table)
setDT(df1)[, filter := FALSE]
df1[df2, filter := abs(weight - th_weight) < 2,
    on = .(low <= th_weight, high >= th_weight)]

Output:

> df1
     weight      low     high filter
      <num>    <num>    <num> <lgcl>
1: 94.99610 94.99608 94.99613   TRUE
2: 95.00561 95.00558 95.00566  FALSE

Data:

df1 <- structure(list(weight = c(94.9961, 95.00561),
                      low = c(94.99608, 95.00558),
                      high = c(94.99613, 95.00566)),
                 class = "data.frame", row.names = c(NA, -2L))

df2 <- structure(list(index = 1:5,
                      th_weight = c(94.996092, 95.496336, 95.509906,
                                    97.473292, 100.51906)),
                 class = "data.frame", row.names = c(NA, -5L))

Check if column value is in between (range) of two other column values

We can loop over each x$number with sapply, check whether it lies within any range defined by y$number1 and y$number2, and assign "YES" if it does and NA otherwise.

x$found <- ifelse(sapply(x$number, function(p)
  any(y$number1 <= p & y$number2 >= p)), "YES", NA)
x

#   id number found
# 1  1   5225   YES
# 2  2   2222  <NA>
# 3  3   3121   YES

Using the same logic, but with replace:

x$found <- replace(x$found,
  sapply(x$number, function(p) any(y$number1 <= p & y$number2 >= p)), "YES")

EDIT

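If we also want to compare the id value, we could do:

x$found <- ifelse(sapply(seq_along(x$number), function(i) {
  # ranges in y that contain the i-th number
  inds <- y$number1 <= x$number[i] & y$number2 >= x$number[i]
  # require a match and that the id of the first matching range equals x$id[i]
  any(inds) & (x$id[i] == y$id[which.max(inds)])
}), "YES", NA)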

x$found
#[1] "YES" NA "YES"

Check if value in a dataframe is between two values in another dataframe

For a loop-based solution, use:

for v in df['volumne']:
    df3 = df2[(df2['range_low'] < v) & (df2['range_high'] > v)]
    print(df3)

For a solution without a loop, a cross join is possible, but with large DataFrames it can run out of memory, because every value is paired with every range:

df = df.assign(a=1).merge(df2.assign(a=1), on='a', how='outer')
print(df)
   volumne  a  range_low  range_high  price
0       11  1         10          20      1
1       11  1         21          30      2
2       24  1         10          20      1
3       24  1         21          30      2
4       30  1         10          20      1
5       30  1         21          30      2

df3 = df[(df['range_low'] < df['volumne']) & (df['range_high'] > df['volumne'])]
print(df3)
   volumne  a  range_low  range_high  price
0       11  1         10          20      1
3       24  1         21          30      2
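
If the ranges do not overlap, the cross join can be avoided altogether. The following is only a minimal sketch in plain Python and is not part of the original answer: it assumes the ranges are non-overlapping and sorted by their lower bound, and it uses the standard-library bisect module to find the single candidate range for each value. The helper name find_range is hypothetical, and the sample lists are copied from the printed frames above; with pandas, the columns could be passed in via .tolist().

```python
from bisect import bisect_left


def find_range(value, lows, highs):
    """Return the index i with lows[i] < value < highs[i], or None.

    Assumption: the ranges are non-overlapping and sorted by lower bound.
    """
    # bisect_left counts the lower bounds strictly below value, so
    # i is the index of the last range that starts before value.
    i = bisect_left(lows, value) - 1
    if i >= 0 and value < highs[i]:
        return i
    return None


# Sample data copied from the example output above.
volumes = [11, 24, 30]
range_low = [10, 21]
range_high = [20, 30]
price = [1, 2]

# Map each volume to the price of its matching range (None if no range matches).
matches = {v: find_range(v, range_low, range_high) for v in volumes}
prices = {v: (price[i] if i is not None else None) for v, i in matches.items()}
print(prices)
```

Each lookup is a binary search, and no intermediate table of all value/range pairs is built, so memory use stays proportional to the input. Both bounds are treated as exclusive, matching the < and > comparisons in the answer.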

How to check if a column value is within a range of another two for each row in data table

This works idiomatically in data.table (the comparison already returns a logical vector, so ifelse is not needed):

dat[, inConf := true >= low & true <= up]

# Alternatively, with 0/1:
dat[, inConf := as.integer(true >= low & true <= up)]

R: find rows in data frame within range of each other across multiple columns

You can check all the conditions using outer to form a logical matrix (remembering to exclude the self-matching diagonal), and apply the result to subset the ID column, pasting the result together into strings:

df$ID.matches <- apply(outer(df$lat,   df$lat,   function(x, y) abs(x - y) < 1) &
                       outer(df$long,  df$long,  function(x, y) abs(x - y) < 1) &
                       outer(df$score, df$score, function(x, y) abs(x - y) < 0.7) &
                       diag(nrow(df)) == 0,   # exclude self-matches
                       MARGIN = 1,
                       function(x) paste(df$ID[x], collapse = ", "))
df
#>   ID  lat  long score ID.matches
#> 1  1 41.5 -62.3  22.4       3, 7
#> 2  2 41.0 -70.2  21.9
#> 3  3 42.2 -63.0  22.7          1
#> 4  4 36.7 -72.9  20.0
#> 5  5 36.2 -62.4  24.1          6
#> 6  6 35.8 -61.7  24.7          5
#> 7  7 40.8 -61.9  22.1          1

Created on 2020-07-07 by the reprex package (v0.3.0)


