Filtering a Data Frame on a Vector

You can use the %in% operator:

> df <- data.frame(id=c(LETTERS, LETTERS), x=1:52)
> L <- c("A","B","E")
> subset(df, id %in% L)
   id  x
1   A  1
2   B  2
5   E  5
27  A 27
28  B 28
31  E 31

If your IDs are unique, you can use match():

> df <- data.frame(id=c(LETTERS), x=1:26)
> df[match(L, df$id), ]
  id x
1  A 1
2  B 2
5  E 5
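
One difference worth keeping in mind: %in% keeps the data frame's own row order, while match() follows the order of the lookup vector, returns only the first position of each value, and yields an NA row for a value that is absent. A small sketch of this, where the vector `wanted` and its "ZZ" entry are made up for illustration:

```r
df <- data.frame(id = LETTERS, x = 1:26)
wanted <- c("E", "A", "ZZ")   # hypothetical lookup vector; "ZZ" is not in df

# %in% keeps the data frame's own row order and silently ignores "ZZ"
by_in <- df[df$id %in% wanted, ]

# match() follows the order of `wanted`; the unmatched "ZZ" becomes an NA row
idx <- match(wanted, df$id)
by_match <- df[idx, ]

# drop unmatched entries before indexing if NA rows are unwanted
by_match_clean <- df[idx[!is.na(idx)], ]
```

Which behaviour is preferable depends on whether the result should mirror the data frame or the lookup vector.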

Or make them the row names of your data frame and extract by row name:

> rownames(df) <- df$id
> df[L, ]
  id x
A  A 1
B  B 2
E  E 5

Finally, for more advanced users, and if speed is a concern, I'd recommend looking into the data.table package.
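
As a rough idea of what that could look like, here is a minimal sketch that assumes the data.table package is installed. It shows two options: the same %in% filter, and a keyed lookup, which is usually where the speed benefit is said to come from on large tables.

```r
library(data.table)  # assumes the data.table package is installed

dt <- data.table(id = c(LETTERS, LETTERS), x = 1:52)
L <- c("A", "B", "E")

# same idea as subset(df, id %in% L): rows stay in their original order
res_in <- dt[id %in% L]

# keyed lookup: setkey() sorts dt by id, then dt[.(L)] joins on that key
setkey(dt, id)
res_key <- dt[.(L)]
```

Note that the keyed version returns the rows grouped by the order of L rather than in the table's original order.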

Filter data frame matching all values of a vector

Here's another dplyr solution without ever leaving the pipe:

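library(dplyr)

ID <- c('A','A','A','A','A','B','B','B','B','C','C')
Hour <- c('0','2','5','6','9','0','2','5','6','0','2')

x <- data.frame(ID, Hour)

testVector <- c('0','2','5')

x %>%
  group_by(ID) %>%
  mutate(contains = Hour %in% testVector) %>%   # flag hours found in testVector
  summarise(all = sum(contains)) %>%            # count the flagged hours per ID
  filter(all == length(testVector)) %>%         # keep IDs that matched every test value
  select(-all) %>%
  inner_join(x, by = "ID")                      # bring back the full rows for those IDs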

##       ID   Hour
##   <fctr> <fctr>
## 1      A      0
## 2      A      2
## 3      A      5
## 4      A      6
## 5      A      9
## 6      B      0
## 7      B      2
## 8      B      5
## 9      B      6
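
Counting matches only works if each Hour value appears at most once per ID; repeated hours could inflate the count. A shorter variant that tests the condition directly inside a grouped filter() might look like this (same data as above, rebuilt so the snippet runs on its own):

```r
library(dplyr)

x <- data.frame(
  ID   = c('A','A','A','A','A','B','B','B','B','C','C'),
  Hour = c('0','2','5','6','9','0','2','5','6','0','2')
)
testVector <- c('0','2','5')

# keep a group only if every value of testVector occurs among its Hour values
res <- x %>%
  group_by(ID) %>%
  filter(all(testVector %in% Hour)) %>%
  ungroup()
```

This keeps all rows of A and B and drops C, without the summarise/join round trip.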

Filtering a dataframe by list of character vectors

Try this (here ls is the list of character vectors and df is the data frame with a type column):

#Code
L <- lapply(ls,function(x) data.frame(type=x[x %in% df$type]))
names(L) <- paste0('new_df_',c('fruit','vegetable'))

Output:

L
$new_df_fruit
    type
1  Apple
2 Cherry

$new_df_vegetable
       type
1 Courgette

How to filter a dataframe with a character vector

There are multiple issues. First, the value in the second condition needs its own quotes nested inside the string:

conditions <- c("Sepal.Width < 3.2", "Species == 'setosa'")

Then, you need to specify how the two conditions are combined. Here, I assumed an &. Note that joining the elements of a single vector takes collapse, not sep: with sep the two strings stay separate, eval() returns only the value of the last expression, and just the Species condition gets applied. Then you can use eval(parse(...)):

iris %>%
 filter(eval(parse(text = paste(conditions, collapse = " & "))))

The first four rows of the result are:

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          4.9         3.0          1.4         0.2  setosa
2          4.6         3.1          1.5         0.2  setosa
3          4.4         2.9          1.4         0.2  setosa
4          4.9         3.1          1.5         0.1  setosa

On the other hand, I think it is always important to quote @Martin Mächler to warn about the potential problems associated with this approach:

The (possibly) only connection is via parse(text = ....) and all good
R programmers should know that this is rarely an efficient or safe
means to construct expressions (or calls). Rather learn more about
substitute(), quote(), and possibly the power of using
do.call(substitute, ......).
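
In that spirit, one possible way to avoid parse() altogether is to store the conditions as unevaluated calls instead of strings and combine them programmatically. This is only a sketch in base R, not the approach from the question; the names `conditions` and `combined` are illustrative:

```r
# conditions stored as unevaluated calls rather than strings
conditions <- list(
  quote(Sepal.Width < 3.2),
  quote(Species == "setosa")
)

# fold the list into a single call: Sepal.Width < 3.2 & Species == "setosa"
combined <- Reduce(function(a, b) call("&", a, b), conditions)

# evaluate the call with the data frame's columns in scope
res <- iris[eval(combined, iris), ]
```

With dplyr, splicing the same list into filter() via !!! should give an equivalent result, since filter() combines multiple conditions with &.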

Filter data in loop over vector and bind data frames

I think you've got a typo/error in your filter; do you get the correct output when you change "block" to "value" in your grepl? E.g.

library(tidyverse)
area <- data.frame(
  land = c("68N03E220090", "68N03E244635", "68N03E244352", "68N03E223241"),
  type = c("home", "mobile", "home", "vacant"),
  object_id = c(NA, 7, NA, 34)
)

block <- c("68N03E22", "68N03E24")

datalist = list()

for (value in block){
  df <- area %>% filter(is.na(object_id) & grepl(paste0("^", value),land))
  df$value <- value
  datalist[[value]] <- df # add it to your list
}

df_filtered <- dplyr::bind_rows(datalist)

df_filtered
#>           land type object_id    value
#> 1 68N03E220090 home        NA 68N03E22
#> 2 68N03E244352 home        NA 68N03E24

For this example, you could also avoid the for-loop by using:

df_filtered_2 <- area %>%
  filter(is.na(object_id) & grepl(pattern = paste0("^(", paste0(block, collapse = "|"), ")"), x = land)) %>%
  mutate(value = str_sub(land, 1, 8))

identical(df_filtered, df_filtered_2)
#> [1] TRUE
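
If all the prefixes have the same length, as they do in this example, the regular expression can probably be dropped in favour of a plain %in% test on the extracted prefix. A base R sketch under that assumption:

```r
area <- data.frame(
  land = c("68N03E220090", "68N03E244635", "68N03E244352", "68N03E223241"),
  type = c("home", "mobile", "home", "vacant"),
  object_id = c(NA, 7, NA, 34)
)
block <- c("68N03E22", "68N03E24")

# assumption: every prefix in `block` has the same length (8 characters here)
prefix <- substr(area$land, 1, nchar(block[1]))
keep <- is.na(area$object_id) & prefix %in% block

# keep the matching rows and record which prefix each one matched
res <- cbind(area[keep, ], value = prefix[keep])
```

This sidesteps regex escaping issues if a prefix ever contains special characters, but it would not work for prefixes of mixed lengths.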

How to filter a dataframe using a preset vector in R

Use %in%:

df %>% 
  filter(code %in% x)
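
A self-contained illustration, with made-up data standing in for the question's df and x, which also shows how to keep the complement:

```r
library(dplyr)

# hypothetical data; `code` and `x` stand in for the question's names
df <- data.frame(code = c("a1", "b2", "c3", "d4"), n = 1:4)
x <- c("b2", "d4", "zz")

kept    <- df %>% filter(code %in% x)    # rows whose code appears in x
dropped <- df %>% filter(!code %in% x)   # the complement: codes not in x
```

Values in x that never occur in the data frame (here "zz") are simply ignored.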

Filtering vector by values with filter()

I assume your data is stored as character strings, so convert the column to numeric first. You can then apply both conditions in a single filter() call, combined with &. You can use the following code:

dat <- data.frame(Duration..in.seconds. = c("114",  "188",  "453",  "114" , "188" , "453" , "114" , "188" , "453" , "188" , "453",  "2000" ,"2000" ,"1900" ))

library(dplyr)

dat = dat %>%
  mutate(Duration..in.seconds. = as.numeric(Duration..in.seconds.)) %>%
  filter(Duration..in.seconds. > 180 & Duration..in.seconds.  < 1800)

Output:

 Duration..in.seconds.
1                   188
2                   453
3                   188
4                   453
5                   188
6                   453
7                   188
8                   453
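
One thing to watch for when converting: strings that are not valid numbers become NA. dplyr's filter() drops rows where the condition is NA, but plain base R indexing does not. A small base R sketch, where the "n/a" entry is a made-up bad value:

```r
raw <- c("114", "188", "453", "2000", "n/a")   # "n/a" is a made-up bad entry

# as.numeric() turns unparseable strings into NA (and warns about it)
secs <- suppressWarnings(as.numeric(raw))

# which() drops NA results; plain logical indexing would keep an NA element
kept <- secs[which(secs > 180 & secs < 1800)]
```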

filter columns of a dataframe based on a vector

Here's one way -

filter(df, apply(df, 1, function(a) all(a > x)))

  X1 X2 X3 X4 X5 X6 X7 X8 X9 X10
1  8 10  7  9  8  6 10  8  8   9
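
For larger data, a vectorised variant of the same row test may be worth considering. The sketch below uses small made-up data (three columns, one threshold per column) rather than the question's data frame, and assumes all columns are numeric:

```r
# hypothetical data: three numeric columns and one threshold per column
df <- data.frame(X1 = c(8, 2, 9), X2 = c(10, 7, 1), X3 = c(7, 9, 8))
x  <- c(5, 5, 5)

# compare each column j with x[j], giving a logical matrix
above <- sweep(as.matrix(df), 2, x, FUN = ">")

# keep rows where every comparison holds
res <- df[rowSums(above) == ncol(df), ]
```

Both versions convert the data frame to a matrix internally, so they are only meaningful when every column is numeric.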

filter values in a list of dataframes based on a vector, and add rows for vector values not contained in dataframes

1. Create a data.frame with all the rows you want:
   data.frame(province=vector)
2. Merge this with the data frame you do have, setting all.x=TRUE (so every row from step 1 is retained, and filled with NA where necessary):
   merge(data.frame(province=vector), df1, all.x=TRUE)
3. Done!

> merge(data.frame(province=vector), df1, all.x=TRUE)
  province value value2
1    prov1    23     25
2    prov2    NA     NA
3    prov3    56     57
4    prov4    NA     NA
5    prov5    93     83
6    prov6    NA     NA

Bonus 1: you can trivially loop this with lapply
lapply(list_df, function(df) merge(data.frame(province=vector), df, all.x=TRUE))
(if you have a lot of data frames you want to apply this to, you will probably want to avoid re-building the vector data frame anonymously each time but create it as a named data frame instead)
Bonus 2: all base-r with no dependencies whatsoever
Bonus 3: you did say it doesn't matter, but the rows come out sorted by province (merge() sorts on the key by default), which here matches the order in vector
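
Since merge() sorts on the key column by default, the result follows the order of vector only when vector is itself sorted. If an arbitrary order has to be preserved, match() is one option. In this sketch df1 is reconstructed from the output shown above, and `province_order` is a hypothetical, deliberately unsorted lookup vector:

```r
# df1 reconstructed from the output above; province_order is hypothetical
df1 <- data.frame(province = c("prov1", "prov3", "prov5"),
                  value = c(23, 56, 93), value2 = c(25, 57, 83))
province_order <- c("prov5", "prov2", "prov1")

# merge() sorts on the key, so this comes back as prov1, prov2, prov5
merged <- merge(data.frame(province = province_order), df1, all.x = TRUE)

# match() keeps the order of province_order; missing provinces become NA rows
ordered <- df1[match(province_order, df1$province), ]
ordered$province <- province_order   # fill in the names of the missing provinces
rownames(ordered) <- NULL
```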
