## Filter data.frame rows by a logical condition

To select rows according to *one* 'cell_type' (e.g. 'hesc'), use `==`:

`expr[expr$cell_type == "hesc", ]`

To select rows according to two or more different 'cell_type' values (e.g. either 'hesc' *or* 'bj fibroblast'), use `%in%`:

`expr[expr$cell_type %in% c("hesc", "bj fibroblast"), ]`
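As a small self-contained illustration of the two operators, here is a sketch using a made-up `expr` data frame (the `value` column and all the data are assumptions, not taken from the question). It also shows why `==` should not be used with a vector of several values: R recycles the shorter vector element by element instead of testing membership.

```r
# Hypothetical data; only the column name cell_type comes from the text above
expr <- data.frame(
  cell_type = c("hesc", "bj fibroblast", "hesc", "k562", "bj fibroblast", "hesc"),
  value     = c(1.2, 3.4, 0.7, 2.2, 5.1, 0.9)
)

# One value: == compares every element with the single string
one <- expr[expr$cell_type == "hesc", ]

# Several values: %in% tests membership in the set
two <- expr[expr$cell_type %in% c("hesc", "bj fibroblast"), ]

# Pitfall: == with a length-2 vector is recycled position by position,
# so it silently keeps only the rows that happen to line up
wrong <- expr[expr$cell_type == c("bj fibroblast", "hesc"), ]

nrow(one)
nrow(two)
nrow(wrong)
```

The recycled comparison returns fewer rows than the `%in%` version even though it looks similar, which is the usual reason to prefer `%in%` for multi-value filters.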

## R: filter rows based on a condition in one column

You can first establish the indices of the first row of each pair using `which`:

    library(dplyr)

    # rows where c == 3, the next row has c == 1, and d grows by less than 10
    inds <- which(df$c == 3 & lead(df$c) == 1 & lead(df$d) - df$d < 10)

and then subset your dataframe on the indices *plus* 1:

`df[sort(unique(c(inds, inds + 1))),]`

         d   b c a
    2 3403 100 3 1
    3 3407 100 1 1
    8 3436 100 3 1
    9 3445 100 1 1

Alternatively, you can do:

    library(dplyr)

    df1 <- df %>%   # get the first row of each pair
      filter(c == 3 & lead(c) == 1 & lead(d) - d < 10)

    df2 <- df %>%   # get the second row of each pair
      filter(lag(c) == 3 & c == 1 & d - lag(d) < 10)

    arrange(rbind(df1, df2), d)   # bind the two together and arrange by d
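The index-based idea can also be tried without dplyr. The following is only a sketch on made-up data (the values in `df` are assumptions; the question's data is not reproduced here), with the "next row" vectors built by hand in place of `lead()`.

```r
# Hypothetical data with the column names used above
df <- data.frame(
  d = c(3400, 3403, 3407, 3420, 3436, 3445, 3460),
  c = c(1, 3, 1, 3, 3, 1, 3)
)

# Base R stand-in for lead(): shift up by one and pad with NA
next_c <- c(df$c[-1], NA)
next_d <- c(df$d[-1], NA)

# First row of each pair; which() drops the NA produced for the last row
inds <- which(df$c == 3 & next_c == 1 & next_d - df$d < 10)

# Keep each first row plus the row that follows it
pairs <- df[sort(unique(c(inds, inds + 1))), ]
pairs
```

Because `which()` ignores `NA`, the last row can never be selected as the start of a pair, so `inds + 1` always points at an existing row.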

## Subset and filter a dataframe by logical operators and select the foregoing rows

As noted in a comment, it does not make sense to filter rows that do not exist (there are none before row #1). Therefore, here is a solution for filtering with slightly different parameters. Say you want to filter target rows where `A == 11 & B == 90` (this value combination also occurs 3 times in your data), and you want to get the five rows preceding the target rows. You can first define a function to get the indices of the rows in question:

    Sequ <- function(col1, col2) {
      # get row indices of the target rows with `which`
      inds <- which(col1 == 11 & col2 == 90)
      # sorted indices of the five rows before each target row AND the target row itself
      sort(unique(c(inds - 5, inds - 4, inds - 3, inds - 2, inds - 1, inds)))
    }

Next you can use this function as input for `slice`:

    library(dplyr)

    Sample_Data %>%
      slice(Sequ(col1 = A, col2 = B))

        A  B
    1   6 95
    2   7 94
    3   8 93
    4   9 92
    5  10 91
    6  11 90
    7   6 95
    8   7 94
    9   8 93
    10  9 92
    11 10 91
    12 11 90
    13  6 95
    14  7 94
    15  8 93
    16  9 92
    17 10 91
    18 11 90
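If the number of preceding rows should be adjustable, the same idea can be written more generally. This is a sketch only: `rows_before` and its arguments `hit` and `n` are hypothetical names, and `Sample_Data` is rebuilt here from the printed output rather than from the original data. Unlike `Sequ`, it discards indices below 1, which would otherwise appear when a target row sits near the top.

```r
# hit: logical vector marking target rows; n: how many preceding rows to keep
rows_before <- function(hit, n) {
  inds <- which(hit)
  # window of n rows before each target row, plus the target row itself
  rows <- unlist(lapply(inds, function(i) (i - n):i))
  # drop indices that fall before the first row, then de-duplicate
  sort(unique(rows[rows >= 1]))
}

# Stand-in data matching the printed result above (an assumption)
Sample_Data <- data.frame(A = rep(6:11, times = 3), B = rep(95:90, times = 3))

idx <- rows_before(Sample_Data$A == 11 & Sample_Data$B == 90, n = 2)
Sample_Data[idx, ]
```

The resulting `idx` can be passed to `slice()` in the same way as the output of `Sequ`.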

## Subset / filter rows in a data frame based on a condition in a column

Here are the two main approaches. I prefer this one for its readability:

`bar <- subset(foo, location == "there")`

Note that you can string together many conditionals with `&` and `|` to create complex subsets.

The second is the indexing approach. You can index rows in R with either numeric or logical vectors. `foo$location == "there"` returns a vector of `TRUE` and `FALSE` values with one element per row of `foo`. You can use it as the row index to return only the rows where the condition is true.

`foo[foo$location == "there", ]`
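One practical difference between the two approaches appears when the column contains missing values. The sketch below uses a made-up `foo` (the `id` column and the data are assumptions): `subset()` drops rows where the condition is `NA`, whereas logical indexing returns an all-`NA` row for each of them unless the condition is wrapped in `which()`.

```r
# Hypothetical data with one missing location
foo <- data.frame(
  id       = 1:4,
  location = c("there", "here", NA, "there"),
  stringsAsFactors = FALSE
)

by_subset <- subset(foo, location == "there")        # NA condition is treated as FALSE
by_index  <- foo[foo$location == "there", ]          # NA condition yields a row of NAs
by_which  <- foo[which(foo$location == "there"), ]   # which() discards the NA

nrow(by_subset)
nrow(by_index)
nrow(by_which)
```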

## How to filter via a logical expression that filters via a variable

As an initial matter, it looks like you have a vector instead of a data frame (only one column). If you really do have a data frame and only ran str() on one column, the very similar technique at the end will work for you.

The first thing to know is that your dates are stored as character strings, while your yesterday object is in the Date format. Comparing objects of different types directly is unreliable in R, so you should convert at least one of the two objects.

I suggest converting both to the POSIXct format so that you do not lose any information in your dates column but can still compare it to yesterday. Make sure to set the timezone to the same as your system time (mine is "America/New_York").

    Dates <- c("2021-09-09T06:04:35.689Z", "2021-09-09T06:04:35.690Z",
               "2021-09-09T06:04:35.260Z", "2021-09-24T06:04:35.260Z")
    Dates <- gsub("T", " ", Dates)   # replace the date/time separator with a space
    Dates <- gsub("Z", "", Dates)    # drop the trailing "Z"
    Dates <- as.POSIXct(Dates, format = "%Y-%m-%d %H:%M:%OS", tz = "America/New_York")
    yesterday <- Sys.time() - 86400  # 86400 is the number of seconds in one day

Now you can tell R to ignore the time and only compare the dates.

`trunc(Dates, units = "days") == trunc(yesterday, units = "days")`

The other part of your question was about filtering. The easiest way to filter is subsetting. You first ask R for the indices of the matching values in your vector (or column) by wrapping your comparison in the `which()` function.

`Indices <- which(trunc(Dates, units = "days") == trunc(yesterday, units = "days"))`

None of the dates in your str() results match yesterday, so I added one at the end that matches. Calling `which()` returns a 4 to tell you that the fourth item in your vector matches yesterday's date. If more dates matched, it would return more values. I saved the results in `Indices`.

We can then use the indices from `which()` to subset your vector or data frame.

    Filtered_Dates <- Dates[Indices]
    Filtered_Dataframe <- df[Indices, ]  # note the comma, which indicates that we are filtering rows instead of columns
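For a reproducible variant, the sketch below swaps in a slightly different technique: it parses the timestamps directly, treating the trailing `Z` as a literal and the times as UTC, and compares calendar dates with `as.Date()`. The names `stamps`, `parsed`, `reference_day`, and `hits` are hypothetical, and `reference_day` is a fixed stand-in for "yesterday" so that the result does not depend on when the code is run.

```r
stamps <- c("2021-09-09T06:04:35.689Z", "2021-09-09T06:04:35.690Z",
            "2021-09-09T06:04:35.260Z", "2021-09-24T06:04:35.260Z")

# Parse the "T" and "Z" as literal characters; %OS reads fractional seconds
parsed <- as.POSIXct(stamps, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Fixed stand-in for yesterday; in real use this could be Sys.Date() - 1
reference_day <- as.Date("2021-09-24")

# Compare calendar dates only, in the same time zone used for parsing
hits <- which(as.Date(parsed, tz = "UTC") == reference_day)

stamps[hits]
```

Passing the same `tz` to both `as.POSIXct()` and `as.Date()` keeps timestamps close to midnight from being assigned to the wrong day.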

## Is it possible to subset a data.frame based on a row range AND a logical condition in R?

You can do the subsetting in either of two ways.

- Based on a logical vector:

      mtcars[seq(nrow(mtcars)) %in% 1:5 & mtcars$cyl == 6, ]
      #                 mpg cyl disp  hp drat    wt  qsec vs am gear carb
      # Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
      # Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
      # Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1

- Based on a row range:

      mtcars[intersect(1:5, which(mtcars$cyl == 6)), ]
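A third option, shown here only as a sketch, is to take the row range first and then filter the resulting slice. The variable names are hypothetical; `mtcars` is the built-in data set used above, so the three results can be compared directly.

```r
# The two approaches from the answer
by_logical <- mtcars[seq(nrow(mtcars)) %in% 1:5 & mtcars$cyl == 6, ]
by_index   <- mtcars[intersect(1:5, which(mtcars$cyl == 6)), ]

# Alternative: slice rows 1 to 5 first, then apply the condition to the slice
by_slice <- subset(mtcars[1:5, ], cyl == 6)

rownames(by_slice)
isTRUE(all.equal(by_logical, by_index))
isTRUE(all.equal(by_logical, by_slice))
```

Slicing first keeps the condition short, at the cost of creating an intermediate data frame.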

## Filtering rows of a dataframe by column values

Combining `==` with `&` is not going to work anyway, because a single cell never holds two different 'Species' at once. With that code, it would have to be `|` instead of `&`. But this can be done more easily with `%in%` on a vector of values, e.g.

`subset(df1, Species %in% c("Mallard", "Wood-pigeon"))`

The vector `c("Mallard", "Wood-pigeon")` can be extended to any number of Species.
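The difference between the three conditions can be checked on a toy example. The `df1` below is made up for illustration (the `Count` column and all values are assumptions); only the column name `Species` comes from the answer.

```r
# Hypothetical data
df1 <- data.frame(
  Species = c("Mallard", "Wood-pigeon", "Robin", "Mallard", "Wren"),
  Count   = c(3, 5, 2, 7, 1)
)

# & asks for rows that are both species at once, which no row can satisfy
with_and <- subset(df1, Species == "Mallard" & Species == "Wood-pigeon")

# | asks for rows that are either species
with_or <- subset(df1, Species == "Mallard" | Species == "Wood-pigeon")

# %in% expresses the same "either" test against a vector of values
with_in <- subset(df1, Species %in% c("Mallard", "Wood-pigeon"))

nrow(with_and)
nrow(with_or)
identical(with_or, with_in)
```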

## Consecutively filter rows satisfyingly if condition in R dataframe

Create the sequence in the order in which you want to check the values: first 1 to 20, then 0 and -1. `arrange` the data so that it is ordered by that sequence, then keep the rows whose `dt_diff` equals the first `dt_diff` in the data frame.

    library(dplyr)

    pbl_dt_seq <- c(1:20, 0, -1)

    dt_df %>%
      arrange(match(dt_diff, pbl_dt_seq)) %>%
      filter(dt_diff == first(dt_diff))

    #  date       ref_date   dt_diff
    #  <date>     <date>       <dbl>
    #1 2003-07-24 2003-07-26       2
    #2 2003-07-24 2003-07-26       2
    #3 2003-07-24 2003-07-26       2
    #4 2003-07-24 2003-07-26       2
    #5 2003-07-24 2003-07-26       2
    #6 2003-07-24 2003-07-26       2
    #7 2003-07-24 2003-07-26       2
    #8 2003-07-24 2003-07-26       2
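To see how the priority sequence drives the result, here is a base R sketch of the same logic on made-up data (the `id` column and the `dt_diff` values are assumptions, and `priority`, `rank_in_seq`, and `best` are hypothetical names). `match()` turns each `dt_diff` into its position in the priority sequence, and the rows with the smallest position win.

```r
# Hypothetical data; 30 is deliberately outside the priority sequence
dt_df <- data.frame(id = 1:6, dt_diff = c(-1, 5, 0, 2, 2, 30))

# Order in which values are preferred: 1 to 20 first, then 0, then -1
priority <- c(1:20, 0, -1)

# Position of each dt_diff in the priority sequence (NA if not present)
rank_in_seq <- match(dt_df$dt_diff, priority)

# Keep the rows that share the best (smallest) position
best <- dt_df[which(rank_in_seq == min(rank_in_seq, na.rm = TRUE)), ]
best
```

This sketch assumes that at least one `dt_diff` appears in the priority sequence; if none does, `min()` has nothing to work with and emits a warning.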

## pandas: filter rows of DataFrame with operator chaining

I'm not entirely sure what you want, and your last line of code does not help either, but anyway:

"Chained" filtering is done by "chaining" the criteria in the boolean index.

    In [96]: df
    Out[96]:
       A  B  C  D
    a  1  4  9  1
    b  4  5  0  2
    c  5  5  1  0
    d  1  3  9  6

    In [99]: df[(df.A == 1) & (df.D == 6)]
    Out[99]:
       A  B  C  D
    d  1  3  9  6

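If you want to chain methods, you can add your own mask method and use that one. Note that current pandas versions already define `DataFrame.mask`, so the assignment below replaces the built-in method; pick a different name if you need both.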

    # assumes `import pandas` and `import numpy as np` were run earlier in the session
    In [90]: def mask(df, key, value):
       ....:     return df[df[key] == value]
       ....:

    In [92]: pandas.DataFrame.mask = mask

    In [93]: df = pandas.DataFrame(np.random.randint(0, 10, (4,4)), index=list('abcd'), columns=list('ABCD'))

    # .loc replaces the .ix indexer, which was removed in pandas 1.0
    In [95]: df.loc['d','A'] = df.loc['a', 'A']

    In [96]: df
    Out[96]:
       A  B  C  D
    a  1  4  9  1
    b  4  5  0  2
    c  5  5  1  0
    d  1  3  9  6

    In [97]: df.mask('A', 1)
    Out[97]:
       A  B  C  D
    a  1  4  9  1
    d  1  3  9  6

    In [98]: df.mask('A', 1).mask('D', 6)
    Out[98]:
       A  B  C  D
    d  1  3  9  6
