Subset/Filter Rows in a Data Frame Based on a Condition in a Column

There are two main approaches. I prefer subset() for its readability:

bar <- subset(foo, location == "there")

Note that you can combine several conditions with & (and) and | (or) to build more complex subsets.
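As a rough illustration of combining conditions, here is a small sketch. The data frame contents and the visits column are made up for the example; only the name location comes from the answer above.

```r
# Hypothetical data; only the column name `location` comes from the text above.
foo <- data.frame(
  location = c("here", "there", "there", "elsewhere"),
  visits   = c(3, 10, 1, 7)
)

# AND: keep rows where both conditions hold
both <- subset(foo, location == "there" & visits > 5)

# OR: keep rows where at least one condition holds
either <- subset(foo, location == "there" | visits > 5)
```

Use the vectorised operators & and | here; && and || are meant for single TRUE/FALSE values, not for whole columns.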

The second is the indexing approach. You can index rows in R with either numeric or logical vectors. foo$location == "there" returns a logical vector of TRUE and FALSE values with one element per row of foo. Putting that vector in the row position returns only the rows where the condition is TRUE:

foo[foo$location == "there", ]
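One practical difference between the two approaches is how they treat missing values. The sketch below uses made-up data (the value column and the NA entry are assumptions for illustration) to show that subset() drops rows where the condition is NA, while logical indexing returns an all-NA row for each of them; wrapping the condition in which() avoids that.

```r
# Hypothetical data with one missing location
foo <- data.frame(location = c("there", NA, "here"), value = 1:3)

# subset() treats an NA condition as FALSE
by_subset <- subset(foo, location == "there")

# Logical indexing keeps a row of NAs for every NA in the condition
by_index <- foo[foo$location == "there", ]

# which() converts the condition to row numbers and skips NAs
by_which <- foo[which(foo$location == "there"), ]

nrow(by_subset)  # 1
nrow(by_index)   # 2
nrow(by_which)   # 1
```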

How to filter/subset a data table based on condition on other columns in R?

Just group by Date and ID, count the observations in each group, and keep the groups with more than one:

Data[, n:=.N, by = .(Date, ID)][n>1]
# Date ID Value n
# 1: 2020-01-04 1 189 3
# 2: 2020-01-04 1 654 3
# 3: 2020-01-04 1 333 3
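Note that := adds the column n to Data by reference. If you would rather leave Data untouched, one possible variant is sketched below; the input values are invented for the example (only the column names Date, ID and Value come from the output above).

```r
library(data.table)

# Hypothetical data: one (Date, ID) group with three rows, one with a single row
Data <- data.table(
  Date  = as.Date(c("2020-01-04", "2020-01-04", "2020-01-04", "2020-01-05")),
  ID    = c(1, 1, 1, 2),
  Value = c(189, 654, 333, 100)
)

# For each group, return its rows (.SD) only when the group size .N exceeds 1.
# Groups for which the expression yields NULL are dropped from the result.
dups <- Data[, if (.N > 1) .SD, by = .(Date, ID)]
```

Here dups holds only the rows of the repeated group, and no helper column is added to Data.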

subsetting a data frame based on a condition of one column

We can use %in%:

a1 <- a[a$x %in% x,]

To keep only the column 'x' (as a one-column data frame):

a1 <- a[a$x %in% x, "x", drop=FALSE]

If we need the matching values of column 'x' as a plain vector, filtered by the vector x:

v1 <- a$x[a$x %in% x]
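To make the three variants concrete, here is a minimal sketch on invented inputs. The contents of a and x and the column y are assumptions; the question's actual data is not shown here.

```r
# Hypothetical inputs
a <- data.frame(x = c(1, 2, 3, 4, 5), y = letters[1:5])
x <- c(2, 4, 9)

# All columns, rows whose x value appears in the vector x
a1 <- a[a$x %in% x, ]

# Only column 'x'; drop = FALSE keeps the result a data frame
a2 <- a[a$x %in% x, "x", drop = FALSE]

# The matching values as a plain vector
v1 <- a$x[a$x %in% x]
```

Values in x that never occur in a$x (9 in this example) are simply ignored.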

Filter rows dataframe based on condition in different dataframe using dplyr

What about something like this, with dplyr:

library(dplyr)

df1 %>%
  left_join(df2) %>%                      # join to get a single data set
  filter(time <= end, time >= start) %>%  # keep rows inside the interval; use < and > for strict bounds
  select(-c(2, 3))                        # drop columns 2 and 3 if they are no longer needed

# A tibble: 4 x 3
#   id     time keep
#   <chr> <dbl> <lgl>
# 1 A         3 TRUE
# 2 A         5 TRUE
# 3 B         3 TRUE
# 4 B         4 TRUE
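A self-contained sketch of the same join-then-filter idea follows. The two input data frames are invented (the question's real data is not reproduced here), the join key is given explicitly with by = "id", and columns are selected by name rather than by position, which is a choice of this sketch rather than part of the answer.

```r
library(dplyr)

# Hypothetical inputs: one time window per id, several observations per id
df1 <- data.frame(id = c("A", "B"), start = c(2, 3), end = c(5, 4))
df2 <- data.frame(id   = c("A", "A", "A", "B", "B", "B"),
                  time = c(1, 3, 5, 2, 3, 4))

res <- df1 %>%
  left_join(df2, by = "id") %>%            # attach each observation to its window
  filter(time >= start, time <= end) %>%   # keep observations inside the window
  select(id, time)                         # drop the window columns by name
```

Selecting by name tends to be more robust than select(-c(2, 3)) if the column order of the inputs changes.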

filter dataframe based on condition on another column in the dataframe in R

This is pretty easy to do with dplyr. Group by ID and keep every group in which at least one Score is "A":

library(dplyr)

dat %>%
  group_by(ID) %>%
  filter("A" %in% Score)
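Here is how that behaves on a small invented data set (the values of dat are assumptions). Because the condition is a single TRUE or FALSE per group, the whole group is either kept or dropped.

```r
library(dplyr)

# Hypothetical data: IDs 1 and 3 have an "A" score, ID 2 does not
dat <- data.frame(ID    = c(1, 1, 2, 2, 3),
                  Score = c("A", "B", "B", "C", "A"))

kept <- dat %>%
  group_by(ID) %>%
  filter("A" %in% Score) %>%  # one TRUE/FALSE per group, recycled to all its rows
  ungroup()
```

All rows of IDs 1 and 3 are returned, including the "B" row of ID 1.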

How to subset dataframe based on conditions between columns across rows depending on values

Group by 'period' and 'work_place' and create a column 'n' holding the number of distinct 'id's in each group. Then group by 'id' and keep only those 'id's for which every value of 'n' is 1:

library(dplyr)
mydf %>%
  group_by(period, work_place) %>%
  mutate(n = n_distinct(id)) %>%
  group_by(id) %>%
  filter(all(n == 1)) %>%
  ungroup() %>%
  select(-n)

Output

# A tibble: 3 x 3
# id period work_place
# <chr> <dbl> <chr>
#1 A 1 x
#2 A 1 y
#3 D 2 k
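For anyone who wants to run the pipeline, here is a sketch with an invented input. This mydf is an assumption chosen so that two ids share a work place in the same period; it is not the data from the question.

```r
library(dplyr)

# Hypothetical input: B and C share work_place "z" in period 1
mydf <- data.frame(
  id         = c("A", "A", "B", "C", "D"),
  period     = c(1, 1, 1, 1, 2),
  work_place = c("x", "y", "z", "z", "k")
)

res <- mydf %>%
  group_by(period, work_place) %>%
  mutate(n = n_distinct(id)) %>%  # how many ids share this period and work place
  group_by(id) %>%
  filter(all(n == 1)) %>%         # keep ids that never share a work place
  ungroup() %>%
  select(-n)
```

With this input, B and C are removed because they share a work place, and A and D remain.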

Select rows from a dataframe based on a condition and then assign a priority number to them in a new column

Is this what you are looking for?

library(dplyr)

df <- data.frame(no_of_cases = c(12, 22, 34), grid_number = c(454, 345, 67))

df %>% arrange(desc(no_of_cases)) %>% mutate(priority = rank(-no_of_cases))
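One thing to watch with rank() is ties: by default tied values share their average rank, which gives fractional priorities. The sketch below uses an invented vector with a tie (not the data above) to compare the default with two other ties.method settings of base R's rank().

```r
# Hypothetical case counts with a tie
cases <- c(12, 22, 22, 34)

avg   <- rank(-cases)                         # default: ties share the average rank
mn    <- rank(-cases, ties.method = "min")    # ties share the smallest rank
first <- rank(-cases, ties.method = "first")  # ties broken by order of appearance

avg    # 4.0 2.5 2.5 1.0
mn     # 4 2 2 1
first  # 4 2 3 1
```

Which method is appropriate depends on whether tied rows should get the same priority or distinct ones.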

Filter data.frame rows by a logical condition

To select rows according to one 'cell_type' (e.g. 'hesc'), use ==:

expr[expr$cell_type == "hesc", ]

To select rows matching any of two or more 'cell_type' values (e.g. either 'hesc' or 'bj fibroblast'), use %in%:

expr[expr$cell_type %in% c("hesc", "bj fibroblast"), ]
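The reason for using %in% rather than == with several values is that == compares element by element and recycles the shorter vector, so it can silently miss rows. A small sketch on invented data (the value column and the row order are assumptions):

```r
# Hypothetical data; every row is either "hesc" or "bj fibroblast"
expr <- data.frame(
  cell_type = c("bj fibroblast", "hesc", "hesc", "bj fibroblast"),
  value     = 1:4
)
wanted <- c("hesc", "bj fibroblast")

# == recycles `wanted` along the column: row 1 is compared with "hesc",
# row 2 with "bj fibroblast", and so on, so only some rows match
wrong <- expr[expr$cell_type == wanted, ]

# %in% tests each row against the whole set of wanted values
right <- expr[expr$cell_type %in% wanted, ]

nrow(wrong)  # 2
nrow(right)  # 4
```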

R: Filter a dataframe based on another dataframe

If you only want to keep the rows of e whose rownames occur in pf (or, to keep those that don't occur, negate the condition with !), then you can just filter on the rownames:

library(tidyverse)

e %>%
  filter(rownames(e) %in% rownames(pf))
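The same idea can also be written in base R. This is a sketch of an alternative, not the method from the answer above, and the two data frames, their row names and their columns are invented for illustration.

```r
# Hypothetical inputs with row names "g1".."g3" and "g1", "g3"
e  <- data.frame(s1 = c(6.2, 5.8, 8.9), s2 = c(6.8, 5.7, 9.0),
                 row.names = c("g1", "g2", "g3"))
pf <- data.frame(score = c(1, 2), row.names = c("g1", "g3"))

# Rows of e whose row names occur in pf
kept <- e[rownames(e) %in% rownames(pf), , drop = FALSE]

# Rows of e whose row names do not occur in pf
dropped <- e[!(rownames(e) %in% rownames(pf)), , drop = FALSE]
```

drop = FALSE keeps the result a data frame even if e has a single column.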

Another possibility is to create a rownames column (rn) in both data frames, do a semi_join on that column, and then convert rn back to rownames.

library(tidyverse)

list(e, pf) %>%
  map(~ .x %>%
        as.data.frame() %>%
        rownames_to_column('rn')) %>%
  reduce(semi_join, by = 'rn') %>%  # keep rows of e whose rn also appears in pf
  column_to_rownames('rn')

Output

        JHU_113_2.CEL JHU_144.CEL JHU_173.CEL JHU_176R.CEL JHU_182.CEL JHU_186.CEL JHU_187.CEL JHU_188.CEL JHU_203.CEL
2315374 6.28274 6.79161 6.11265 6.13997 6.68056 6.48156 6.45415 6.04542 5.99176
2315376 5.81678 5.71165 6.02794 5.37082 5.95527 5.75999 5.87863 5.54830 6.35571
2315587 8.88557 8.95699 8.36898 8.28993 8.41361 8.64980 8.74305 8.31915 8.43548
2315588 6.28650 6.66750 6.07503 6.76625 6.19819 6.84260 6.13916 6.40219 6.45059
2315591 6.97515 6.61705 6.51994 6.74982 6.60917 6.55182 6.62240 6.44394 5.76592
2315595 5.94179 5.39178 5.09497 4.96199 2.96431 4.95204 5.00979 4.06493 5.38048
2315598 4.99420 5.56888 5.57912 5.43960 5.19249 5.87991 5.60540 5.09513 5.43618
2315603 7.67845 7.90005 7.47594 6.75087 7.62805 8.00069 7.34296 6.81338 7.52014
2315604 6.20952 6.59687 6.14608 5.70518 6.49572 6.12622 6.23690 6.39569 6.70869
2315640 5.85307 6.07303 6.41875 6.07282 6.28283 6.13699 6.16377 6.48616 6.34162
