Opposite of %in%: Exclude Rows With Values Specified in a Vector

You can use the ! operator to negate a logical vector, turning every TRUE into FALSE and every FALSE into TRUE. So:

D2 = subset(D1, !(V1 %in% c('B','N','T')))
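As a minimal sketch of the subset() call above, assuming a hypothetical data frame D1 with a character column V1 (the sample values and the V2 column are invented for illustration):

```r
# Hypothetical sample data; only the names D1, D2 and V1 come from the answer above
D1 <- data.frame(
  V1 = c("A", "B", "C", "N", "T"),
  V2 = 1:5,
  stringsAsFactors = FALSE
)

# Keep rows whose V1 is NOT one of 'B', 'N', 'T'
D2 <- subset(D1, !(V1 %in% c("B", "N", "T")))

# The same filter written with plain bracket indexing
D2_alt <- D1[!(D1$V1 %in% c("B", "N", "T")), ]

print(D2)
```

Only the rows with V1 equal to "A" and "C" should remain in both D2 and D2_alt.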

EDIT:
You can also make an operator yourself:

`%!in%` <- function(x, y) !(x %in% y)

c(1, 3, 11) %!in% 1:10
[1] FALSE FALSE  TRUE

What is the opposite of %in% in R

You want:

matrix[!matrix %in% 1,]

For clarity's sake, I prefer this, even though the parentheses aren't necessary.

matrix[!(matrix %in% 1),]
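The parentheses are optional because, in R's operator precedence, %in% binds more tightly than the unary !. A small check of that claim, using an illustrative vector x (an assumption, not the matrix from the question):

```r
x <- c(1, 3, 11)

# %in% is evaluated before the unary !, so both forms give the same result
a <- !x %in% 1:10
b <- !(x %in% 1:10)

print(a)
print(identical(a, b))
```

Keeping the parentheses costs nothing and saves the reader from having to recall the precedence table.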

Also be aware of R FAQ 7.31, "Why doesn't R think these numbers are equal?", because %in% compares floating-point values exactly.
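To illustrate the floating-point caveat, here is a short sketch; the values 0.1, 0.2 and 0.3 are the usual textbook example rather than anything taken from the question:

```r
x <- 0.1 + 0.2

# Exact matching fails: x is not stored as exactly 0.3
exact <- x %in% 0.3

# A tolerance-based comparison; all.equal() allows for small numeric error
approx <- isTRUE(all.equal(x, 0.3))

print(c(exact = exact, approx = approx))
```

If the values being matched are computed doubles, comparing with a tolerance (or rounding both sides first) is usually safer than %in%.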

R - Exclude rows from dataframe that don't contain certain values

This way is nice, as it is scalable to any number of columns that begin with X.

dplyr::filter_at(df, vars(starts_with("X")), any_vars(. %in% codes))
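For readers without dplyr, the same "keep a row if any X column matches" idea can be sketched in base R. The names df and codes follow the answer above; the sample data and the id column are hypothetical:

```r
# Hypothetical data: two code columns starting with "X" plus an id column
df <- data.frame(
  id = 1:4,
  X1 = c("a", "b", "c", "d"),
  X2 = c("e", "a", "f", "g"),
  stringsAsFactors = FALSE
)
codes <- c("a", "g")

# Columns whose names begin with "X"
x_cols <- names(df)[startsWith(names(df), "X")]

# One logical vector per column, combined with | so a match in ANY column keeps the row
hits <- Reduce(`|`, lapply(df[x_cols], function(col) col %in% codes))

kept <- df[hits, ]
print(kept)
```

In dplyr 1.0.4 and later, filter(df, if_any(starts_with("X"), ~ .x %in% codes)) should express the same filter; the dplyr documentation marks the scoped filter_at() as superseded.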

Negation of %in% in R

No, there isn't a built-in function to do that, but you could easily code it yourself with

`%nin%` = Negate(`%in%`)

Or

`%!in%` = Negate(`%in%`)
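A brief usage sketch of the Negate() definition, filtering an illustrative character vector (the vector contents are an assumption):

```r
# Define the negated operator once
`%nin%` <- Negate(`%in%`)

vals <- c("a", "b", "c", "d", "e")

# Keep only the values that are absent from the exclusion set
kept_vals <- vals[vals %nin% c("b", "d")]
print(kept_vals)
```

Negate() returns a new function that applies ! to the result of the original, so this behaves like the hand-written version with an explicit !.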

See this thread and follow-up discussion: %in% operator - NOT IN.


It was also pointed out that the Hmisc package includes a %nin% operator, so if you already use Hmisc in your application, the operator is already available.

library(Hmisc)
"A" %nin% "B"
#[1] TRUE
"A" %nin% "A"
#[1] FALSE

How to subset columns based on threshold values specified for a subset of rows

A tidyverse solution is

library(tidyverse)

df1 %>%
  select(
    df1 %>%
      filter(row_number() > 1) %>%
      summarise(across(starts_with("c"), max)) %>%
      pivot_longer(everything()) %>%
      filter(value < 5) %>%
      pull(name)
  )

   c1
1 500
2   1
3   0
4   3
5   0

Explanation: the code inside the select calculates the maximum value of each column whose name starts with "c", ignoring the first row. The result is then pivoted into long format, creating the default columns name and value. This data frame is filtered to keep only the columns whose maximum is less than five, which means every value after the first row is less than five. The name column is then pulled and used as the argument to the outer select.

If you need other columns, just modify the select, for example,

df1 %>%
  select(
    c("Gene",
      df1 %>%
        filter(row_number() > 1) %>%
        summarise(across(starts_with("c"), max)) %>%
        pivot_longer(everything()) %>%
        filter(value < 5) %>%
        pull(name)
    )
  )
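The same threshold logic can also be sketched in base R. The data frame here is hypothetical: c1 reuses the values from the output shown earlier, while the Gene values and the c2 column are invented so that one column fails the threshold:

```r
# Hypothetical data: c1 matches the output shown earlier; Gene values and c2 are invented
df1 <- data.frame(
  Gene = c("g0", "g1", "g2", "g3", "g4"),
  c1 = c(500, 1, 0, 3, 0),
  c2 = c(2, 7, 1, 0, 9),
  stringsAsFactors = FALSE
)

# Columns whose names start with "c"
c_cols <- names(df1)[startsWith(names(df1), "c")]

# Maximum of each such column, ignoring the first row
col_max <- vapply(df1[-1, c_cols, drop = FALSE], max, numeric(1))

# Keep Gene plus the columns whose maximum is below the threshold of 5
keep <- c("Gene", names(col_max)[col_max < 5])
result <- df1[, keep, drop = FALSE]
print(result)
```

With this sample data c1 is kept (its maximum after the first row is 3) and c2 is dropped (its maximum is 9).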

How to use a condition on a whole data.frame with a vector as comparison?

apply(dat, 1, function(x) x %in% c(1,3))

     [,1] [,2]  [,3]
[1,] TRUE TRUE FALSE
[2,] TRUE TRUE  TRUE
[3,] TRUE TRUE FALSE
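One thing to keep in mind when reading that result: apply() over rows (MARGIN = 1) returns one column per input row, so the output is transposed relative to the data frame. A small sketch with a hypothetical dat (the values are invented, not the asker's data):

```r
# Hypothetical 3 x 3 data frame
dat <- data.frame(a = c(1, 2, 3), b = c(3, 1, 5), c = c(4, 3, 6))

# Row-wise apply: column i of the result holds the answer for row i of dat
by_row <- apply(dat, 1, function(x) x %in% c(1, 3))

# Column-wise alternative that keeps the original orientation
by_col <- sapply(dat, function(col) col %in% c(1, 3))

print(t(by_row))
print(by_col)
```

Transposing the apply() result with t(), or working column by column with sapply(), gives a logical matrix laid out like the original data.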

How to count rows with NA values across a selection of columns and include 0 count?

You can do:

library(tidyverse)
df %>%
  mutate(missing = apply(across(num_range('Var', 2:4)), 1, function(x) any(is.na(x)))) %>%
  group_by(ID) %>%
  summarize(n = sum(missing))

# A tibble: 3 x 2
  ID        n
  <chr> <int>
1 AL01      2
2 AL02      1
3 AL03      0
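A base R sketch of the same count, assuming a hypothetical df with an ID column and columns Var1 to Var4 (the values are invented so that the counts match the output above):

```r
# Hypothetical data; only the column names and the IDs follow the answer above
df <- data.frame(
  ID = c("AL01", "AL01", "AL02", "AL02", "AL03"),
  Var1 = c(1, NA, 3, 4, 5),
  Var2 = c(NA, 2, 3, NA, 5),
  Var3 = c(1, NA, 3, 4, 5),
  Var4 = c(1, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)

# TRUE for rows with at least one NA in Var2..Var4 (Var1 is deliberately ignored)
df$missing <- rowSums(is.na(df[paste0("Var", 2:4)])) > 0

# Sum per ID; an ID with no flagged rows still appears, with a count of 0
counts <- aggregate(missing ~ ID, data = df, FUN = sum)
print(counts)
```

Because the flag is computed for every row before grouping, IDs without any missing values are summed to 0 instead of being dropped.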

