How to Delete Groups Containing Less Than 3 Rows of Data in R

How to delete groups containing less than 3 rows of data in R?

One way to do it is to use the n() function inside filter():

library(dplyr)

my_data <- data.frame(Year=1996, Site="A", Brood=c(1,1,2,2,2))

my_data %>%
  group_by(Year, Site, Brood) %>%
  filter(n() >= 3)

The n() function gives the number of rows in the current group (or the number of rows total if there is no grouping).
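As a quick check, the same pipeline can be run on the toy data above and the result inspected. This is only a sketch: the name kept is introduced here for illustration, and the trailing ungroup() is an optional addition so that later steps do not inherit the grouping.

```r
library(dplyr)

# Same toy data as above: Brood 1 has 2 rows, Brood 2 has 3 rows
my_data <- data.frame(Year = 1996, Site = "A", Brood = c(1, 1, 2, 2, 2))

kept <- my_data %>%
  group_by(Year, Site, Brood) %>%
  filter(n() >= 3) %>%  # keep only groups with at least 3 rows
  ungroup()             # drop the grouping once filtering is done

nrow(kept)          # number of rows that survive the filter
unique(kept$Brood)  # broods that survive the filter
```

Here only the rows of Brood 2 remain, because Brood 1 has just two rows.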

Remove groups with less than three unique observations

With data.table you could do:

library(data.table)
DT[, if(uniqueN(Day) >= 3) .SD, by = Group]

which gives:

   Group Day
1:     1   1
2:     1   3
3:     1   5
4:     1   5
5:     3   1
6:     3   2
7:     3   3

Or with dplyr:

library(dplyr)
DT %>%
  group_by(Group) %>%
  filter(n_distinct(Day) >= 3)

which keeps the same rows (dplyr returns them as a grouped tibble).
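The input DT is not shown in the answer. The sketch below uses a hypothetical DT, chosen so that Group 2 has only two distinct Day values, to compare both approaches side by side. The names res_dt and res_dplyr are illustrative.

```r
library(data.table)
library(dplyr)

# Hypothetical input: Group 2 has only two distinct Day values (1 and 3)
DT <- data.table(
  Group = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  Day   = c(1, 3, 5, 5, 1, 1, 3, 1, 2, 3)
)

# data.table: return a group's rows (.SD) only if it has >= 3 distinct days
res_dt <- DT[, if (uniqueN(Day) >= 3) .SD, by = Group]

# dplyr: same condition, run on a plain data.frame copy of the data
res_dplyr <- as.data.frame(DT) %>%
  group_by(Group) %>%
  filter(n_distinct(Day) >= 3) %>%
  ungroup()

# Both approaches keep the same rows in the same order
identical(res_dt$Group, res_dplyr$Group)
identical(res_dt$Day, res_dplyr$Day)
```

Note that Group 1 is kept even though Day 5 appears twice: the condition counts distinct values, not rows.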

How to remove rows from a data frame based on a group condition, without losing some columns

We don't need the summarise step. Instead, use the logical expression directly in filter:

library(dplyr)
df %>%
  group_by(species, year) %>%
  filter(n() > 1)

If we also need the group size as a column (the 'n_inds'), then use either add_count, which names the new column n by default:

df %>%
  add_count(species, year) %>%
  filter(n > 1)

Or create the column with mutate:

df %>%
  group_by(species, year) %>%
  mutate(n_inds = n()) %>%
  ungroup() %>%
  filter(n_inds > 1)

When we use summarise, it returns only the grouping columns and the summarised columns, which is why the other columns are lost.
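The difference is easy to see on a small example. The data frame below is hypothetical (the weight column stands in for any extra column we want to keep), and the .groups argument assumes dplyr 1.0 or later.

```r
library(dplyr)

# Hypothetical data: 'weight' is an extra column we do not want to lose
df <- data.frame(
  species = c("a", "a", "b", "c", "c"),
  year    = c(2001, 2001, 2001, 2002, 2002),
  weight  = c(10, 12, 9, 7, 8)
)

# summarise() collapses each group to one row and drops 'weight'
summ <- df %>%
  group_by(species, year) %>%
  summarise(n_inds = n(), .groups = "drop")
names(summ)

# filter() keeps the original rows and every original column
kept <- df %>%
  group_by(species, year) %>%
  filter(n() > 1) %>%
  ungroup()
names(kept)
```

The single-row group (species "b") is removed by filter, while the weight column is still present in kept.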

Delete rows where the number of rows per group is greater than a value in the grouped data frame in R

You can do this with slice, which keeps the first max(Length) rows of each group:

library(dplyr)

df %>% group_by(Tracks) %>% slice(seq_len(max(Length))) %>% ungroup()

#  Tracks Length    ID
#  <chr>   <dbl> <dbl>
#1 a           2     1
#2 a           2     2
#3 b           1     1
#4 c           2     1
#5 c           2     2

Or with filter, which relies on ID numbering the rows within each Tracks group:

df %>% group_by(Tracks) %>% filter(ID <= max(Length)) %>% ungroup()
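The original df is not shown, so the sketch below assumes a hypothetical one in which Length is constant within each Tracks group and ID numbers the rows within it. It also adds a row_number() variant, which is not from the answer above, for the case where no such ID column exists.

```r
library(dplyr)

# Hypothetical input: Length is constant per Track, ID numbers rows per Track
df <- data.frame(
  Tracks = c("a", "a", "a", "b", "b", "c", "c"),
  Length = c(2, 2, 2, 1, 1, 2, 2),
  ID     = c(1, 2, 3, 1, 2, 1, 2)
)

# Keep the first max(Length) rows of each group by position
by_slice <- df %>%
  group_by(Tracks) %>%
  slice(seq_len(max(Length))) %>%
  ungroup()

# Keep rows whose ID does not exceed max(Length)
by_filter <- df %>%
  group_by(Tracks) %>%
  filter(ID <= max(Length)) %>%
  ungroup()

# Variant that does not need an ID column: use the row position in the group
by_rownum <- df %>%
  group_by(Tracks) %>%
  filter(row_number() <= max(Length)) %>%
  ungroup()
```

On this input all three approaches keep the same five rows; they would differ only if ID did not match the row position within its group.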

Remove group from data.frame if at least one group member meets condition

Try

library(dplyr)
df2 %>%
group_by(group) %>%
filter(!any(world == "AF"))

Or, as mentioned by @akrun, with data.table:

library(data.table)
setDT(df2)[, if(!any(world == "AF")) .SD, group]

Or

setDT(df2)[, if(all(world != "AF")) .SD, group]

Which gives:

#Source: local data frame [7 x 3]
#Groups: group
#
#  world place group
#1    AB     1     1
#2    AC     1     1
#3    AD     2     1
#4    AB     1     3
#5    AE     2     3
#6    AC     3     3
#7    AE     1     3
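To try this out, here is a sketch with a hypothetical df2 in which only group 2 contains "AF". The second pipeline adds na.rm = TRUE, which is an addition to the answer above: without it, any() returns NA for a group that has a missing world value and no "AF", and filter() then drops that group.

```r
library(dplyr)

# Hypothetical input: only group 2 contains world == "AF"
df2 <- data.frame(
  world = c("AB", "AC", "AD", "AB", "AF", "AE", "AB", "AE", "AC", "AE"),
  place = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 1),
  group = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
)

# Drop every group in which at least one row has world == "AF"
kept <- df2 %>%
  group_by(group) %>%
  filter(!any(world == "AF")) %>%
  ungroup()

# Same filter, but missing values in 'world' do not cause a group to be dropped
kept_na_safe <- df2 %>%
  group_by(group) %>%
  filter(!any(world == "AF", na.rm = TRUE)) %>%
  ungroup()
```

With no missing values in world, both versions return the seven rows of groups 1 and 3.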

Remove Groups If Any Of Their Rows Meet Criteria

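Use subset, keeping only the groups for which !any(x == 7, na.rm = TRUE) is TRUE, that is, the groups with no score equal to 7. ave() repeats the per-group result for every row of the group, so it can be used directly as the row condition. This one-liner uses only base R.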

subset(data, !ave(score, id, FUN = function(x) any(x == 7, na.rm = TRUE)))

giving:

  id score
4  2     9
5  2     8
6  2     4
7  3    NA
8  3    11
9  3     3
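The one-liner can be unpacked into two steps to see what ave() returns. The data frame below is hypothetical, chosen so that only id 1 contains a score of 7; has7 and kept are illustrative names.

```r
# Hypothetical input: only id 1 contains a score of 7; id 3 has a missing score
data <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  score = c(5, 7, 6, 9, 8, 4, NA, 11, 3)
)

# ave() applies FUN within each id and returns one value per row.
# Because 'score' is numeric, the logical result is stored as 1 or 0.
has7 <- ave(data$score, data$id,
            FUN = function(x) any(x == 7, na.rm = TRUE))

# Negating the 0/1 flags gives TRUE for rows in groups without a 7
kept <- data[!has7, ]
kept
```

The na.rm = TRUE matters for id 3: without it, any() would return NA for that group and the row condition could no longer be evaluated cleanly.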

R: Delete all values with less than 3 samples, count method without including NAs

You can count the number of non-NAs by ("by"!) NAME using by():

foo <- with(df, by(VALUE, NAME, function(xx) sum(!is.na(xx))))
foo

These NAMEs have at least three non-NAs:

names(which(foo >= 3))

So you want:

df[df$NAME %in% names(which(foo >= 3)), ]
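Put together on a small example, the steps look like this. The data frame is hypothetical, and the ave() one-liner at the end is an alternative that is not part of the answer above.

```r
# Hypothetical input: "y" has only one non-NA value
df <- data.frame(
  NAME  = c("x", "x", "x", "x", "y", "y", "y", "z", "z", "z"),
  VALUE = c(1, NA, 2, 3, NA, NA, 4, 5, 6, 7)
)

# Count non-NA values per NAME
foo <- with(df, by(VALUE, NAME, function(xx) sum(!is.na(xx))))

# NAMEs with at least three non-NA values
keep <- names(which(foo >= 3))

# Keep every row of those NAMEs (including their NA rows)
res <- df[df$NAME %in% keep, ]

# Alternative in one step: per-row count of non-NAs within each NAME
res2 <- df[ave(!is.na(df$VALUE), df$NAME, FUN = sum) >= 3, ]
```

Both versions keep all rows of the qualifying NAMEs, so the NA row of "x" stays in the result; only the decision about which groups to keep ignores the NAs.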

