How to delete groups containing fewer than 3 rows of data in R?
One way to do it is to use the magic n() function within filter():
library(dplyr)
my_data <- data.frame(Year=1996, Site="A", Brood=c(1,1,2,2,2))
my_data %>%
  group_by(Year, Site, Brood) %>%
  filter(n() >= 3)
The n() function gives the number of rows in the current group (or the total number of rows if there is no grouping).
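To see what n() evaluates to in each group, the count can be attached as a column before filtering. This is a minimal sketch using the same my_data as above; the column name group_size and the object names sizes and kept are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same toy data as above: Brood 1 has 2 rows, Brood 2 has 3 rows
my_data <- data.frame(Year = 1996, Site = "A", Brood = c(1, 1, 2, 2, 2))

sizes <- my_data %>%
  group_by(Year, Site, Brood) %>%
  mutate(group_size = n()) %>%  # n() is evaluated once per group
  ungroup()

# Keeping rows with group_size >= 3 leaves only the Brood 2 rows
kept <- sizes %>% filter(group_size >= 3)
```

Filtering on the stored column and filtering on n() directly inside a grouped filter() should select the same rows; the column is only useful if the count is needed later.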
Remove groups with less than three unique observations
With data.table you could do:
library(data.table)
DT[, if(uniqueN(Day) >= 3) .SD, by = Group]
which gives:
   Group Day
1:     1   1
2:     1   3
3:     1   5
4:     1   5
5:     3   1
6:     3   2
7:     3   3
Or with dplyr:
library(dplyr)
DT %>%
  group_by(Group) %>%
  filter(n_distinct(Day) >= 3)
which gives the same result.
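The input data is not shown in the answer, so here is a sketch with a hypothetical DT, chosen so that the result matches the output above. Group 2 is an assumption: it has three rows but only two distinct Day values, which shows how n_distinct() differs from n().

```r
library(dplyr)

# Hypothetical input (not from the original question):
# Group 2 has three rows but only two distinct Day values
DT <- data.frame(
  Group = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  Day   = c(1, 3, 5, 5, 1, 1, 2, 1, 2, 3)
)

# Counting distinct days drops Group 2
by_distinct <- DT %>%
  group_by(Group) %>%
  filter(n_distinct(Day) >= 3) %>%
  ungroup()

# Counting rows instead keeps Group 2, because it has three rows
by_rows <- DT %>%
  group_by(Group) %>%
  filter(n() >= 3) %>%
  ungroup()
```

With this data, by_distinct keeps the seven rows of Groups 1 and 3, while by_rows keeps all ten rows.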
How to remove rows from a data frame based on a group condition, without losing some columns
We don't need the summarise step. Instead, use the logical expression directly in filter:
library(dplyr)
df %>%
  group_by(species, year) %>%
  filter(n() > 1)
If we need to create the 'n_inds' column, then either use add_count (the count column is named n by default):
df %>%
  add_count(species, year) %>%
  filter(n > 1)
Or create the column with mutate (named ninds here):
df %>%
  group_by(species, year) %>%
  mutate(ninds = n()) %>%
  ungroup() %>%
  filter(ninds > 1)
When we use summarise, it only returns the grouping columns and the summarised column.
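The difference in returned columns can be checked on a small example. The data frame below is hypothetical; the weight column stands in for any extra column that should survive the filtering.

```r
library(dplyr)

# Hypothetical data; weight is the extra column we do not want to lose
df <- data.frame(
  species = c("a", "a", "b", "c", "c"),
  year    = c(2001, 2001, 2001, 2002, 2002),
  weight  = c(1.2, 1.5, 0.9, 2.1, 2.4)
)

# summarise() collapses each group to one row and drops weight
summarised <- df %>%
  group_by(species, year) %>%
  summarise(ninds = n(), .groups = "drop")

# filter() keeps the original rows and all original columns
filtered <- df %>%
  group_by(species, year) %>%
  filter(n() > 1) %>%
  ungroup()
```

Here names(summarised) contains only species, year and ninds, while filtered still has the weight column and the four rows from the groups with more than one row.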
Delete rows where the number of rows per group is greater than a value in a grouped data frame in R
You can do this with slice:
library(dplyr)
df %>% group_by(Tracks) %>% slice(seq_len(max(Length))) %>% ungroup
#   Tracks Length    ID
#   <chr>   <dbl> <dbl>
# 1 a           2     1
# 2 a           2     2
# 3 b           1     1
# 4 c           2     1
# 5 c           2     2
Or with filter:
df %>% group_by(Tracks) %>% filter(ID <= max(Length)) %>% ungroup
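The input df is not shown, so the sketch below uses a hypothetical one, chosen so that the result matches the output above. It assumes Length holds the allowed number of rows for each track and ID numbers the rows within a track starting at 1.

```r
library(dplyr)

# Hypothetical input: track "a" has 3 rows but Length 2,
# track "b" has 2 rows but Length 1, track "c" is already within its limit
df <- data.frame(
  Tracks = c("a", "a", "a", "b", "b", "c", "c"),
  Length = c(2, 2, 2, 1, 1, 2, 2),
  ID     = c(1, 2, 3, 1, 2, 1, 2)
)

# slice() keeps the first max(Length) rows of each group by position
by_slice <- df %>%
  group_by(Tracks) %>%
  slice(seq_len(max(Length))) %>%
  ungroup()

# filter() keeps rows by value, so it relies on ID running 1, 2, 3, ...
by_filter <- df %>%
  group_by(Tracks) %>%
  filter(ID <= max(Length)) %>%
  ungroup()
```

The two approaches agree only when ID is a within-group row counter; slice() depends on row order instead, so it may be the safer choice if ID has gaps.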
Remove group from data.frame if at least one group member meets condition
Try:
library(dplyr)
df2 %>%
  group_by(group) %>%
  filter(!any(world == "AF"))
Or, as mentioned by @akrun, with data.table:
library(data.table)
setDT(df2)[, if (!any(world == "AF")) .SD, by = group]
Or:
setDT(df2)[, if (all(world != "AF")) .SD, by = group]
Which gives:
# Source: local data frame [7 x 3]
# Groups: group
#
#   world place group
# 1    AB     1     1
# 2    AC     1     1
# 3    AD     2     1
# 4    AB     1     3
# 5    AE     2     3
# 6    AC     3     3
# 7    AE     1     3
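For a runnable version, here is a sketch with a hypothetical df2. The rows of groups 1 and 3 are taken from the output above; the rows of group 2 are an assumption, since the original input is not shown.

```r
library(dplyr)

# Groups 1 and 3 follow the output above; group 2 is made up
# and contains one "AF" row
df2 <- data.frame(
  world = c("AB", "AC", "AD", "AB", "AE", "AF", "AB", "AE", "AC", "AE"),
  place = c(1, 1, 2, 1, 2, 3, 1, 2, 3, 1),
  group = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3)
)

# any() returns one TRUE/FALSE per group, so the whole group is
# kept or dropped together
kept <- df2 %>%
  group_by(group) %>%
  filter(!any(world == "AF")) %>%
  ungroup()
```

If world can contain NA, any(world == "AF") may evaluate to NA for a group without "AF", and filter() then drops that group; any(world == "AF", na.rm = TRUE) avoids this.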
Remove Groups If Any Of Their Rows Meet Criteria
Use subset, keeping any group for which !any(x == 7, na.rm = TRUE) is TRUE. This one-liner uses only base R:
subset(data, !ave(score, id, FUN = function(x) any(x == 7, na.rm = TRUE)))
giving:
  id score
4  2     9
5  2     8
6  2     4
7  3    NA
8  3    11
9  3     3
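The one-liner can be unpacked into two steps to show what ave() contributes. In the sketch below, the rows for ids 2 and 3 follow the output above; the rows for id 1 are hypothetical, since the input is not shown.

```r
# Rows 4 to 9 follow the output above; rows 1 to 3 (id 1) are made up
# and include a score of 7
data <- data.frame(
  id    = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  score = c(5, 7, 6, 9, 8, 4, NA, 11, 3)
)

# ave() applies FUN to each id's scores and repeats the result for every
# row of that id; the logical result is stored as 1/0 because score is numeric
has_seven <- ave(data$score, data$id,
                 FUN = function(x) any(x == 7, na.rm = TRUE))

# !has_seven turns 0 into TRUE and 1 into FALSE, giving a row-wise mask
res <- data[!has_seven, ]
```

na.rm = TRUE matters for id 3: without it, any(c(NA, FALSE, FALSE)) is NA rather than FALSE, and the group would not be kept cleanly.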
R: Delete all values with less than 3 samples, count method without including NAs
You can count the number of non-NAs by ("by"!) NAME using by():
foo <- with(df, by(VALUE, NAME, function(xx) sum(!is.na(xx))))
foo
These NAMEs have at least three non-NAs:
names(which(foo >= 3))
So you want:
df[df$NAME %in% names(which(foo >= 3)), ]
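Put together on a small example, the steps look as follows. The data frame is hypothetical; only the column names NAME and VALUE come from the answer.

```r
# Hypothetical data: "x" has 3 non-NA values out of 4 rows,
# "y" has 1 out of 3, "z" has 3 out of 3
df <- data.frame(
  NAME  = c("x", "x", "x", "x", "y", "y", "y", "z", "z", "z"),
  VALUE = c(1, 2, NA, 4, NA, NA, 3, 5, 6, 7)
)

# Count the non-NA values per NAME
foo <- with(df, by(VALUE, NAME, function(xx) sum(!is.na(xx))))

# NAMEs with at least three non-NA values
keep <- names(which(foo >= 3))

# Keep every row (including NA rows) of the qualifying NAMEs
res <- df[df$NAME %in% keep, ]
```

Note that the NA row of "x" is still present in res: the count ignores NAs when deciding which groups to keep, but the filtering itself does not remove them.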