Removing groups with all NA in data.table or dplyr in R
Is this what you want?
library(dplyr)
dataHAVE %>%
group_by(student) %>%
filter(!all(is.na(score)))
student time score
<dbl> <dbl> <dbl>
1 1 1 7
2 1 2 9
3 1 3 5
4 3 1 NA
5 3 2 3
6 3 3 9
7 5 NA 7
8 5 2 NA
9 5 3 5
Each student is only kept if not (!) all score values are NA.
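Since the question also mentions data.table, here is a minimal sketch of the same filter in that package. The toy data and the names dt and kept are made up for illustration; they are not the asker's dataHAVE.

```r
library(data.table)

# Hypothetical toy data: student 2 has only NA scores
dt <- data.table(
  student = c(1, 1, 2, 2, 3, 3),
  score   = c(7, NA, NA, NA, NA, 3)
)

# Return .SD only for groups with at least one non-NA score.
# For an all-NA group the `if` yields NULL, so that group is dropped.
kept <- dt[, if (!all(is.na(score))) .SD, by = student]
kept
```

Rows with an NA score inside a kept group are retained; only groups that are entirely NA disappear, which mirrors the dplyr filter above.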
Remove NAs in each column by group
Here is one possible solution using the data.table package:
library(data.table)
setDT(na_data)[, lapply(.SD, function(x) if(length(y<-na.omit(x))) y else first(x)), by=Year]
# Year Peter Paul John
# 1: 2011 1 1 NA
# 2: 2011 2 2 NA
# 3: 2011 3 3 NA
# 4: 2012 1 3 NA
# 5: 2012 2 2 NA
# 6: 2012 3 1 NA
# 7: 2013 1 1 4
# 8: 2013 2 2 5
# 9: 2013 3 3 6
dplyr equivalent:
library(dplyr)
na_data |>
group_by(Year) |>
summarise(across(everything(), .fns = ~ if(length(y<-na.omit(.x))) y else first(.x)))
# # A tibble: 9 x 4
# # Groups: Year [3]
# Year Peter Paul John
# <dbl> <dbl> <dbl> <int>
# 1 2011 1 1 NA
# 2 2011 2 2 NA
# 3 2011 3 3 NA
# 4 2012 1 3 NA
# 5 2012 2 2 NA
# 6 2012 3 1 NA
# 7 2013 1 1 4
# 8 2013 2 2 5
# 9 2013 3 3 6
Remove groups which do not have non-consecutive NA values in R
How about using the difference between the indices of the NA values per group?
library(dplyr)
df %>% group_by(group) %>% filter(any(diff(which(is.na(D))) > 1))
## A tibble: 8 x 2
## Groups: group [2]
# group D
# <dbl> <dbl>
#1 2. NA
#2 2. 2.
#3 2. NA
#4 2. NA
#5 4. NA
#6 4. 2.
#7 4. 3.
#8 4. NA
I'm not sure this would catch all potential edge cases, but it seems to work for the given example.
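To probe the edge cases, the predicate can be tried on plain vectors in base R. This is only a sketch; has_nonconsecutive_na is a hypothetical helper name wrapping the expression used inside filter() above.

```r
# Hypothetical helper wrapping the predicate from the filter() call
has_nonconsecutive_na <- function(x) any(diff(which(is.na(x))) > 1)

has_nonconsecutive_na(c(NA, 2, NA, NA))  # TRUE:  NA positions 1, 3, 4 have a gap
has_nonconsecutive_na(c(1, NA, NA, 2))   # FALSE: the NAs are adjacent
has_nonconsecutive_na(c(1, NA, 2))       # FALSE: a single NA gives no differences
has_nonconsecutive_na(c(1, 2, 3))        # FALSE: no NA at all
```

So groups with no NA, or with exactly one NA, are removed along with groups whose NAs form a single consecutive run. Whether that is the desired behaviour depends on how the question defines those cases.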
Remove group if any member contains NA in R
We can use filter after grouping by 'category'.
library(dplyr)
tbl %>%
group_by(category) %>%
filter(!any(is.na(values))) %>%
ungroup
-output
# A tibble: 2 x 2
category values
<chr> <dbl>
1 A 2
2 A 3
Exclude groups with NAs in tidy dataset
Using all() will evaluate the entire group, so you can skip the mutate step.
MWA %>%
group_by(Dir) %>%
filter(all(!is.na(time_seg)))
# A tibble: 8 x 5
# Groups: Dir [1]
VP Con Dir Seg time_seg
<int> <int> <int> <int> <int>
1 10 2 2 1 320
2 10 2 2 2 1110
3 10 2 2 3 450
4 10 2 2 4 600
5 10 2 2 5 1680
6 10 2 2 6 730
7 10 2 2 7 850
8 10 2 2 8 840
Remove rows with NA in a group, given the group contains at least one non-NA value
We could use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df). Grouped by 'class', an if/else condition checks whether 'value' has any non-NA elements: if it does, we subset .SD to the non-NA rows; otherwise we return .SD unchanged.
library(data.table)
setDT(df)[, if(any(!is.na(value))) .SD[!is.na(value)] else .SD , by = class]
# class value
#1: orange NA
#2: apple 1
#3: grape 1
#4: berry NA
Or we can switch from any to all by slightly modifying the condition:
setDT(df)[, if(all(is.na(value))) .SD else .SD[!is.na(value)], by = class]
# class value
#1: orange NA
#2: apple 1
#3: grape 1
#4: berry NA
Or we get the row index (.I) and then subset the dataset.
indx <- setDT(df)[, if(any(!is.na(value))) .I[!is.na(value)] else .I, class]$V1
df[indx]
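For comparison, the same rule can be sketched in dplyr: keep a row if its value is not NA, or if every value in its group is NA. The toy data below is hypothetical and only resembles the example; res is an illustrative name.

```r
library(dplyr)

# Hypothetical toy data: 'orange' and 'berry' contain only NA values
df <- data.frame(
  class = c("orange", "apple", "apple", "grape", "grape", "berry"),
  value = c(NA, 1, NA, NA, 1, NA)
)

res <- df %>%
  group_by(class) %>%
  # all(is.na(value)) is a single TRUE/FALSE per group, recycled across its rows
  filter(!is.na(value) | all(is.na(value))) %>%
  ungroup()
res
```

Unlike the grouped data.table call, filter() keeps the rows in their original order rather than collecting them by group.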