Remove rows with all or some NAs (missing values) in data.frame
Also check complete.cases:
> final[complete.cases(final), ]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
6 ENSG00000221312    0    1    2    3    2
na.omit is nicer for just removing all NAs. complete.cases allows partial selection by including only certain columns of the data frame:
> final[complete.cases(final[ , 5:6]),]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
4 ENSG00000207604    0   NA   NA    1    2
6 ENSG00000221312    0    1    2    3    2
Your solution can't work. If you insist on using is.na, then you have to do something like:
> final[rowSums(is.na(final[ , 5:6])) == 0, ]
             gene hsap mmul mmus rnor cfam
2 ENSG00000199674    0    2    2    2    2
4 ENSG00000207604    0   NA   NA    1    2
6 ENSG00000221312    0    1    2    3    2
but using complete.cases is considerably clearer, and faster.
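To compare the three approaches side by side, here is a minimal sketch on a toy data frame. The data frame `toy` and its values are made up for illustration; it is not the `final` data from the question.

```r
# Toy data frame with made-up values (not the `final` data from the question)
toy <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  a    = c(0, 1, 0, 1),
  b    = c(NA, 2, NA, 3),
  c    = c(2, 2, 1, NA),
  d    = c(2, 2, 2, 1)
)

# Keep rows with no NA in any column
all_complete <- toy[complete.cases(toy), ]

# Keep rows with no NA in columns c and d only
partial <- toy[complete.cases(toy[, c("c", "d")]), ]

# The same partial selection written with is.na()
partial_isna <- toy[rowSums(is.na(toy[, c("c", "d")])) == 0, ]

# na.omit() drops every row that contains an NA; the result also
# carries an "na.action" attribute recording which rows were removed
omitted <- na.omit(toy)
```

In this sketch only row 2 is fully complete, while rows 1 to 3 survive the partial selection because their NAs sit in column b, which is not inspected.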
How to remove row if it has a NA value in one certain column
The easiest solution is to use is.na():
df[!is.na(df$B), ]
which gives you:
   A B  C
1 NA 2 NA
2  1 2  3
4  1 2  3
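For a self-contained version, here is a small sketch that rebuilds a data frame consistent with the output above. The values in row 3 (other than the NA in B) are assumptions, since the original input is not shown.

```r
# Toy input; row 3's A and C values are assumed, only its NA in B matters
df <- data.frame(
  A = c(NA, 1, 1, 1),
  B = c(2, 2, NA, 2),
  C = c(NA, 3, 3, 3)
)

# Drop rows where B is NA; NAs in other columns are left alone
kept <- df[!is.na(df$B), ]

# Equivalent base R alternative using subset()
kept_subset <- subset(df, !is.na(B))
```

Row 1 is kept even though A and C are NA there, because only column B is tested.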
How can I delete all rows of a DataFrame that have an NA in a specific column?
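I don't know whether what follows is the most elegant way of deleting all rows having an NA in a specific column, but it is one way. Note that it uses the API of an early DataFrames.jl release (NA, isna, deleterows!); current releases use missing, ismissing, and dropmissing(df, :B) instead.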
Generating a toy DataFrame
julia> df = DataFrame(A = 1:10, B = 2:2:20)
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | 2 |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | 8 |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | 16 |
| 9 | 9 | 18 |
| 10 | 10 | 20 |
julia> df[[1,4,8],symbol("B")] = NA
NA
julia> df
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | NA |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | NA |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | NA |
| 9 | 9 | 18 |
| 10 | 10 | 20 |
Filtering out rows whose "B"-column element is NA
julia> df[~isna(df[:,symbol("B")]),:]
7x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 2 | 4 |
| 2 | 3 | 6 |
| 3 | 5 | 10 |
| 4 | 6 | 12 |
| 5 | 7 | 14 |
| 6 | 9 | 18 |
| 7 | 10 | 20 |
julia> df
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | NA |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | NA |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | NA |
| 9 | 9 | 18 |
| 10 | 10 | 20 |
Deleting rows whose "B"-column element is NA
julia> deleterows!(df,find(isna(df[:,symbol("B")])))
7x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 2 | 4 |
| 2 | 3 | 6 |
| 3 | 5 | 10 |
| 4 | 6 | 12 |
| 5 | 7 | 14 |
| 6 | 9 | 18 |
| 7 | 10 | 20 |
julia> df
7x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 2 | 4 |
| 2 | 3 | 6 |
| 3 | 5 | 10 |
| 4 | 6 | 12 |
| 5 | 7 | 14 |
| 6 | 9 | 18 |
| 7 | 10 | 20 |
Python - Drop row if two columns are NaN
Either of the following works; both drop a row only when columns 1 and 2 are both NaN:
df.dropna(subset=[1, 2], how='all')
or
df.dropna(subset=[1, 2], thresh=1)
Remove rows whose number is NA in R
You could subset with the help of is.na():
f <- f[!is.na(f$Current_status) & !is.na(f$Start_date), ]
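As a quick check, here is a sketch on made-up data showing that the is.na() condition above selects the same rows as complete.cases() restricted to those two columns. The `id` column and all values are hypothetical, and the dates are kept as strings for brevity.

```r
# Hypothetical data; only the column names come from the answer above
f <- data.frame(
  id             = 1:4,
  Current_status = c("open", NA, "closed", "open"),
  Start_date     = c("2020-01-01", "2020-02-01", NA, "2020-04-01")
)

# Keep rows where neither column is NA
keep_isna <- f[!is.na(f$Current_status) & !is.na(f$Start_date), ]

# Same selection via complete.cases() on just those two columns
keep_cc <- f[complete.cases(f[, c("Current_status", "Start_date")]), ]
```

Both versions keep rows 1 and 4 here; the complete.cases() form scales more comfortably when more than two columns have to be checked.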