How to Delete Rows from a Dataframe That Contain N*Na

Remove rows with all or some NAs (missing values) in data.frame

Also check complete.cases :

> final[complete.cases(final), ]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
6 ENSG00000221312 0 1 2 3 2

na.omit is nicer for just removing all NA's. complete.cases allows partial selection by including only certain columns of the dataframe:

> final[complete.cases(final[ , 5:6]),]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
4 ENSG00000207604 0 NA NA 1 2
6 ENSG00000221312 0 1 2 3 2

Your solution can't work. If you insist on using is.na, then you have to do something like:

> final[rowSums(is.na(final[ , 5:6])) == 0, ]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
4 ENSG00000207604 0 NA NA 1 2
6 ENSG00000221312 0 1 2 3 2

but using complete.cases is quite a lot more clear, and faster.

How to remove row if it has a NA value in one certain column

The easiest solution is to use is.na():

df[!is.na(df$B), ]

which gives you:

   A B  C
1 NA 2 NA
2 1 2 3
4 1 2 3

How can I delete all rows of a DataFrame that have an NA in a specific column?

I don't know whether what follows is the most elegant way of deleting all rows having an NA in a specific column, but that is one way.

Generating a toy DataFrame

julia> df = DataFrame(A = 1:10, B = 2:2:20)
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | 2 |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | 8 |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | 16 |
| 9 | 9 | 18 |
| 10 | 10 | 20 |

julia> df[[1,4,8],symbol("B")] = NA
NA

julia> df
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | NA |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | NA |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | NA |
| 9 | 9 | 18 |
| 10 | 10 | 20 |

Filtering out rows whose "B"-column element is NA

julia> df[~isna(df[:,symbol("B")]),:]
7x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 2 | 4 |
| 2 | 3 | 6 |
| 3 | 5 | 10 |
| 4 | 6 | 12 |
| 5 | 7 | 14 |
| 6 | 9 | 18 |
| 7 | 10 | 20 |

julia> df
10x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 1 | NA |
| 2 | 2 | 4 |
| 3 | 3 | 6 |
| 4 | 4 | NA |
| 5 | 5 | 10 |
| 6 | 6 | 12 |
| 7 | 7 | 14 |
| 8 | 8 | NA |
| 9 | 9 | 18 |
| 10 | 10 | 20 |

Deleting rows whose "B"-column element is NA

julia> deleterows!(df,find(isna(df[:,symbol("B")])))
7x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 2 | 4 |
| 2 | 3 | 6 |
| 3 | 5 | 10 |
| 4 | 6 | 12 |
| 5 | 7 | 14 |
| 6 | 9 | 18 |
| 7 | 10 | 20 |

julia> df
7x2 DataFrame
| Row | A | B |
|-----|----|----|
| 1 | 2 | 4 |
| 2 | 3 | 6 |
| 3 | 5 | 10 |
| 4 | 6 | 12 |
| 5 | 7 | 14 |
| 6 | 9 | 18 |
| 7 | 10 | 20 |

Python - Drop row if two columns are NaN

Any one of the following two:

df.dropna(subset=[1, 2], how='all')

or

df.dropna(subset=[1, 2], thresh=1)

Remove rows whose number is NA in R

You could subset with the help of is.na():

f <- f[!is.na(f$Current_status) & !is.na(f$Start_date), ]


Related Topics



Leave a reply



Submit