Subsetting Rows in R producing NAs, but there are no NAs in Data Frame
We can select the rows with a logical expression, select the columns by name (as strings), take the unique rows, and then extract the distance column:
unique(x[x$component ==1, c("ObjectID", "distance")])$distance
#[1] 2 4
If the intention is only to get the 'distance' based on the 'unique' values of 'ObjectID', we can use duplicated
with(subset(x, component == 1, select = c(ObjectID, distance)),
distance[!duplicated(ObjectID)])
#[1] 2 4
Or more compactly, join two conditions with &
subset(x, !duplicated(ObjectID) & component == 1)$distance
#[1] 2 4
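The issue in the OP's code is using the unique values of 'ObjectID' as a row index. A row index is normally a logical or numeric vector; a character vector is matched against the row names instead, and since these IDs are not row names, every lookup returns a row of NAs.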
unique(x[x$component==1,]$ObjectID)
#[1] "11AD1234" "11DA354"
If we have to convert this to logical, we can use %in%
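The answer stops before showing the %in% call. Here is a minimal sketch of what it could look like, assuming a small made-up data frame x. The OP's real data is not shown, so the values below are hypothetical and chosen only to be consistent with the outputs above.

```r
# Hypothetical data; the OP's real data frame is not shown.
x <- data.frame(
  ObjectID  = c("11AD1234", "11AD1234", "11DA354", "99ZZ000"),
  component = c(1, 1, 1, 2),
  distance  = c(2, 2, 4, 7),
  stringsAsFactors = FALSE
)

ids <- unique(x[x$component == 1, ]$ObjectID)

# A character index is looked up in the row names ("1", "2", ...),
# so none of the IDs match and every returned row is NA.
bad <- x[ids, ]

# %in% turns the IDs into a logical vector with one element per row.
good <- x[x$ObjectID %in% ids, ]
```

Here bad contains only NA rows, while good contains the rows whose ObjectID is one of the selected IDs.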
Subsetting rows in R generates mysterious NA row [Version 2.0]
Using your example (which doesn't show any NAs because you forgot to reassign the variable):
iris
iris$Petal.Width[iris$Petal.Width == 1.8] <- NA  # reassign; keeps the column numeric
iris[!is.na(iris$Petal.Width) & iris$Petal.Width == 2.0,]
this also works:
iris[complete.cases(iris$Petal.Width) & iris$Petal.Width== 2 ,]
which gives the following output:
    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
111          6.5         3.2          5.1           2 virginica
114          5.7         2.5          5.0           2 virginica
122          5.6         2.8          4.9           2 virginica
123          7.7         2.8          6.7           2 virginica
132          7.9         3.8          6.4           2 virginica
148          6.5         3.0          5.2           2 virginica
Read these links for an introduction to NAs in R:
http://www.statmethods.net/input/missingdata.html
http://www.ats.ucla.edu/stat/r/faq/missing.htm
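To see where the NA rows come from, here is a minimal sketch on a made-up data frame (df and its values are hypothetical, not the OP's data). Comparing NA with == yields NA, and a logical index that contains NA returns an all-NA row for each such element.

```r
# Hypothetical data frame with one missing width
df <- data.frame(id = 1:4, width = c(2.0, NA, 1.5, 2.0))

idx <- df$width == 2        # TRUE NA FALSE TRUE
with_na <- df[idx, ]        # the NA in idx becomes a row of NAs

# Either guard the comparison with !is.na(), or let which() drop the NA
clean1 <- df[!is.na(df$width) & df$width == 2, ]
clean2 <- df[which(df$width == 2), ]
```

Both clean1 and clean2 keep only the rows where width is actually 2.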
Subsetting R data frame with NAs in index variable
I think this does what you wanted:
> a[(a$Diab != 0) | is.na(a$Diab),]
  Diab INF HYP
2   NA   1   0
3    1   1   1
4    1   1   0
5    1   1  NA
8   NA   0   1
9   NA   1   1
You need to find entries in Diab which are either not equal to zero (!= 0) or equal to NA (is.na). The boolean operator | means OR.
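As a rough illustration of why the combined condition never yields NA, here is a sketch with a reconstructed data frame. Rows 2, 3, 4, 5, 8 and 9 follow the output above; rows 1, 6 and 7 are assumptions (Diab is taken to be 0 there, and their INF and HYP values are made up).

```r
# Partly hypothetical reconstruction of `a` (rows 1, 6, 7 are assumed)
a <- data.frame(
  Diab = c(0, NA, 1, 1, 1, 0, 0, NA, NA),
  INF  = c(0, 1, 1, 1, 1, 0, 1, 0, 1),
  HYP  = c(0, 0, 1, 0, NA, 1, 0, 1, 1)
)

# NA != 0 is NA, but NA | TRUE is TRUE, so cond contains no NA
cond <- (a$Diab != 0) | is.na(a$Diab)
kept <- a[cond, ]
```

Because is.na() is TRUE exactly where the comparison is NA, the OR fills every gap and the index is a clean logical vector.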
Subset in R producing na
This is a workaround, in response to your #2.
Looking at your code, there is a much easier way of subsetting data. Try this.
Check if this solves your issue.
library(dplyr)
active <- clinic %>%
  filter(Days.since.injury.physio > 20,
         Days.since.injury.physio < 35,
         Days.since.injury.F.U.1 > 27,
         Days.since.injury.F.U.1 < 63)
dplyr does wonders when it comes to subsetting and manipulating data. The %>% operator chains statements together, so you don't have to use the $ operator. filter() also drops rows where a condition evaluates to NA, so no NA rows appear in the result.
If, for some reason, you don't like this, you should look at the subset function in R.
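For the base R route, a minimal sketch with subset() might look like this. The clinic data frame below is a hypothetical stand-in with made-up values, since the OP's data is not shown. subset() treats missing values in the condition as FALSE, so rows with NA are dropped rather than returned as NA rows.

```r
# Hypothetical stand-in for the OP's `clinic` data
clinic <- data.frame(
  Days.since.injury.physio = c(25, NA, 30, 10),
  Days.since.injury.F.U.1  = c(40, 50, NA, 45)
)

# Conditions are joined with &; rows where the result is NA are dropped
active <- subset(
  clinic,
  Days.since.injury.physio > 20 & Days.since.injury.physio < 35 &
    Days.since.injury.F.U.1 > 27 & Days.since.injury.F.U.1 < 63
)
```

Only the first row satisfies all four conditions; the two rows containing NA are silently excluded.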
subsetting !is.na for multiple conditions unexpected results
I don't know why the initial approach didn't work, but I guess there is some fault in the chaining that I cannot see. Taking the opposite approach (removing the rows that fulfill the condition) seems to produce the desired output.
tmp <- data.frame(state = c(1, 1, 2, 2, 3, 3, 4, 5),
                  reg   = c(NA, 3, 6, NA, 9, 1, NA, 7),
                  gas   = c(NA, 5, NA, 9, 1, 3, NA, 1),
                  other = c(1, 2, 4, 2, 6, 8, 1, 1))
res <- tmp[-which(is.na(tmp$reg) & is.na(tmp$gas)), ]
res
#>   state reg gas other
#> 2     1   3   5     2
#> 3     2   6  NA     4
#> 4     2  NA   9     2
#> 5     3   9   1     6
#> 6     3   1   3     8
#> 8     5   7   1     1
Created on 2020-12-24 by the reprex package (v0.3.0)
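One caveat with -which(...), offered as a side note rather than a diagnosis of the original problem: when no row matches, which() returns integer(0), and a negative empty index selects zero rows instead of all of them. A sketch on a cut-down, made-up version of tmp shows the difference from plain logical negation.

```r
# Smaller, hypothetical version of tmp
tmp <- data.frame(reg = c(NA, 3, 6), gas = c(NA, 5, NA))
both_na <- is.na(tmp$reg) & is.na(tmp$gas)

res_which <- tmp[-which(both_na), ]   # works here: row 1 is dropped
res_logic <- tmp[!both_na, ]          # same rows via logical negation

# With no matching rows, which() returns integer(0)
# and -integer(0) selects nothing at all.
full  <- tmp[-1, ]
empty <- full[-which(is.na(full$reg) & is.na(full$gas)), ]  # 0 rows
safe  <- full[!(is.na(full$reg) & is.na(full$gas)), ]       # both rows kept
```

Negating the logical vector directly behaves the same in both cases, which is why it is often the safer form.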