When Trying to Replace Values, "Missing Values Are Not Allowed in Subscripted Assignments of Data Frames"

When trying to replace values, missing values are not allowed in subscripted assignments of data frames

You can use ifelse, like so:

pe94.person$foo <- ifelse(!is.na(pe94.person$H01) & pe94.person$H01 == 12, 0, pe94.person$H03)

Check whether foo meets your criteria, then assign it to pe94.person$H03 directly. I find it safer to assign the result to a new variable and use that in subsequent analysis.
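A minimal sketch of the ifelse approach on made-up data may help when checking the result. The values below are hypothetical; only the column names H01 and H03 come from the question.

```r
# Hypothetical stand-in for pe94.person; only the column names come from the question.
pe94.person <- data.frame(H01 = c(12, 5, NA, 12),
                          H03 = c(1, 2, 3, 4))

# !is.na() guards the comparison, so the test is never NA.
cond <- !is.na(pe94.person$H01) & pe94.person$H01 == 12

# Write to a new column first so the original H03 stays available for checking.
pe94.person$foo <- ifelse(cond, 0, pe94.person$H03)
print(pe94.person)
```

Rows where H01 is NA keep their H03 value, because the guarded test is FALSE for them rather than NA.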

Replace specific value in R

Judging by your error message, your dat$i_huvisfatin_v00 column already contains NA. (A column that simply lacks the value 16527.98 would not trigger this error; the assignment would just change nothing.)

dat$i_huvisfatin_v00 == 16527.98 returns a logical vector, and a data frame assignment cannot use it as a row index if it contains NA. Wrapping the comparison in which() solves the problem, because which() returns only the positions that are TRUE.

dat[which(dat$i_huvisfatin_v00 == 16527.98), "i_huvisfatin_v00"] <- NA
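To see both behaviours side by side, here is a small sketch on hypothetical data (only the column name and the value 16527.98 come from the question). It captures the error with tryCatch instead of stopping, then applies the which() version.

```r
# Hypothetical data; only the column name and the value 16527.98 come from the question.
dat <- data.frame(i_huvisfatin_v00 = c(10.5, 16527.98, NA, 42))

idx <- dat$i_huvisfatin_v00 == 16527.98   # logical index with an NA in position 3

# A logical row index containing NA is rejected by data frame assignment.
res <- tryCatch({
  dat[idx, "i_huvisfatin_v00"] <- NA
  "no error"
}, error = function(e) conditionMessage(e))
print(res)

# which() keeps only the positions that are TRUE, so the NA drops out of the index.
dat[which(idx), "i_huvisfatin_v00"] <- NA
print(dat)
```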

NAs are not allowed in subscripted assignments

You're trying to assign to these three rows:

> match(dat$code, age.and.sex$code)
[1] 1 2 NA

because the third value of dat$code has no counterpart in age.and.sex$code (the two vectors are not the same length), so match() returns NA for it.

I'm not sure what you actually mean to be matching, but you might just try subsetting to the first two observations, or na.omit, etc.
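If the goal is to copy values across only for the codes that do match, one base R possibility is to drop the NA positions before assigning. This is a sketch on hypothetical tables shaped like the ones in the question, not necessarily what the asker intended.

```r
# Hypothetical tables; the real dat and age.and.sex are not shown in this answer.
dat <- data.frame(code = c("A11", "B22", "C33"),
                  age  = c(NA, NA, 12),
                  stringsAsFactors = FALSE)
age.and.sex <- data.frame(code = c("A11", "B22"),
                          age  = c(15, 10),
                          stringsAsFactors = FALSE)

m <- match(dat$code, age.and.sex$code)   # "C33" has no match, so its entry is NA
found <- !is.na(m)

# Assign only to rows that matched, pulling values from the matched positions.
dat$age[found] <- age.and.sex$age[m[found]]
print(dat)
```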

But a better way to join data from two tables is to use a join.

library(data.table)
dat <- data.table(dat)
setkey(dat,code)
age.and.sex <- data.table(age.and.sex)
setkey(age.and.sex,code)
> dat[age.and.sex]
   code age sex more i.age i.sex
1:  A11  NA   m    7    15     m
2:  B22  NA   f    4    10     f

Note how the columns of the inner table get appended to those of the outer table.

More... Per @joran's suggestion...you can use this technique to fill in missing observations:

joined <- dat[age.and.sex]
joined[is.na(age),age:=i.age] #only replace the value missing from left table
joined[,c("i.age","i.sex"):=NULL]
> joined
   code age sex more
1:  A11  15   m    7
2:  B22  10   f    4

Update to address your comment...just reverse the join. There are some cleverer ways to do this less manually, but this should be simple to follow:

joined <- age.and.sex[dat]
joined[is.na(age),age:=i.age]
joined[is.na(sex),sex:=i.sex]
joined[,c("i.age","i.sex"):=NULL]
> joined
   code age sex more
1:  A11  15   m    7
2:  B22  10   f    4
3:  C33  12   m    9

If this technique is to your liking you should definitely read ?data.table and the related vignette to learn more about joins.

How to fill in missing cells of a data frame in R?

If you want to modify all cells that are not 20, including other valid values for age, I would do the following:

# Create a data frame that also contains another valid age
df <- data.frame(name = c("Tommy", "John", "Dan", "Bob"), age = c(20, NA, NA, 12))

# Replace every value other than 20 (including NA) with 15
df[df$age != 20 | is.na(df$age), "age"] <- 15

   name age
1 Tommy  20
2  John  15
3   Dan  15
4   Bob  15

NAs are not allowed in subscripted assignments

Your logic also needs to exclude NAs from the subset. See the following example. Note that both subset vectors are computed and stored before x is modified.

x <- c(1,3,5,7,NA,2,4,6)
subset1 <- x>=5 & !is.na(x)
subset2 <- x<5 & !is.na(x)

x[subset1] <- which(subset1)
x[subset2] <- 10*which(subset2)
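The sketch below repeats the same example but first attempts the unguarded assignment inside tryCatch, to show what the !is.na() guard prevents. The variable res is introduced here only for illustration.

```r
x <- c(1, 3, 5, 7, NA, 2, 4, 6)

# Without the !is.na() guard the logical index contains NA, and assigning
# more than one value through it fails.
res <- tryCatch({
  x[x >= 5] <- which(x >= 5)
  "no error"
}, error = function(e) conditionMessage(e))
print(res)

# With the guard, both subsets are computed from the original x ...
subset1 <- x >= 5 & !is.na(x)
subset2 <- x < 5 & !is.na(x)

# ... and only then applied, so the second test never sees values
# written by the first assignment.
x[subset1] <- which(subset1)
x[subset2] <- 10 * which(subset2)
print(x)
```

Had subset2 been computed after the first assignment, positions 3 and 4 (now holding 3 and 4) would have been picked up by the x < 5 test as well.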

Fill in missing values in column with a different dataframe

One option is a join with data.table:

library(data.table)
setDT(df1)[df2, Date := i.Date, on = .(Alphabet)]
df1
#   Alphabet       Date Colour
#1:      ABC 2018-09-10  green
#2:      DEF 2017-06-11    red
#3:      GHI 2016-05-12   blue
#4:      JKL 2017-06-07 yellow
#5:      MNO 2018-08-03 orange
#6:      PQR 2019-10-07  brown

Update

Using the new 'df2a' dataset (defined in the data section below):

i1 <- is.na(df1$Date) | df1$Date %in% "Unknown"
setDT(df1)[df2a[df2a$Alphabet %in% df1$Alphabet[i1], ],
           Date := i.Date, on = .(Alphabet)]
df1
#   Alphabet       Date Colour
#1:      ABC 2018-09-10  green
#2:      DEF 2017-06-11    red
#3:      GHI 2016-05-12   blue
#4:      JKL 2017-06-07 yellow
#5:      MNO 2018-08-03 orange
#6:      PQR 2019-10-07  brown

Or using match from base R:

i1 <- match(df2$Alphabet, df1$Alphabet)
df1$Date[i1] <- df2$Date
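The two lines above assume every key in df2 exists in df1. With a lookup table that has extra keys, such as df2a, match() returns NA for the unmatched ones and the assignment fails. A possible NA-safe variant, sketched with the Alphabet and Date columns from the data section (Colour is omitted for brevity, and ok is a helper name introduced here):

```r
df1 <- data.frame(Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO", "PQR"),
                  Date = c("2018-09-10", "2017-06-11", "2016-05-12", NA, NA, "Unknown"),
                  stringsAsFactors = FALSE)
df2a <- data.frame(Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"),
                   Date = c("2017-06-07", "2018-08-03", "2019-10-07",
                            "2019-11-08", "2019-12-08"),
                   stringsAsFactors = FALSE)

i1 <- match(df2a$Alphabet, df1$Alphabet)   # NA for "STU" and "VWX"
ok <- !is.na(i1)                           # keys that actually exist in df1

# Use only the matched positions on both sides of the assignment.
df1$Date[i1[ok]] <- df2a$Date[ok]
print(df1)
```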

Data

df1 <- structure(list(Alphabet = c("ABC", "DEF", "GHI", "JKL", "MNO", 
"PQR"), Date = c("2018-09-10", "2017-06-11", "2016-05-12", NA,
NA, "Unknown"), Colour = c("green", "red", "blue", "yellow",
"orange", "brown")), class = "data.frame", row.names = c(NA,
-6L))

df2 <- structure(list(Alphabet = c("JKL", "MNO", "PQR"), Date = c("2017-06-07",
"2018-08-03", "2019-10-07")), class = "data.frame", row.names = c(NA,
-3L))

df2a <- structure(list(Alphabet = c("JKL", "MNO", "PQR", "STU", "VWX"
), Date = c("2017-06-07", "2018-08-03", "2019-10-07", "2019-11-08",
"2019-12-08")), class = "data.frame", row.names = c(NA, -5L))

Error when replacing a value in a data frame with NAs

The first problem is that the column must be referred to by its name, not by one of its values (probably just a typo in your question): "y" and not "yes".

Then another problem arises when you use == and it tries to think of what to do with the NA in the third row:

x=="NS"
[1] TRUE TRUE NA

Should the third row be kept or not? Its index is neither TRUE nor FALSE, so the assignment cannot "decide" and gives an error.

With %in% (which is defined as match(x, table, nomatch = 0) > 0), we get:

x %in% "NS"
[1] TRUE TRUE FALSE

There you go: NA doesn't match the value "NS", so match() returns 0, which becomes FALSE as a logical, and the row is not selected.

Thus, to get what you want:

z[z$x %in% "NS", "y"] <- "a"
z
#     x y
#1   NS a
#2   NS a
#3 <NA> b
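For a self-contained check, the sketch below rebuilds a data frame shaped like the output above. The construction of z is an assumption, since the question's original data is not reproduced here.

```r
# Hypothetical reconstruction of z, inferred from the printed result.
z <- data.frame(x = c("NS", "NS", NA),
                y = c("b", "b", "b"),
                stringsAsFactors = FALSE)

eq  <- z$x == "NS"     # the NA in x propagates into the index
inn <- z$x %in% "NS"   # NA simply counts as "no match"
print(eq)
print(inn)

# The %in% index contains no NA, so the assignment is accepted.
z[inn, "y"] <- "a"
print(z)
```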

