Replace NAs in One Variable with Values from Another Variable

Replace NAs in one variable with values from another variable

One way is to use ifelse:

DF <- transform(DF, VAR3 = ifelse(!is.na(VAR1), VAR1, VAR2))

where transform was used to avoid typing DF$ over and over, but maybe you will prefer:

DF$VAR3 <- ifelse(!is.na(DF$VAR1), DF$VAR1, DF$VAR2)
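One caveat: ifelse() takes its attributes from the logical test, so a class such as Date on VAR1 and VAR2 is dropped from the result. The following is a minimal sketch with made-up data (the values are illustrative, not from the question). It shows the problem and an index-assignment alternative that keeps the class:

```r
# Toy data (hypothetical): two Date columns, VAR1 has a missing value
DF <- data.frame(
  VAR1 = as.Date(c("2020-01-01", NA, "2020-03-01")),
  VAR2 = as.Date(c("2019-01-01", "2019-02-01", "2019-03-01"))
)

# ifelse() returns bare numbers because the Date class is lost
lost <- ifelse(!is.na(DF$VAR1), DF$VAR1, DF$VAR2)
class(lost)

# Index assignment: copy VAR1, then overwrite only its NA positions
DF$VAR3 <- DF$VAR1
na_rows <- is.na(DF$VAR3)
DF$VAR3[na_rows] <- DF$VAR2[na_rows]
class(DF$VAR3)
```

For plain numeric or character columns the two forms give the same values, so the choice mostly matters for dates and factors.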

How to replace NAs of a variable with values from another dataframe

Here's a quick solution using data.table's binary join. It joins only gender with sex and leaves all the other columns untouched.

library(data.table)
setkey(setDT(df1), ID)
df1[df2, gender := i.sex][]
#     ID gender
#  1:  1      2
#  2:  2      2
#  3:  3      1
#  4:  4      2
#  5:  5      2
#  6:  6      2
#  7:  7      2
#  8:  8      2
#  9:  9      2
# 10: 10      2
# 11: 11      2
# 12: 12      2
# 13: 13      1
# 14: 14      1
# 15: 15      2
# 16: 16      2
# 17: 17      2
# 18: 18      2
# 19: 19      2
# 20: 20      2
# 21: 21      1
# 22: 22      2
# 23: 23      2
# 24: 24      2
# 25: 25      2
# 26: 26      2
# 27: 27      2
# 28: 28      2
# 29: 29      2
# 30: 30      2
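Note that the join above overwrites gender for every ID that matches, not only for the missing ones. If only the NA entries should be filled, a base R sketch with match() could look like the following. It is swapped in here for the data.table join, and the data are made up for illustration:

```r
# Hypothetical data: df1 has gaps in gender, df2 holds the lookup values
df1 <- data.frame(ID = 1:5, gender = c(1L, NA, 2L, NA, NA))
df2 <- data.frame(ID = c(2L, 3L, 4L), sex = c(2L, 1L, 1L))

# Position of each df1 ID in df2 (NA when the ID is absent from df2)
pos <- match(df1$ID, df2$ID)

# Fill only rows where gender is missing and a lookup value exists
fill <- is.na(df1$gender) & !is.na(pos)
df1$gender[fill] <- df2$sex[pos[fill]]

df1$gender
```

ID 3 keeps its existing gender even though df2 disagrees, and ID 5 stays NA because df2 has no entry for it.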

How to only replace NA with specific values based on a condition in another variable

I changed the vector test to use %in% and added a fallback (TRUE ~ Udd) that acts as the else branch, so existing non-missing values are kept.

library(dplyr)

d %>%
  mutate(Udd = case_when(
    is.na(Udd) & Edu < 8       ~ 1,
    is.na(Udd) & Edu %in% 8:11 ~ 2,
    is.na(Udd) & Edu > 11      ~ 3,
    TRUE                       ~ Udd  # keep existing values
  ))
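For comparison, the same banding can be sketched in base R with findInterval(). This is an alternative technique, not the case_when approach above, and it assumes Edu holds whole numbers. The data frame d below is invented for illustration.

```r
# Hypothetical data: Udd is partly missing, Edu is a whole-number score
d <- data.frame(
  Udd = c(NA, 2, NA, NA, 3, NA),
  Edu = c(5, 6, 8, 11, 9, 14)
)

# findInterval() returns 0 below 8, 1 for 8 to 11 and 2 from 12 upwards,
# so adding 1 yields the bands 1, 2 and 3
band <- findInterval(d$Edu, c(8, 12)) + 1

# Replace only the missing Udd values; existing ones are kept
na_rows <- is.na(d$Udd)
d$Udd[na_rows] <- band[na_rows]

d$Udd
```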

Creating a function to replace NAs from one data frame with values from another

Functions behave a little differently: assignments made inside a function do not change objects outside it. Rather than trying to modify the data frame from within the function, return the changed data frame and pass the column name as a string.

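# Fill the NAs in column x of df_raw (x is a column name given as a string)
# and return the modified data frame instead of changing it in place.
impute <- function(x) {
  df_raw[[x]] <- ifelse(is.na(df_raw[[x]]), miceoutput[[x]][inds], df_raw[[x]])
  df_raw
}

df_raw <- impute("PIDS_14")
df_raw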
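A variant that avoids relying on global objects passes both data frames in as arguments. This is only a sketch: the function name fill_from and its arguments are made up here, and it assumes the rows of both data frames are already aligned (the original uses an index vector inds for that).

```r
# Hypothetical helper: fill NAs in target[[col]] from source[[col]].
# Assumes row i of `source` corresponds to row i of `target`.
fill_from <- function(target, source, col) {
  stopifnot(nrow(target) == nrow(source))
  na_rows <- is.na(target[[col]])
  target[[col]][na_rows] <- source[[col]][na_rows]
  target  # return the modified copy; the caller's object is unchanged
}

# Made-up data for illustration
raw     <- data.frame(PIDS_14 = c(1, NA, 3, NA))
imputed <- data.frame(PIDS_14 = c(9, 2, 9, 4))

filled <- fill_from(raw, imputed, "PIDS_14")
filled$PIDS_14
```

Because the result is returned rather than written to a global, the same function can be reused for any pair of data frames and any column.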

Replace NA with the nearest value based on another variable, while keeping NA for observation which doesn't have non-missing neighbour

One option is case_when from the tidyverse. For a row where x is missing: if the previous row is closer in year and is not NA, take x from that row; if the next row is closer and is not NA, take x from that row. If the previous row is itself NA, fall back to the next row, and if the next row is NA, fall back to the previous one. Rows where x is not NA are returned unchanged.

library(tidyverse)

dat %>%
  mutate(x = case_when(
    is.na(x) & !is.na(lag(x)) & year - lag(year) < lead(year) - year ~ lag(x),
    is.na(x) & !is.na(lead(x)) & year - lag(year) > lead(year) - year ~ lead(x),
    is.na(x) & is.na(lag(x)) ~ lead(x),
    is.na(x) & is.na(lead(x)) ~ lag(x),
    TRUE ~ x
  ))

Output

   year  x
1  2000  1
2  2001  2
3  2002  3
4  2003  3
5  2005  5
6  2006  5
7  2007 NA
8  2008  9
9  2009  9
10 2010 10
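One edge case of the rules above: when both neighbours are non-missing and equally far away in year, none of the first four conditions matches and the value stays NA. The base R sketch below applies the same neighbour logic without dplyr and breaks such ties in favour of the previous row. The tie rule, the helper name fill_from_neighbour, and the input data are assumptions; the input was chosen so that the result agrees with the output shown above.

```r
# Hypothetical helper: fill each NA in x from the adjacent row (previous or
# next) whose year is closer; ties go to the previous row (an assumption).
fill_from_neighbour <- function(x, year) {
  prev_x <- c(NA, head(x, -1))    # value in the row above
  next_x <- c(x[-1], NA)          # value in the row below
  prev_gap <- c(Inf, diff(year))  # distance in years to the row above
  next_gap <- c(diff(year), Inf)  # distance in years to the row below

  # A missing neighbour can never be chosen
  prev_gap[is.na(prev_x)] <- Inf
  next_gap[is.na(next_x)] <- Inf

  use_prev <- is.na(x) & is.finite(prev_gap) & prev_gap <= next_gap
  use_next <- is.na(x) & is.finite(next_gap) & next_gap < prev_gap
  x[use_prev] <- prev_x[use_prev]
  x[use_next] <- next_x[use_next]
  x
}

# Assumed input (2004 is absent, several x values are missing)
year <- c(2000:2003, 2005:2010)
x <- c(1, 2, 3, NA, 5, NA, NA, NA, 9, 10)

filled <- fill_from_neighbour(x, year)
filled
```

The row for 2007 stays NA because both of its neighbours are missing in the input, which is the behaviour the question asks for.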

Conditionally replace NA with value from other rows

Your mutate won't work because its result is not assigned to a column. It should look like mutate(value = unique(value[!is.na(value)])), although this is not the approach I would take. Below, I create a lookup table of the distinct non-NA values and join it onto the original dataset. The valuedis column should then hold the values you want.

# Each weekday appears twice in a row, and the whole week is repeated twice
days <- c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday")
temporal <- rep(rep(days, each = 2), times = 2)
spatial <- rep(c("North", "South"), times = 10)
value <- c(NA, 2, 3, 4, 5, 6, 7, NA, 9, 10, 1, NA, 3, 4, 5, 6, 7, 8, 9, NA)

# data.frame() keeps value numeric; cbind() would coerce it to character
df <- data.frame(temporal, spatial, value)

library(dplyr)

# Lookup table: one row per temporal/spatial pair with its non-NA value
dfdis <- df %>%
  filter(!is.na(value)) %>%
  distinct(temporal, spatial, value) %>%
  rename(valuedis = value)

df2 <- left_join(df, dfdis, by = c("temporal", "spatial"))
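If a join is not needed, the same group-wise fill can be sketched in base R with ave(). This is an alternative to the lookup-table approach above and assumes each temporal/spatial combination has at most one distinct non-missing value. The small data set is a cut-down, made-up version of the example.

```r
# Hypothetical, shortened data: two days by two regions, each pair twice
temporal <- rep(rep(c("Monday", "Tuesday"), each = 2), times = 2)
spatial  <- rep(c("North", "South"), times = 4)
value    <- c(NA, 2, 3, 4, 1, NA, 3, NA)

# Within each temporal/spatial group, replace NA by the first non-NA value
filled <- ave(
  value, temporal, spatial,
  FUN = function(v) ifelse(is.na(v), v[!is.na(v)][1], v)
)

filled
```

A group that contains only NA values is left as NA, since there is nothing to copy from.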

Replace a value NA with the value from another column in R

Perhaps the most readable approach in base R is ifelse. Borrowing Richard's data frame, we could do:

df <- structure(list(A = c(56L, NA, NA, 67L, NA),
                     B = c(75L, 45L, 77L, 41L, 65L),
                     Year = c(1921L, 1921L, 1922L, 1923L, 1923L)),
                .Names = c("A", "B", "Year"),
                class = "data.frame", row.names = c(NA, -5L))
df$A <- ifelse(is.na(df$A), df$B, df$A)

Replace NA by value of another variable

You can use sapply in base R:

mydat[,c("X5","X6")] <- with(mydat, sapply(mydat[8:9],function(x) ifelse(is.na(X6),X5,X6)))

Giving the desired solution:

  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5  X6
1       158200        1715         2018  0  0  0 NA 107 107
2       158204        1715         2018  0  0  0 NA 105 105

Explanation:

ifelse examines whether the X6 value for a given row is NA, and if so, selects the value of X5 from that row. If X6 is not NA, then just X6 is used.

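sapply applies this ifelse call once for each of the two selected columns (columns 8 and 9, i.e. X5 and X6), so both receive the same filled-in values. The work across rows is done by ifelse itself, which is vectorised.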

with changes the environment so that you're "within" your mydat object so that you can refer to its parts without using $ or [].

Replacing NAs in a column with the values of other column

You can use coalesce:

library(dplyr)

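# Example vectors, reconstructed from the output shown below
Letters <- c("A", "B", "C", "D", "E")
Char <- c("a", "b", NA, "d", NA)
df1 <- data.frame(Letters, Char, stringsAsFactors = FALSE)

df1 %>%
  mutate(Char1 = coalesce(Char, Letters))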

  Letters Char Char1
1       A    a     a
2       B    b     b
3       C <NA>     C
4       D    d     d
5       E <NA>     E
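coalesce() also accepts more than two vectors and returns the first non-missing value at each position. For readers without dplyr, a rough base R equivalent can be sketched with Reduce(). The helper name coalesce_base and the Fallback vector are invented for this illustration, and the vectors are assumed to have equal length.

```r
# Hypothetical helper: first non-NA value across any number of vectors
coalesce_base <- function(...) {
  Reduce(function(a, b) ifelse(is.na(a), b, a), list(...))
}

Letters  <- c("A", "B", "C", "D", "E")
Char     <- c("a", "b", NA, "d", NA)
Fallback <- c(NA, NA, "c", NA, NA)  # made-up middle candidate

# Char is tried first, then Fallback, then Letters
result <- coalesce_base(Char, Fallback, Letters)
result
```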

Replace missing values (NA) in one data set with values from another where columns match

I would do this:

library(data.table)
setDT(DF1); setDT(DF2)

DF1[DF2, x := ifelse(is.na(x), i.x, x), on=c("y","z")]

which gives

     x y z
1: 153 a 1
2: 163 b 1
3: 184 d 1
4: 123 a 2
5: 145 e 2
6: 176 c 2
7: 124 b 1
8: 199 a 2

Comment: this approach is wasteful, since it joins all of DF1 when only the rows where is.na(x) need updating. An improved version (thanks, @Arun):

DF1[is.na(x), x := DF2[.SD, x, on=c("y", "z")]]

This way is analogous to @RHertel's answer.


From @Jakob's comment:

does this work for more than one x variable? If I want to fill up entire datasets with several columns?

You can enumerate the desired columns:

DF1[DF2, `:=`(
  x = ifelse(is.na(x), i.x, x),
  w = ifelse(is.na(w), i.w, w)
), on = c("y", "z")]

The expression could be constructed using lapply and substitute, probably, but if the set of columns is fixed, it might be cleanest just to write it out as above.
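For a column set that is not fixed, one option is to loop over the column names. The sketch below does this in base R with match() on a pasted key rather than with a data.table join, so it is an alternative technique, not the construction with lapply and substitute mentioned above. The data and the helper objects (cols, key1, key2, pos) are made up, and the sketch assumes each y/z combination occurs at most once in DF2 and that the separator does not appear in the key values.

```r
# Hypothetical data: DF1 has gaps in x and w, DF2 holds the replacements
DF1 <- data.frame(
  y = c("a", "b", "a"), z = c(1, 1, 2),
  x = c(NA, 163, NA), w = c(10, NA, NA)
)
DF2 <- data.frame(
  y = c("a", "a", "b"), z = c(1, 2, 1),
  x = c(153, 123, 999), w = c(11, 12, 13)
)

cols <- c("x", "w")  # columns to fill

# Build one composite key per row and locate each DF1 row in DF2
key1 <- paste(DF1$y, DF1$z, sep = "|")
key2 <- paste(DF2$y, DF2$z, sep = "|")
pos  <- match(key1, key2)

# Fill only the missing cells that have a matching row in DF2
for (col in cols) {
  fill <- is.na(DF1[[col]]) & !is.na(pos)
  DF1[[col]][fill] <- DF2[[col]][pos[fill]]
}

DF1
```

Existing values such as x = 163 are left alone even though DF2 holds a different value for the same key.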


