Using ifelse() to replace NAs in one data frame by referencing another data frame of different length

Try the following code, which takes your original statement and makes one small tweak to the value ifelse() returns where the test is TRUE (its yes argument):

df1$B <- ifelse(is.na(df1$B) == TRUE, df2$B[df2$A %in% df1$A], df1$B)
# The tweak: '==' became '%in%' in df2$B[df2$A %in% df1$A]
df1
#             B          C  A
# 1   1.7169811 2012-10-01  0
# 2   0.3396226 2012-10-01  5
# 3   4.0000000 2012-10-01 10
# 4   0.1509434 2012-10-01 15
# 5   0.0754717 2012-10-01 20
# 6  20.0000000 2012-10-01 25
# 7   1.7169811 2012-10-01  0
# 8   0.3396226 2012-10-01  5
# 9   5.0000000 2012-10-01 10
# 10  5.0000000 2012-10-01 15
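
One thing to keep in mind with the ifelse() line above: df2$B[df2$A %in% df1$A] returns values in df2's row order and is recycled to the length of df1, so it lines up only when df1 repeats df2's keys in that same order. A minimal sketch of an order-independent lookup with match() follows. The sample data is made up for illustration and is not the original df1 and df2.

```r
# Hypothetical sample data (assumption): df1 is longer than df2 and repeats keys.
df1 <- data.frame(A = c(0, 5, 10, 15, 0, 5),
                  B = c(1.5, NA, 4, NA, NA, 0.3))
df2 <- data.frame(A = c(0, 5, 10, 15),
                  B = c(1.7, 0.3, 0.1, 0.2))

# match() gives, for every row of df1, the row of df2 with the same A
# (NA when there is no match), so the lookup has the same length as df1.
lookup <- df2$B[match(df1$A, df2$A)]

# Keep existing values, take the looked-up value only where B is NA.
df1$B <- ifelse(is.na(df1$B), lookup, df1$B)
df1
```

Because the lookup vector is built row by row from the key, the result does not depend on how either data frame is sorted.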

Creating a function to replace NAs from one data frame with values from another

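Functions behave a little differently: an assignment to a data frame inside a function changes only the function's local copy. Relying on a function to change a data frame as a side effect is not good practice. Instead, pass the column name as a string, return the changed data frame from the function, and assign the result back.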

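# df_raw, miceoutput and inds are read from the calling environment, as in the question
impute <- function(x) {
  # x is a column name given as a string; only the local copy of df_raw is modified
  df_raw[[x]] <- ifelse(is.na(df_raw[[x]]), miceoutput[[x]][inds], df_raw[[x]])
  df_raw  # return the modified copy
}

# Assign the returned data frame back to update df_raw
df_raw <- impute("PIDS_14")
df_raw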
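
As a variation on the same idea, the function can take both data frames as arguments instead of reading them from the calling environment. This is only a sketch: the name impute_column, its parameters, and the sample data are hypothetical, and it assumes the imputed data frame has the same number of rows, in the same order, as the raw one.

```r
# Hypothetical helper (assumption): every input is an argument, no globals.
impute_column <- function(raw, imputed, col) {
  stopifnot(nrow(raw) == nrow(imputed))
  # Take the imputed value only where the raw value is missing.
  raw[[col]] <- ifelse(is.na(raw[[col]]), imputed[[col]], raw[[col]])
  raw  # return the modified copy
}

# Made-up data for illustration.
df_raw <- data.frame(PIDS_14 = c(1, NA, 3, NA))
df_imputed <- data.frame(PIDS_14 = c(1, 2, 3, 4))

df_new <- impute_column(df_raw, df_imputed, "PIDS_14")
df_new
```

Here df_raw itself is left untouched, which shows that the function works on a copy; the caller decides whether to overwrite the original.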

Replace NAs of a column with the single value of the same column

This is the typical case for tidyr::fill().

library(tidyr)

fill(df, B3:B6, .direction = "updown")
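
A small self-contained sketch of what that call does, using made-up data in which each column holds exactly one known value (the column names B3 and B4 and the values are assumptions for illustration):

```r
library(tidyr)

# Hypothetical data: one known value per column, everything else NA.
df <- data.frame(id = 1:3,
                 B3 = c(NA, 7, NA),
                 B4 = c(2, NA, NA))

# "updown" fills downward first, then upward, so NAs that come
# before the known value are filled as well.
filled <- fill(df, B3:B4, .direction = "updown")
filled
```

Note that fill() returns a new data frame, so the result has to be assigned if it should be kept.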

Replace a value in a dataframe by using other matching IDs of another dataframe in R

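What about this? You only have to put the condition into the index on both sides of the assignment. Note that this assumes the matching IDs are unique and appear in the same order in both data frames.

df1$status[df1$ID %in% df2$ID] <- df2$status[df2$ID %in% df1$ID]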
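
If the IDs might be ordered differently in the two data frames, a match()-based assignment is one way to pair each row with its own replacement. The following is a minimal sketch on made-up data, not the data from the question:

```r
# Hypothetical sample data: df2 lists its IDs in a different order than df1.
df1 <- data.frame(ID = c(1, 2, 3, 4),
                  status = c("a", "b", "c", "d"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(4, 2),
                  status = c("D", "B"),
                  stringsAsFactors = FALSE)

# For each row of df1: the row of df2 with the same ID, or NA if absent.
idx <- match(df1$ID, df2$ID)
hit <- !is.na(idx)

# Replace only the rows that have a partner in df2.
df1$status[hit] <- df2$status[idx[hit]]
df1
```

With this data, ID 2 receives "B" and ID 4 receives "D" even though df2 lists them in the opposite order.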

Spelling correction using a reference in one data frame to fix text in another (R)

With stringr::str_replace_all you can use a named vector of patterns and replacements:

library(stringr)
df2$result = str_replace_all(string = df2$text, pattern = setNames(df1$fixed_text, nm = df1$old_text))
df2
#    text      result
# 1 typo1 typo1_fixed
# 2    Hi          Hi
# 3 typo2 typo2_fixed
# 4   Bye         Bye
# 5 typo3 typo3_fixed

With base R I'd use a for loop. Your mapply error comes from a typo (df1$new_text should be df1$fixed_text), but fixing that will lead to new errors because of the grepl: it's hard to have mapply modify a single column multiple times. A for loop is quick to write, though; see Method 2 below.

If you are searching for exact full-string matches, as in this example, you don't need regex at all. You don't need regex to see that "a" == "a"; you only need regex functions to see that "abc" contains "a". See Method 3 below.

# Method 1
library(stringr)
df2$result1 = str_replace_all(string = df2$text, pattern = setNames(df1$fixed_text, nm = df1$old_text))

# Method 2
df2$result2 = df2$text
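for (i in seq_len(nrow(df1))) {  # seq_len() is safe even if df1 has no rows
  # apply one correction at a time to the running result
  df2$result2 = gsub(pattern = df1$old_text[i], replacement = df1$fixed_text[i], x = df2$result2)
}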

# Method 3
df2$results3 = df2$text
matches = match(df2$text, df1$old_text)
df2$results3[!is.na(matches)] = df1$fixed_text[na.omit(matches)]

df2
#    text     result1     result2    results3
# 1 typo1 typo1_fixed typo1_fixed typo1_fixed
# 2    Hi          Hi          Hi          Hi
# 3 typo2 typo2_fixed typo2_fixed typo2_fixed
# 4   Bye         Bye         Bye         Bye
# 5 typo3 typo3_fixed typo3_fixed typo3_fixed
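
To try the base R methods end to end, some input is needed. The data below is an assumption reconstructed from the printed output above, not taken from the question itself. The sketch runs Methods 2 and 3 on it and checks that they agree.

```r
# Assumed input, reconstructed from the printed output (not the original data).
df1 <- data.frame(old_text = c("typo1", "typo2", "typo3"),
                  fixed_text = c("typo1_fixed", "typo2_fixed", "typo3_fixed"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(text = c("typo1", "Hi", "typo2", "Bye", "typo3"),
                  stringsAsFactors = FALSE)

# Method 2: apply each correction in turn with gsub()
df2$result2 <- df2$text
for (i in seq_len(nrow(df1))) {
  df2$result2 <- gsub(df1$old_text[i], df1$fixed_text[i], df2$result2)
}

# Method 3: exact full-string lookup with match(), no regex involved
df2$results3 <- df2$text
matches <- match(df2$text, df1$old_text)
df2$results3[!is.na(matches)] <- df1$fixed_text[na.omit(matches)]

# Both methods give the same column on this data
identical(df2$result2, df2$results3)
```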

(And even if you are searching within strings, if you are doing exact matches without regex special characters you can use the stringr::fixed() function or the fixed = TRUE argument of gsub to speed things up.)
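
Besides speed, fixed = TRUE also changes what counts as a match when the pattern does contain a regex special character. A short illustration with made-up strings:

```r
x <- c("a.b", "axb")

# Regex matching: "." stands for any single character, so both strings match.
regex_result <- gsub("a.b", "Z", x)

# Literal matching: only the exact text "a.b" is replaced.
fixed_result <- gsub("a.b", "Z", x, fixed = TRUE)

regex_result  # "Z" "Z"
fixed_result  # "Z" "axb"
```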


