Using ifelse() to replace NAs in one data frame by referencing another data frame of different length
Try the following code, which takes your original statement and makes a small tweak to the yes argument (the value used where the test is TRUE) of the ifelse function:
> df1$B <- ifelse(is.na(df1$B) == TRUE, df2$B[df2$A %in% df1$A], df1$B)
# Switched '==' to '%in%' ---^
> df1
            B          C  A
1   1.7169811 2012-10-01  0
2   0.3396226 2012-10-01  5
3   4.0000000 2012-10-01 10
4   0.1509434 2012-10-01 15
5   0.0754717 2012-10-01 20
6  20.0000000 2012-10-01 25
7   1.7169811 2012-10-01  0
8   0.3396226 2012-10-01  5
9   5.0000000 2012-10-01 10
10  5.0000000 2012-10-01 15
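Note that df2$B[df2$A %in% df1$A] returns values in df2's row order and ifelse() recycles them, so the result is only right when the keys in df1$A repeat in the same order as df2$A, as they appear to in the output above. A minimal sketch of a variant that does not depend on row order uses match() to look up each row's key. The data below are made up for illustration and are not the asker's data.

```r
# Hypothetical data: df1 has gaps in B, df2 holds one fill-in value per key A.
df1 <- data.frame(A = c(10, 0, 5, 0, 10),
                  B = c(NA, 2.5, NA, NA, 4))
df2 <- data.frame(A = c(0, 5, 10),
                  B = c(1.7, 0.3, 9.9))

# match() gives, for each row of df1, the row of df2 with the same key,
# so the lookup vector has the same length and order as df1.
idx <- match(df1$A, df2$A)

# Keep existing values; take the looked-up value only where B is NA.
df1$B <- ifelse(is.na(df1$B), df2$B[idx], df1$B)
df1
```

Keys in df1 that have no partner in df2 get NA from match(), so those rows simply stay NA.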
Creating a function to replace NAs from one data frame with values from another
Functions behave a little differently. Assignments inside a function modify a local copy, and it is not good practice to change a global data frame from within a function anyway. Instead, return the changed data frame from the function and pass the column name as a string.
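impute <- function(x) {
  # x is the column name as a string; df_raw, miceoutput and inds are
  # looked up in the global environment, as in the question
  df_raw[[x]] <- ifelse(is.na(df_raw[[x]]), miceoutput[[x]][inds], df_raw[[x]])
  df_raw  # return the modified copy; the global df_raw changes only on reassignment
}
df_raw <- impute("PIDS_14")
df_raw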
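The function above still reads df_raw, miceoutput and inds from the global environment. A minimal sketch of a version that takes everything it needs as arguments might look like the following. It drops inds and assumes the completed data frame has the same rows in the same order as the raw one, which is an assumption and not necessarily the asker's setup. The argument names and the sample data are hypothetical.

```r
# Hypothetical stand-ins for the asker's objects.
df_raw     <- data.frame(PIDS_14 = c(1, NA, 3, NA))
miceoutput <- data.frame(PIDS_14 = c(1, 2, 3, 4))  # assumed: same rows, same order

impute <- function(data, completed, col) {
  # Logical index of the rows that need a value.
  missing <- is.na(data[[col]])
  # Copy the completed values into those rows only.
  data[[col]][missing] <- completed[[col]][missing]
  data  # return the modified copy
}

df_imputed <- impute(df_raw, miceoutput, "PIDS_14")
df_imputed
```

Because the result is returned rather than assigned globally, df_raw keeps its NAs until you overwrite it yourself.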
Replace NAs of a column with the single value of the same column
This is the typical use case for tidyr::fill().
library(tidyr)
fill(df, B3:B6, .direction = "updown")
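As a small illustration of what .direction = "updown" does, here is a sketch on made-up data with two columns (B3 and B4 only, not the asker's df). Each column holds a single known value, which is carried down first and then up.

```r
library(tidyr)

# Hypothetical data: one known value per column, the rest missing.
df <- data.frame(id = c(1, 1, 1),
                 B3 = c(NA, 7, NA),
                 B4 = c(2, NA, NA))

# "updown" fills downwards first, then upwards for any leading NAs.
filled <- fill(df, B3:B4, .direction = "updown")
filled
```

If the single value is per group rather than per column, grouping the data with dplyr::group_by() before calling fill() restricts the filling to each group.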
Replace a value in a dataframe by using other matching IDs of another dataframe in R
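What about this? You only need to put the condition in the index on both sides of the assignment:
df1$status[df1$ID %in% df2$ID] <- df2$status[df2$ID %in% df1$ID]
This assumes that each shared ID appears once in each data frame and that the shared IDs come in the same order in both.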
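When the IDs are not in the same order in both data frames, a lookup with match() pairs each row with its partner explicitly. A minimal sketch on made-up data (the ID and status values are hypothetical):

```r
# Hypothetical data: df2 lists the IDs in a different order than df1.
df1 <- data.frame(ID = c(3, 1, 2, 4),
                  status = c("old", "old", "old", "old"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(ID = c(1, 3),
                  status = c("new1", "new3"),
                  stringsAsFactors = FALSE)

# For each row of df1, the position of the same ID in df2 (NA if absent).
idx   <- match(df1$ID, df2$ID)
found <- !is.na(idx)

# Overwrite only the rows that have a partner in df2.
df1$status[found] <- df2$status[idx[found]]
df1
```

Rows whose ID is missing from df2 keep their original status.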
Spelling correction using a reference in one data frame to fix text in another (R)
With stringr::str_replace_all you can use a named vector of patterns and replacements:
library(stringr)
df2$result = str_replace_all(string = df2$text, pattern = setNames(df1$fixed_text, nm = df1$old_text))
df2
#    text      result
# 1 typo1 typo1_fixed
# 2    Hi          Hi
# 3 typo2 typo2_fixed
# 4   Bye         Bye
# 5 typo3 typo3_fixed
With base R I'd use a for loop. Your mapply error is because of a typo (df1$new_text should be df1$fixed_text), but addressing that will lead to new errors because of the grepl ... it's hard to have mapply modify a single column multiple times. But a for loop is quick to write - see Method 2 below.
If you are searching for exact full-string matches as in this example, you don't need regex at all. You don't need regex to see that "a" == "a", you only need regex functions to see that "abc" contains "a". See Method 3 below.
# Method 1
library(stringr)
df2$result1 = str_replace_all(string = df2$text, pattern = setNames(df1$fixed_text, nm = df1$old_text))
# Method 2
df2$result2 = df2$text
for (i in seq_len(nrow(df1))) {
  df2$result2 = gsub(pattern = df1$old_text[i], replacement = df1$fixed_text[i], x = df2$result2)
}
# Method 3
df2$results3 = df2$text
matches = match(df2$text, df1$old_text)
df2$results3[!is.na(matches)] = df1$fixed_text[na.omit(matches)]
df2
#    text     result1     result2    results3
# 1 typo1 typo1_fixed typo1_fixed typo1_fixed
# 2    Hi          Hi          Hi          Hi
# 3 typo2 typo2_fixed typo2_fixed typo2_fixed
# 4   Bye         Bye         Bye         Bye
# 5 typo3 typo3_fixed typo3_fixed typo3_fixed
(And even if you are searching within strings, if you are doing exact matches without regex special characters, you can use the stringr::fixed() function or the fixed = TRUE argument for gsub to speed things up.)
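To make the fixed = TRUE suggestion concrete, here is a sketch of the for loop with literal matching in base R. The two data frames are guessed from the printed output above and may not match the asker's data exactly. The last lines use a made-up pattern to show that fixed = TRUE also stops regex metacharacters such as "." from being interpreted.

```r
# Data assumed from the printed output; not taken from the question itself.
df1 <- data.frame(old_text   = c("typo1", "typo2", "typo3"),
                  fixed_text = c("typo1_fixed", "typo2_fixed", "typo3_fixed"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(text = c("typo1", "Hi", "typo2", "Bye", "typo3"),
                  stringsAsFactors = FALSE)

# Same loop as the for-loop method, but patterns are matched literally.
df2$result <- df2$text
for (i in seq_len(nrow(df1))) {
  df2$result <- gsub(df1$old_text[i], df1$fixed_text[i], df2$result, fixed = TRUE)
}

# Literal vs. regex matching of a pattern containing a metacharacter.
literal <- gsub("a.c", "X", c("a.c", "abc"), fixed = TRUE)  # only "a.c" is replaced
regex   <- gsub("a.c", "X", c("a.c", "abc"))                # "." matches any character
```

With stringr the equivalent is to wrap the pattern in fixed(), for example str_replace_all(x, fixed("a.c"), "X").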