Replacing NA Values from Another Dataframe by ID

How to replace NAs of a variable with values from another dataframe

Here's a quick solution using data.table's binary join. It sets gender from sex for every ID found in df2 and leaves all other columns untouched.

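library(data.table)
# Convert df1 to a data.table by reference and key it by ID
setkey(setDT(df1), ID)
# Update join: set gender to df2's sex on matching IDs; the trailing [] prints the result
df1[df2, gender := i.sex][]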
#     ID gender
#  1:  1      2
#  2:  2      2
#  3:  3      1
#  4:  4      2
#  5:  5      2
#  6:  6      2
#  7:  7      2
#  8:  8      2
#  9:  9      2
# 10: 10      2
# 11: 11      2
# 12: 12      2
# 13: 13      1
# 14: 14      1
# 15: 15      2
# 16: 16      2
# 17: 17      2
# 18: 18      2
# 19: 19      2
# 20: 20      2
# 21: 21      1
# 22: 22      2
# 23: 23      2
# 24: 24      2
# 25: 25      2
# 26: 26      2
# 27: 27      2
# 28: 28      2
# 29: 29      2
# 30: 30      2
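
If only the missing values should be filled, rather than every matched row being overwritten, a variant of the same update join might look like the sketch below. The sample tables are made up for illustration, and fcoalesce assumes data.table 1.12.4 or later.

```r
library(data.table)

# Hypothetical sample data (not from the original question)
df1 <- data.table(ID = 1:5, gender = c(1L, NA, 2L, NA, NA))
df2 <- data.table(ID = c(2L, 4L, 3L), sex = c(2L, 1L, 1L))

# Update join on ID: keep an existing gender, fall back to df2's sex only where gender is NA
df1[df2, on = "ID", gender := fcoalesce(gender, i.sex)]

df1
# gender is now 1, 2, 2, 1, NA: ID 3 keeps its original value, ID 5 has no match in df2
```

Using on = avoids having to key df1 first, and the column types of gender and sex must match for fcoalesce to work.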

Replace NAs in dataframe with values from second dataframe based on multiple criteria

You can build a unique key from columns A and B in both dataframes and use it to look up the missing C values of df2 in df1.

unique_key1 <- paste(df1$A, df1$B)
unique_key2 <- paste(df2$A, df2$B)
inds <- is.na(df2$C)
df2$C[inds] <- df1$C[match(unique_key2[inds], unique_key1)]
df2

# A B C E
#1 20210901 15:00 74 A 74
#2 20210903 17:00 27 C 27
#3 20210904 18:00 60 D 60
#4 20210906 20:00 7 F 7
#5 20210907 21:00 96 G 96
#6 20210908 22:00 98 H 98
#7 20210909 23:00 38 I 38
#8 20210910 00:00 89 J 89
#9 20210912 02:00 69 L 69
#10 20210913 03:00 72 M 72
#11 20210914 04:00 76 N 76
#12 20210915 05:00 63 O 63
#13 20210916 06:00 13 P 13
#14 20210918 08:00 25 R 25
#15 20210919 09:00 92 S 92
#16 20210920 10:00 21 T 21
#17 20210921 11:00 79 U 79
#18 20210922 12:00 41 V 41
#19 20210924 14:00 97 X 97
#20 20210925 15:00 16 Y 16

data

cbind creates a matrix; use data.frame to create dataframes.

df1 <- data.frame(A, B, C, D)
df2 <- data.frame(A, B, C, E)
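
As a self-contained illustration of the key-based lookup above, here is a minimal sketch on made-up data. The values and the "|" separator are assumptions chosen for the example; an explicit separator simply makes it less likely that two different A/B pairs paste to the same key.

```r
# Hypothetical sample data: df1 is complete, df2 has gaps in C
df1 <- data.frame(A = c(1, 1, 2, 2), B = c("x", "y", "x", "y"), C = c(10, 20, 30, 40))
df2 <- data.frame(A = c(2, 1, 1), B = c("y", "x", "y"), C = c(NA, 5, NA))

# Composite keys built from both matching columns
key1 <- paste(df1$A, df1$B, sep = "|")
key2 <- paste(df2$A, df2$B, sep = "|")

# Fill only the NA entries of df2$C with the value from the matching df1 row
inds <- is.na(df2$C)
df2$C[inds] <- df1$C[match(key2[inds], key1)]

df2$C
# 40 5 20
```

Rows of df2 whose key has no counterpart in df1 stay NA, because match returns NA for them.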

Replacing dataframe values given multiple conditions from another dataframe with R

You could use

library(purrr)
library(dplyr)

df1 %>%
  mutate(
    across(
      starts_with("sampl"),
      ~imap_dbl(.x, ~ifelse(is.null(df2[.y, .x]), NA_real_, df2[.y, .x])),
      .names = "{.col}_snow"
    ),
    .keep = "unused"
  )

to get

  CellID sampl1_snow sampl2_snow sampl3_snow
1      1         0.1         0.4         0.6
2      2         0.1         0.5         0.7
3      3         0.1         0.9         0.9
4      4         0.5          NA          NA
5      5          NA          NA          NA
6      6          NA          NA          NA

Data

For df2 I used

structure(list(CellID = c(1, 2, 3, 4, 5, 6), oct = c(0.1, 0.1, 
0.1, 0.1, 0.1, 0.1), nov = c(0.4, 0.5, 0.4, 0.5, 0.6, 0.5), dec = c(0.6,
0.7, 0.8, 0.7, 0.6, 0.8), jan = c(0, 0, 0.9, 0, 0, 0)), class = "data.frame", row.names = c(NA,
-6L))
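
The same kind of row-and-column lookup can also be sketched in base R with matrix indexing, which is a different technique from the purrr approach above. This assumes that the sampl columns of df1 hold the names of df2 columns; the small df1 and df2 here are hypothetical, since the original df1 is not shown.

```r
# Hypothetical data: df1$sampl1 names the df2 column to read for each cell
df2 <- data.frame(CellID = 1:3, oct = c(0.1, 0.1, 0.1), nov = c(0.4, 0.5, 0.4))
df1 <- data.frame(CellID = 1:3, sampl1 = c("oct", "nov", NA))

m    <- as.matrix(df2[-1])                 # numeric lookup table without the ID column
rows <- match(df1$CellID, df2$CellID)      # row position of each cell in df2
cols <- match(df1$sampl1, colnames(m))     # column position of each requested name

# A two-column index matrix picks one (row, column) element per line;
# a line containing NA yields NA in the result
df1$sampl1_snow <- m[cbind(rows, cols)]

df1$sampl1_snow
# 0.1 0.5 NA
```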

Replace matching values from one dataframe with index value from another dataframe

Try mapping df1's fruit column through a dictionary built from df2's fruit and id columns:

df1['fruit'] = df1.fruit.map(dict(df2[['fruit','id']].values))

Replace values across columns in a dataframe when the index variable matches another dataframe in R

Another option is to loop across the columns of df1 that also appear in 'df2', match on 'ID', and coalesce the looked-up values with the original column values.

library(dplyr)
df1 %>%
  mutate(across(any_of(names(df2)[-1]),
    ~ coalesce(df2[[cur_column()]][match(ID, df2$ID)], as.character(.x))))

-output

  ID var1 var2 var3   var4  var5 var6
1  a   40 fish    9 pencil  lamp    0
2  b   22   55   18     11    -2    1
3  a   12 fish   81 pencil  lamp    0
4  d    4  pig    3    pen   rug    1
5  e    0    0    0      0     0    0
6  d    2  pig    2    pen   rug    2
7  f    1  cow    1 eraser couch    1
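
Since the input data is not shown above, here is a small runnable sketch of the same across/coalesce pattern. The two dataframes are invented for illustration; only var2 is shared with df2, so only that column is replaced.

```r
library(dplyr)

# Hypothetical sample data
df1 <- data.frame(ID = c("a", "b", "a", "d"),
                  var1 = c(40, 22, 12, 4),
                  var2 = c(1, 55, 3, 2))
df2 <- data.frame(ID = c("a", "d"), var2 = c("fish", "pig"))

out <- df1 %>%
  mutate(across(any_of(names(df2)[-1]),
    # Value from df2 for the matching ID, else the original value as character
    ~ coalesce(df2[[cur_column()]][match(ID, df2$ID)], as.character(.x))))

out$var2
# "fish" "55" "fish" "pig"
```

Note that the replaced column becomes character, because coalesce needs both inputs to share a type; that is why the original values go through as.character.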

