Replace values in data frame based on other data frame in R
Use match:
userdata$ID <- userids$ID[match(userdata$ID, userids$USER)]
userdata$FRIENDID <- userids$ID[match(userdata$FRIENDID, userids$USER)]
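The question's data is not reproduced here, so the following is only a minimal sketch with made-up toy data (the user names and ID values are hypothetical). It shows that match() returns row positions in the lookup table, which are then used to index the ID column:

```r
# Hypothetical lookup table: maps a user name to a numeric ID
userids <- data.frame(USER = c("alice", "bob", "carol"),
                      ID   = c(101, 102, 103),
                      stringsAsFactors = FALSE)

# Hypothetical data whose ID and FRIENDID columns still hold user names
userdata <- data.frame(ID       = c("bob", "alice", "dave"),
                       FRIENDID = c("carol", "bob", "alice"),
                       stringsAsFactors = FALSE)

# match() gives, for each name, its row position in userids$USER (NA if absent)
userdata$ID       <- userids$ID[match(userdata$ID, userids$USER)]
userdata$FRIENDID <- userids$ID[match(userdata$FRIENDID, userids$USER)]

print(userdata)
```

Names that do not occur in the lookup table ("dave" here) come back as NA, so it is worth checking for NAs after the replacement.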
Replace values in a dataframe by values of other dataframe
Call the first sample data frame old_df and the second new_df. It sounds like you essentially want to update rows in new_df with values from old_df while retaining all non-matching rows in new_df:
library(dplyr)
new_df %>% rows_update(old_df, by = "ID")
Gives:
# A tibble: 9 x 5
ID a b c d
<dbl> <chr> <dbl> <chr> <dbl>
1 1 hi 1 ri 2
2 2 ho 1 ro 2
3 3 NA NA NA NA
4 4 hu 1 ru 2
5 5 ha 1 NA NA
6 6 NA NA NA NA
7 7 he 1 re 2
8 10 hii 1 NA NA
9 11 hoo 1 roo 2
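For a self-contained illustration, here is a small sketch with invented data (these new_df and old_df are hypothetical stand-ins, not the data from the question). Rows are matched on ID, and only the matched rows are overwritten:

```r
library(dplyr)

# Hypothetical data: new_df is the table to update, old_df supplies replacement rows
new_df <- tibble(ID = c(1, 2, 3),
                 a  = c("x", "y", "z"),
                 b  = c(10, 20, 30))
old_df <- tibble(ID = c(1, 3),
                 a  = c("hi", "hu"),
                 b  = c(1, 1))

# Rows of new_df whose ID appears in old_df are overwritten; ID 2 is left untouched
updated <- new_df %>% rows_update(old_df, by = "ID")
print(updated)
```

By default rows_update() raises an error when old_df contains an ID that is missing from new_df. As far as I know, dplyr 1.1.0 and later accept unmatched = "ignore" to skip such rows instead.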
R How can I replace values with values of another dataframe?
sample data:
df = data.frame(a = c(1,1,2,3,3,3), b = rep('val1', 6), c = rep('val2', 6))
df
# a b c
# 1 1 val1 val2
# 2 1 val1 val2
# 3 2 val1 val2
# 4 3 val1 val2
# 5 3 val1 val2
# 6 3 val1 val2
Using dplyr's recode(), you can achieve this:
library(dplyr)
df %>% mutate(a = recode(a, '1' = 'cat', '2' = 'dog', '3' = 'rabbit'))
# a b c
# 1 cat val1 val2
# 2 cat val1 val2
# 3 dog val1 val2
# 4 rabbit val1 val2
# 5 rabbit val1 val2
# 6 rabbit val1 val2
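When the replacement values live in another data frame rather than being typed into recode(), a named vector can serve as the key. This is a base R sketch, not the recode() approach itself; the key data frame and its column names old and new are assumptions made for illustration:

```r
# Same sample data as above
df <- data.frame(a = c(1, 1, 2, 3, 3, 3), b = rep("val1", 6), c = rep("val2", 6))

# Hypothetical key data frame holding the old and new values
key <- data.frame(old = c(1, 2, 3),
                  new = c("cat", "dog", "rabbit"),
                  stringsAsFactors = FALSE)

# Named vector: names are the old values, elements are the new ones
lookup <- setNames(key$new, key$old)

# Index the lookup by the old values as character; unname() drops the names
df$a <- unname(lookup[as.character(df$a)])
print(df)
```

Values of a that have no entry in key would become NA with this approach.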
Replace values in one column based on part of text in another dataframe in R
This seems to be a case for the fuzzyjoin package with regex_left_join. After the regex_left_join, coalesce the columns together so that it returns the first non-NA element for each row:
library(fuzzyjoin)
library(dplyr)
regex_left_join(df1, df2, by = 'Supplier') %>%
transmute(Supplier = coalesce(New_Supplier, Supplier.x), Value)
-output
Supplier Value
1 AAA 100
2 Red 200
3 Red 300
4 DDD 400
5 Blue 200
6 Blue 100
7 Green 200
8 HHH 40
9 III 150
10 JJJ 70
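If installing fuzzyjoin is not an option, the same idea can be sketched in base R with grepl(). Note that this is a swapped-in technique, not the fuzzyjoin answer above, and the data below is invented for illustration:

```r
# Hypothetical data: df1 holds raw supplier names, df2 maps a regex to a clean name
df1 <- data.frame(Supplier = c("AAA", "Red Ltd", "Big Red", "Blue Inc", "DDD"),
                  Value    = c(100, 200, 300, 400, 500),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Supplier     = c("Red", "Blue"),
                  New_Supplier = c("Red", "Blue"),
                  stringsAsFactors = FALSE)

# Match against the original names so one replacement cannot trigger another
original <- df1$Supplier
for (i in seq_len(nrow(df2))) {
  hit <- grepl(df2$Supplier[i], original)   # TRUE where the pattern occurs
  df1$Supplier[hit] <- df2$New_Supplier[i]
}
print(df1)
```

If a name matches several patterns, the last matching row of df2 wins here, whereas regex_left_join would return one output row per match.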
Replace Dataframe column with another dataframe based on conditions - R
I think you can use the following solution:
library(dplyr)
df1 %>%
left_join(df2, by = c("ID1", "ID2")) %>%
mutate(VALUE1.x = ifelse(ID1 == 5 & ID2 < 100, VALUE1.y, VALUE1.x)) %>%
select(-VALUE1.y) %>%
rename_with(~ sub("\\.x", "", .), contains(".x"))
ID1 ID2 VALUE1 NAME SURNAME
1 1 10 100 Juan perez
2 2 20 200 Rodrigo jones
3 3 30 300 Pedro bla
4 4 40 400 Lucas lopez
5 5 50 40 d martinez
6 5 150 100 e rodriguez
7 5 200 200 f jerez
8 4 99 40 g dieguez
9 3 10 150 x gimenez
10 5 25 200 a mendez
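One thing to watch with this pattern: a row that meets the condition but has no partner in df2 receives NA after the join. A possible variant, sketched on hypothetical toy data (the values below are invented), guards against that by also requiring a non-missing replacement:

```r
library(dplyr)

# Hypothetical data: df2 carries replacement values for some (ID1, ID2) pairs
df1 <- tibble(ID1    = c(1, 5, 5, 5),
              ID2    = c(10, 50, 150, 60),
              VALUE1 = c(100, 500, 600, 700))
df2 <- tibble(ID1    = c(1, 5, 5),
              ID2    = c(10, 50, 150),
              VALUE1 = c(11, 40, 99))

result <- df1 %>%
  # Keep df1's column name as-is and tag df2's column with ".new"
  left_join(df2, by = c("ID1", "ID2"), suffix = c("", ".new")) %>%
  # Replace only when the condition holds AND df2 actually supplied a value
  mutate(VALUE1 = if_else(ID1 == 5 & ID2 < 100 & !is.na(VALUE1.new),
                          VALUE1.new, VALUE1)) %>%
  select(-VALUE1.new)
print(result)
```

Using suffix = c("", ".new") also avoids the rename step, because the columns from df1 keep their original names.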
Replace all values in dataframe using another dataframe as key in R
One option is to match the elements against the 'Cell_ID' column of the second dataset and use the result as an index to return the corresponding 'value' from 'df2':
library(dplyr)
df1 %>%
mutate(across(everything(), ~ df2$value[match(., df2$Cell_ID)]))
-output
# Cell_ID n_1 n_2 n_3 n_4 n_5 n_6 n_7
#1 700 5 900 1000 NA NA NA NA
#2 200 5 100 400 500 700 900 1000
#3 300 5 400 500 NA NA NA NA
#4 1000 5 100 200 400 600 800 300
Another option is to use a named vector to do the lookup:
library(tibble)
df1 %>%
mutate(across(everything(), ~ deframe(df2)[as.character(.)]))
The base R equivalent is:
df1[] <- lapply(df1, function(x) df2$value[match(x, df2$Cell_ID)])
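To see the base R version in action, here is a small sketch on invented data (the Cell_ID and value numbers are hypothetical and much smaller than in the question):

```r
# Hypothetical key: each Cell_ID maps to a value
df2 <- data.frame(Cell_ID = c(1, 2, 3), value = c(100, 200, 300))

# Hypothetical data whose cells all hold Cell_IDs (NA where there is no entry)
df1 <- data.frame(Cell_ID = c(1, 2), n_1 = c(3, 1), n_2 = c(NA, 3))

# Assigning into df1[] keeps the data.frame structure while every column is replaced
df1[] <- lapply(df1, function(x) df2$value[match(x, df2$Cell_ID)])
print(df1)
```

Cells that are NA, or that hold an ID missing from df2, stay NA after the lookup.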
Replace value in data frame with value from other data frame based on set of conditions
Using the data.table package:
# load the 'data.table' package
library(data.table)
# convert the data.frame's to data.table's
setDT(df1)
setDT(df2)
# update df1 by reference with a join with df2
df1[df2[, correct := 0], on = .(ID, cond, block, correct), msec := i.mean]
which gives:
> df1
ID cond block correct msec
1: rs 1 2 1 456
2: rs 1 2 0 545
3: rs 2 4 1 756
4: tr 1 2 1 654
5: tr 1 2 1 625
6: tr 2 4 0 765
Note: The code above updates df1 by reference instead of creating a new data frame, which is more memory-efficient.
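As a reproducible illustration, the same update join can be tried on toy data. The tables below are hypothetical stand-ins for the question's df1 and df2; only the column names follow the code above:

```r
library(data.table)

# Hypothetical data: msec should be replaced by the group mean only where correct == 0
df1 <- data.table(ID      = c("rs", "rs", "tr"),
                  cond    = c(1, 1, 2),
                  block   = c(2, 2, 4),
                  correct = c(1, 0, 0),
                  msec    = c(456, 900, 800))
df2 <- data.table(ID    = c("rs", "tr"),
                  cond  = c(1, 2),
                  block = c(2, 4),
                  mean  = c(545, 765))

# Adding correct = 0 to df2 restricts the join to df1 rows with correct == 0;
# i.mean refers to the 'mean' column of the table in the i position (df2)
df1[df2[, correct := 0], on = .(ID, cond, block, correct), msec := i.mean]
print(df1)
```

Be aware that df2[, correct := 0] also modifies df2 by reference, so df2 keeps the extra correct column afterwards.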
Replace specific values based on another dataframe
You could use the join functionality of the data.table-package for this:
library(data.table)
setDT(DF1)
setDT(DF2)
DF1[DF2, on = .(date, id), `:=` (city = i.city, sales = i.sales)]
which gives:
> DF1
date id sales cost city
1: 06/19/2016 1 9999 101 LON
2: 06/20/2016 1 150 102 MTL
3: 06/21/2016 1 151 104 MTL
4: 06/22/2016 1 152 107 MTL
5: 06/23/2016 1 155 99 MTL
6: 06/19/2016 2 84 55 NY
7: 06/20/2016 2 83 55 NY
8: 06/21/2016 2 80 56 NY
9: 06/22/2016 2 777 57 QC
10: 06/23/2016 2 555 58 QC
When you have many columns in both datasets, it is easier to use mget instead of typing out all the column names. For the data used in the question, it would look like:
DF1[DF2, on = .(date, id), names(DF2)[3:4] := mget(paste0("i.", names(DF2)[3:4]))]
If you want to construct the vector of column names to be updated beforehand, you can do so as follows:
cols <- names(DF2)[3:4]
DF1[DF2, on = .(date, id), (cols) := mget(paste0("i.", cols))]