Conditional merge/replacement, by multiple columns
I'm not sure if you would consider this 'smarter', but here is a way to do it with just one join call:
library(dplyr)
left_join(df1, df2, by = c('x1', 'x2')) %>%
mutate(x3 = if_else(is.na(x3.y), x3.x, x3.y)) %>%
select(-x3.y, -x3.x)
x1 x2 x3
1 1 a xx
2 1 b b
3 2 a c
4 2 b zz
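The question's df1 and df2 are not shown here, so the following is only a minimal sketch with made-up inputs. It swaps the if_else(is.na(...)) step for dplyr's coalesce(), which should give the same result for this kind of replacement. The data values are assumptions, not the original poster's data.

```r
library(dplyr)

# Hypothetical inputs (not from the original question): df2 holds
# replacement values for two of the four (x1, x2) key pairs.
df1 <- data.frame(x1 = c(1, 1, 2, 2),
                  x2 = c("a", "b", "a", "b"),
                  x3 = c("a", "b", "c", "d"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(x1 = c(1, 2),
                  x2 = c("a", "b"),
                  x3 = c("xx", "zz"),
                  stringsAsFactors = FALSE)

result <- left_join(df1, df2, by = c("x1", "x2")) %>%
  # coalesce() returns the first non-missing value, so x3.y wins when present
  mutate(x3 = coalesce(x3.y, x3.x)) %>%
  select(-x3.x, -x3.y)

print(result)
```

Rows of df1 without a match in df2 keep their original x3, because the join leaves x3.y as NA for them.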
Merge Row into one with condition and replace value in one row with value in the other R
A data.table option:
setDT(df)[
,
c(
lapply(
setNames(.(A, B), c("A", "B")),
function(x) if ("Winter" %in% D) replace(x, D == "Summer", x[D == "Winter"]) else x
),
.(D = D)
),
C
][
,
lapply(.SD, function(x) toString(unique(x))),
C
][,
.SD,
.SDcols = names(df)
]
gives
A B C D
1: X apple december Winter, Summer
2: Z apple june Winter, Summer
3: U pear march Summer
Data
> dput(df)
structure(list(A = c("X", "Y", "Z", "W", "U"), B = c("apple",
"pear", "apple", "pear", "pear"), C = c("december", "december",
"june", "june", "march"), D = c("Winter", "Summer", "Winter",
"Summer", "Summer")), class = "data.frame", row.names = c(NA,
-5L))
How to conditionally replace R data.table columns upon merge?
We can use the on-based approach (an update join that modifies dt1 by reference):
dt1[dt2, column1 := i.column1, on = .(index_column)]
dt1
# index_column column1 column2
#1: 12 dog 482
#2: 17 cat 391
#3: 29 penguin 567
#4: 34 elephant 182
#5: 46 bird 121
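Since dt1 and dt2 are not defined in the answer, here is a small self-contained sketch of the same update join. The table contents are invented for illustration; only the column names follow the answer.

```r
library(data.table)

# Hypothetical tables (values are made up, column names follow the answer).
dt1 <- data.table(index_column = c(12, 17, 29),
                  column1 = c("dog", "cat", "fish"),
                  column2 = c(482, 391, 567))
dt2 <- data.table(index_column = 29,
                  column1 = "penguin")

# Update join: for rows of dt1 that match dt2 on index_column,
# overwrite column1 with dt2's value (the i. prefix refers to dt2's column).
dt1[dt2, column1 := i.column1, on = .(index_column)]

print(dt1)
```

Rows of dt1 with no match in dt2 are left untouched, and no copy of dt1 is made because := assigns by reference.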
Conditionally merge rows
Here is how we could do it:
Credits to MartinGal for the regex "(?<=[A-Z])[A-Z]+" (upvote!)
- Replace empty values with NA.
- Use lead to move rows up in X3, conditional on NA, else not.
- filter the rows where X1 is not NA.
- Extract the important information with str_extract and the regex "(?<=[A-Z])[A-Z]+", combine this info with column X2 using str_c, and finally coalesce both.
- Remove the extracted string from X3 with the same regex and str_remove_all, keeping only the relevant part.
library(dplyr)
library(stringr)
df %>%
mutate(across(everything(), ~sub("^\\s*$", NA, .)),
X3= ifelse(is.na(X3), lead(X2), X3)) %>%
filter(!is.na(X1)) %>%
mutate(X2 = coalesce(str_c(X2," ", str_extract(X3, "(?<=[A-Z])[A-Z]+")), X2),
X3 = str_remove_all(X3, "(?<=[A-Z])[A-Z]+"))
Output:
X1 X2 X3
1 111 House M. Bab A
2 2 House M. Cac A - C
3 121 Street M. Bak D
4 121 House M. Aba SMITH A
5 141 Garden Harris WHITE A - B
6 141 Villa Thomas BURNEY B - D
How to merge two dataframes in R conditionally (common column, condition)
Here's how to do it with dplyr.
inner_join(X[,1:3],Y, by=c("Tab.No"))%>%
mutate(AC.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, AC.Name ,NA),
Mandal.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, Mandal.Name ,NA),
Village.Name = ifelse(Survey.Date>=Survey.Start.Date & Survey.Date<=Survey.End.Date, Village.Name ,NA))%>%
group_by(Tab.No)%>%
filter(!is.na(AC.Name)|n()==1)%>%
select(Response.No,Tab.No,Survey.Date,AC.Name,Mandal.Name,Village.Name)
result
Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
(int) (int) (date) (chr) (chr) (chr)
1 9530 1 2015-05-26 Nandigama Chanderlapadu Punnavalli
2 6702 1 2015-05-30 Nandigama Chanderlapadu Kasarabada
3 26744 1 2015-05-31 Nandigama Chanderlapadu Kasarabada
4 8925 1 2015-06-03 Nandigama Chanderlapadu Kasarabada
5 20242 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
6 21316 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
7 28056 1 2015-06-04 Nandigama Chanderlapadu Kasarabada
8 12661 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
9 17187 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
10 28795 1 2015-06-05 Nandigama Chanderlapadu Kasarabada
data
X<-read.table(text=" Response.No Tab.No Survey.Date AC.Name Mandal.Name Village.Name
9530 1 2015-05-26 NA NA NA
6702 1 2015-05-30 NA NA NA
26744 1 2015-05-31 NA NA NA
8925 1 2015-06-03 NA NA NA
20242 1 2015-06-04 NA NA NA
21316 1 2015-06-04 NA NA NA
28056 1 2015-06-04 NA NA NA
12661 1 2015-06-05 NA NA NA
17187 1 2015-06-05 NA NA NA
28795 1 2015-06-05 NA NA NA
", header=T,stringsAsFactors =F)
Y<-read.table(text="AC.Name Mandal.Name Village.Name Tab.No Survey.Start.Date Survey.End.Date
Nandigama Chanderlapadu Punnavalli 1 2015-05-23 2015-05-27
Nandigama Chanderlapadu Kasarabada 1 2015-05-30 2015-06-07
Nandigama Chanderlapadu Kodavatikallu 1 2015-06-09 2015-06-28
Nandigama Chanderlapadu Thurlapadu 1 2015-06-29 2015-07-13
Nandigama Chanderlapadu Chanderlapadu 1 2015-07-14 2015-07-25
Nandigama Chanderlapadu Popuru 2 2015-05-23 2015-05-27
Nandigama Chanderlapadu Kandrapadu 2 2015-05-30 2015-06-08
Nandigama Chanderlapadu Vibhareethalapadu 3 2015-05-30 2015-06-04
Nandigama Chanderlapadu Eturu 3 2015-06-10 2015-06-23
Nandigama Chanderlapadu Bobbillapadu 3 2015-06-26 2015-07-03
", header=T,stringsAsFactors =F)
X$Survey.Date <-as.Date(X$Survey.Date)
Y$Survey.Start.Date <-as.Date(Y$Survey.Start.Date)
Y$Survey.End.Date <-as.Date(Y$Survey.End.Date)
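As a possible alternative to joining on Tab.No and then masking the rows outside the date range, dplyr 1.1.0 and later can express the range condition inside the join itself with join_by() and its between() helper. This is a different technique from the answer above, shown here on a made-up miniature of the data; the responses and periods frames and the third response row are hypothetical.

```r
library(dplyr)

# Hypothetical miniature inputs modelled on X and Y above.
responses <- data.frame(
  Response.No = c(9530, 6702, 99999),
  Tab.No = c(1, 1, 1),
  Survey.Date = as.Date(c("2015-05-26", "2015-05-30", "2015-05-28"))
)
periods <- data.frame(
  Village.Name = c("Punnavalli", "Kasarabada"),
  Tab.No = c(1, 1),
  Survey.Start.Date = as.Date(c("2015-05-23", "2015-05-30")),
  Survey.End.Date = as.Date(c("2015-05-27", "2015-06-07")),
  stringsAsFactors = FALSE
)

# Match on Tab.No and require Survey.Date to fall inside the
# [Survey.Start.Date, Survey.End.Date] interval (bounds inclusive by default).
matched <- left_join(
  responses, periods,
  by = join_by(Tab.No, between(Survey.Date, Survey.Start.Date, Survey.End.Date))
)

print(matched)
```

Because this is a left join, a response whose date falls in no survey period is kept with NA in the columns coming from periods.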
Merging dataframes and replacing values with multiple conditions in R (1 0 NA)
I think you can get this with a simple pmax (parallel maximum). It most naturally works on matrices, not data frames. Using @R Schifini's data:
pmax(as.matrix(df1), as.matrix(df2), na.rm = TRUE)
# d1 d2 d3
# [1,] 0 1 1
# [2,] 0 1 0
# [3,] 0 0 0
# [4,] 1 0 NA
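The referenced data is not reproduced here, so the sketch below uses invented 0/1/NA data frames of the same shape to show the idea end to end, including the conversion back to a data frame. With na.rm = TRUE, a 1 beats a 0, any value beats NA, and a cell stays NA only when it is NA in both inputs.

```r
# Hypothetical 0/1/NA data frames with identical dimensions and column names.
df1 <- data.frame(d1 = c(0, 0, 0, 1),
                  d2 = c(1, NA, 0, 0),
                  d3 = c(NA, 0, 0, NA))
df2 <- data.frame(d1 = c(0, 0, NA, NA),
                  d2 = c(0, 1, 0, NA),
                  d3 = c(1, NA, 0, NA))

# Element-wise maximum of the two matrices, ignoring NA where possible.
merged <- pmax(as.matrix(df1), as.matrix(df2), na.rm = TRUE)

# Convert back to a data frame if that is the form needed downstream.
merged_df <- as.data.frame(merged)
print(merged_df)
```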