The rules of subsetting
In addition to your nice solution using merge (thanks for that, I always forget merge), this can be achieved in base R using ?interaction as follows. There may be other variations of this, but this is the one I am familiar with:
> df1[interaction(df1) %in% interaction(df2), ]
Now to answer your question: First, I think there's a typo (corrected) in:
df1[ df1$z %in% df2$c | df2$b == 9,] # second part should be df2$b == 9
You would get a warning (and not the result you expect), because the first part evaluates to
[1] TRUE TRUE TRUE TRUE TRUE
and the second evaluates to:
[1] FALSE FALSE FALSE FALSE
You are applying | to vectors of unequal length (5 and 4), so R recycles the shorter one and warns:
longer object length is not a multiple of shorter object length
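The length mismatch can be reproduced in isolation. This is only a sketch: first and second are hypothetical logical vectors standing in for the two halves of the condition, not the actual columns from the question. As far as I know, R signals this condition as a warning and recycles the shorter vector.

```r
# Hypothetical stand-ins for the two halves of the condition.
first  <- c(TRUE, TRUE, TRUE, TRUE, TRUE)   # length 5, like df1$z %in% df2$c
second <- c(FALSE, FALSE, FALSE, FALSE)     # length 4, like df2$b == 9

# Capture the message of the warning that `|` signals on unequal lengths.
msg <- tryCatch(first | second, warning = function(w) conditionMessage(w))
print(msg)

# When both sides have the same length, `|` works element by element.
same_length <- first | c(second, FALSE)
length(same_length)
```

The practical point is that both sides of | should be computed from the data frame being subset, so that each has one value per row.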
Edit: If you have multiple columns, you can choose which of them go into the interaction. For example, if you want the rows of df1 whose first two columns match those of df2, you could simply do:
> df1[interaction(df1[, 1:2]) %in% interaction(df2[, 1:2]), ]
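To make the interaction approach concrete, here is a small sketch. The data frames below are made up for illustration (the original df1 and df2 are not shown here), so treat the column names and values as assumptions.

```r
# Hypothetical data; only rows 1 and 2 of df1 agree with df2 on the first two columns.
df1 <- data.frame(x = c(1, 2, 3, 4, 5),
                  y = c("a", "b", "c", "d", "e"),
                  z = c(10, 20, 30, 40, 50))
df2 <- data.frame(a = c(1, 2, 9, 4),
                  b = c("a", "b", "c", "x"),
                  c = c(10, 20, 30, 40))

# interaction() pastes the columns of each row into one label such as "1.a";
# %in% then checks which df1 labels also occur among the df2 labels.
keep <- interaction(df1[, 1:2]) %in% interaction(df2[, 1:2])
matched <- df1[keep, ]
print(matched)
```

Note that the columns are compared by position, not by name, and that the labels are joined with "." by default, so values that themselves contain a "." could in principle collide.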
subset a-rules in R by length of lhs
length gives you the number of rules. You need to use size instead.
subset(rules,subset = size(lhs) == 5)
Dynamically subset a data.frame by a list of rules
It's not very generalized: the rules for different columns are always combined with "and", and the values within a single column's rule are always combined with "or". But that's what your question asks for.
df <- data.frame(col1 = c('a','s','x'),
col2 = c('a','z','s'),
col3 = c('a','c','b'),
stringsAsFactors = FALSE)
df[with(df, col1 == 's'
& col2 == 'z'
& (col3 == 'a' | col3 == 'b' | col3 == 'c')), ]
# col1 col2 col3
# 2 s z c
rules <- list(col1 = c('s'), col2 = c('z'), col3 = c('a', 'b', 'c'))
df[Reduce(`&`, Map(`%in%`, df, rules)), ]
# col1 col2 col3
# 2 s z c
magic
magic <- function(data, rules) {
data[Reduce(`&`, Map(`%in%`, data, rules)), ]
}
magic(df, rules)
# col1 col2 col3
# 2 s z c
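If the one-liner inside magic is hard to read, it may help to look at the two steps separately. This sketch reuses the same df and rules as above; per_column and keep are just illustrative names.

```r
df <- data.frame(col1 = c("a", "s", "x"),
                 col2 = c("a", "z", "s"),
                 col3 = c("a", "c", "b"),
                 stringsAsFactors = FALSE)
rules <- list(col1 = "s", col2 = "z", col3 = c("a", "b", "c"))

# Step 1: Map pairs each column with its rule (by position) and applies %in%,
# giving one logical vector per column (the "or" within a rule).
per_column <- Map(`%in%`, df, rules)
str(per_column)

# Step 2: Reduce folds those vectors together with `&` (the "and" across rules).
keep <- Reduce(`&`, per_column)
print(keep)
```

Because Map pairs columns and rules by position, this version only works when rules has one entry per column, in the same order as the columns.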
Edit -- version 2
This one should work for 1) columns without rules and/or 2) rules not in the exact order of columns
magic <- function(data, rules) {
rules <- rules[names(data)]
idx <- Map(`%in%`, data, rules)
idx[is.na(names(rules))] <- list(rep(TRUE, nrow(data)))
data[Reduce(`&`, idx), ]
}
df <- data.frame(col1 = c('a','s','x'),
col2 = c('a','z','s'),
colx = rnorm(3),
col3 = c('a','c','b'),
stringsAsFactors = FALSE)
rules <- list(col2 = c('z'), col1 = c('s'), col3 = c('a', 'b', 'c'))
magic(df, rules)
# col1 col2 colx col3
# 2 s z -1.374339 c
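What makes version 2 work is the reordering step rules[names(data)]. A quick sketch of what that step produces, using fixed numbers for colx instead of rnorm so the result is reproducible (aligned is just an illustrative name):

```r
df <- data.frame(col1 = c("a", "s", "x"),
                 col2 = c("a", "z", "s"),
                 colx = c(0.1, 0.2, 0.3),   # fixed values in place of rnorm(3)
                 col3 = c("a", "c", "b"),
                 stringsAsFactors = FALSE)
rules <- list(col2 = "z", col1 = "s", col3 = c("a", "b", "c"))

# Indexing the list by the column names puts the rules in column order.
# A column without a rule (colx) yields a NULL entry whose name is NA.
aligned <- rules[names(df)]
print(names(aligned))
print(is.na(names(aligned)))
```

Those NA-named entries are the ones magic overwrites with all TRUE, so a column without a rule never excludes a row.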
more tests
magic(mtcars, list(gear = 4, carb = 1:2))
# mpg cyl disp hp drat wt qsec vs am gear carb
# Datsun 710 22.8 4 108.0 93 3.85 2.320 18.61 1 1 4 1
# Merc 240D 24.4 4 146.7 62 3.69 3.190 20.00 1 0 4 2
# Merc 230 22.8 4 140.8 95 3.92 3.150 22.90 1 0 4 2
# Fiat 128 32.4 4 78.7 66 4.08 2.200 19.47 1 1 4 1
# Honda Civic 30.4 4 75.7 52 4.93 1.615 18.52 1 1 4 2
# Toyota Corolla 33.9 4 71.1 65 4.22 1.835 19.90 1 1 4 1
# Fiat X1-9 27.3 4 79.0 66 4.08 1.935 18.90 1 1 4 1
# Volvo 142E 21.4 4 121.0 109 4.11 2.780 18.60 1 1 4 2
R arules - subset of transactions that match a rule
Actually the subset syntax in the context of arules is very similar to any other context: you may want to try the following:
subset(transactions, items %in% lhs(r) & !items %in% rhs(r) )
I hope this helps!