How to pass nothing as an argument to `[` for subsetting?
After some poking around, alist seems to do the trick:
x <- matrix(1:6, nrow=3)
x
[,1] [,2]
[1,] 1 4
[2,] 2 5
[3,] 3 6
# 1st row
do.call(`[`, alist(x, 1, ))
[1] 1 4
# 2nd column
do.call(`[`, alist(x, , 2))
[1] 4 5 6
From ?alist:
‘alist’ handles its arguments as if they described function
arguments. So the values are not evaluated, and tagged arguments
with no value are allowed whereas ‘list’ simply ignores them.
‘alist’ is most often used in conjunction with ‘formals’.
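To make the contrast concrete, here is a minimal sketch comparing list() and alist() when one argument is left empty. The use of TRUE as an "everything" index at the end is an alternative technique, not something taken from ?alist.

```r
x <- matrix(1:6, nrow = 3)

# list() evaluates its arguments and refuses an empty one.
res <- try(list(x, 1, ), silent = TRUE)
inherits(res, "try-error")

# alist() keeps the empty argument as a placeholder, so there are three elements.
args <- alist(x, 1, )
length(args)

# The placeholder is passed on to `[` as a missing index.
first_row <- do.call(`[`, args)

# Alternative (an assumption-free base R idiom, but not from ?alist):
# a logical TRUE also selects everything along a dimension.
first_row_alt <- do.call(`[`, list(x, 1, TRUE))
```

Both calls at the end should give the same result as x[1, ].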
A way of dynamically selecting which dimension is extracted. To create the initial alist of the desired length, see here (Hadley, using bquote) or here (using alist).
m <- array(1:24, c(2,3,4))
ndims <- 3
a <- rep(alist(,)[1], ndims)
for(i in seq_len(ndims))
{
  slice <- a
  slice[[i]] <- 1
  print(do.call(`[`, c(list(m), slice)))
}
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 3 9 15 21
[3,] 5 11 17 23
[,1] [,2] [,3] [,4]
[1,] 1 7 13 19
[2,] 2 8 14 20
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
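The same idea can be wrapped in a small function. This is only a sketch: take_along and its argument names are hypothetical and not part of base R.

```r
# Hypothetical helper: extract index `idx` along dimension `dim_index` of `arr`,
# leaving every other dimension untouched.
take_along <- function(arr, dim_index, idx) {
  n <- length(dim(arr))
  # One empty ("missing") argument per dimension.
  args <- rep(alist(, )[1], n)
  # Fill in only the dimension we want to index.
  args[[dim_index]] <- idx
  do.call(`[`, c(list(arr), args))
}

m <- array(1:24, c(2, 3, 4))
take_along(m, 2, 1)   # same as m[, 1, ]
take_along(m, 3, 2)   # same as m[, , 2]
```

Because the number of dimensions is read from dim(arr), the function does not need to know in advance how many commas the call requires.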
How to negate a subsetting argument
With Frank's contribution in the comments, here is a working solution:
testfunc <- function(dfrm, varq, factor, gear = unique(dfrm$gear),
                     am = unique(dfrm$am), carb = unique(dfrm$carb)){
  # Subset the data according to the arguments:
  subsetdfrm <- dfrm[which((dfrm[,"gear"] %in% gear) &
                           (dfrm[,"am"] %in% am) &
                           (dfrm[,"carb"] %in% carb)),]
  # Grab the groups to be compared according to arguments:
  factorbinary <- get(factor)
  # The t-test, run on the subset rather than on the full data:
  res <- t.test(subsetdfrm[which(subsetdfrm[factor]==factorbinary[1]), varq],
                subsetdfrm[which(subsetdfrm[factor]==factorbinary[2]), varq])
  print(res)
}
In my original code, instead of dfrm, I have a file path that gets imported as dfrm by read.csv(). The function seems to have no problem with the default arguments referring to dfrm even though it is only created later in the function body; this is consistent with R evaluating default arguments lazily, that is, only when they are first used.
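A minimal sketch of that behaviour, using a hypothetical function (load_cols is made up for illustration, and as.data.frame stands in for read.csv): the default for cols mentions dfrm, which does not exist until the body creates it.

```r
# Hypothetical function: `cols` defaults to an expression involving `dfrm`,
# which is only created inside the body.
load_cols <- function(input, cols = names(dfrm)) {
  dfrm <- as.data.frame(input)  # stand-in for read.csv(path)
  cols                          # the default is evaluated here, after dfrm exists
}

out <- load_cols(mtcars[1:2, 1:3])
out
```

If cols were used before dfrm is assigned, the default would fail because dfrm would not yet be defined at that point.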
R: how to pass a variable into a function to subset data.frame
As @agstudy and @docendodiscimus mentioned in the comments, it is better to use [ or [[ instead of $ when passing a column name to a function.
f <- function(x){
  subset.19 = dat[,x][dat$age == 19]
  subset.20 = dat[,x][dat$age == 20]
  t.test(subset.19, subset.20)
}
f("weight")
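A short illustration of why $ does not work with a column name held in a variable. The data frame below is made up for the example.

```r
# Made-up data for illustration only.
dat <- data.frame(age = c(19, 19, 20, 20), weight = c(60, 62, 70, 71))
col <- "weight"

by_dollar  <- dat$col     # looks for a column literally named "col": NULL
by_double  <- dat[[col]]  # uses the value of `col`: the weight column
by_bracket <- dat[, col]  # same result for a single column of a data.frame
```

With $, the name after the operator is taken literally, so the variable is never looked up.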
Why is `[` better than `subset`?
This question was answered well in the comments by @James, pointing to an excellent explanation by Hadley Wickham of the dangers of subset (and functions like it) [here]. Go read it!
It's a somewhat long read, so it may be helpful to record here the example Hadley uses that most directly addresses the question of "what can go wrong?". Suppose we want to subset and then reorder a data frame using the following functions:
scramble <- function(x) x[sample(nrow(x)), ]
subscramble <- function(x, condition) {
scramble(subset(x, condition))
}
subscramble(mtcars, cyl == 4)
This returns the error:
Error in eval(expr, envir, enclos) : object 'cyl' not found
because R no longer "knows" where to find the object called 'cyl'. He also points out the truly bizarre stuff that can happen if by chance there is an object called 'cyl' in the global environment:
cyl <- 4
subscramble(mtcars, cyl == 4)
cyl <- sample(10, 100, replace = TRUE)
subscramble(mtcars, cyl == 4)
(Run them and see for yourself, it's pretty crazy.)
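For comparison, here is one way to write the same helper with `[` so that it behaves predictably inside a function. This is a sketch, not Hadley's code; subscramble2 and its rows argument are hypothetical names.

```r
scramble <- function(x) x[sample(nrow(x)), ]

# Hypothetical variant: the caller passes an already evaluated logical vector,
# so nothing depends on where a bare name like `cyl` happens to be found.
subscramble2 <- function(x, rows) {
  scramble(x[rows, , drop = FALSE])
}

res <- subscramble2(mtcars, mtcars$cyl == 4)
```

The condition is computed by the caller against mtcars explicitly, so an unrelated global object called cyl has no effect on the result.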
do.call( [ ...) function in R
Along with @thelatemail's great comment, you can also get more information from the help page help('[') which reads:
indexing by [ is similar to atomic vectors and selects a list of the specified element(s)
and from the help for the function do.call we read:
do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.
This line calls the [ function with the list of arguments dcargs (so named because they are the do.call arguments). The first element, dcargs[[1]], is the object being indexed, and the remaining elements, dcargs[[2]] and dcargs[[3]], supply the indices to extract from it.
In short, do.call("[",dcargs) indexes the no and yes rows and the no and yes columns of dcargs[[1]].
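A small reproduction of the pattern, assuming a 2 x 2 table with "no" and "yes" levels. The data are made up; only the shape of dcargs follows the description above.

```r
# Made-up table with "no"/"yes" rows and columns.
tab <- table(a = c("no", "yes", "yes", "no"),
             b = c("yes", "yes", "no", "no"))

# First element: the object to index. Second and third: row and column indices.
dcargs <- list(tab, c("no", "yes"), c("no", "yes"))

res    <- do.call("[", dcargs)
direct <- tab[c("no", "yes"), c("no", "yes")]
```

Here res and direct hold the same cells, since do.call simply builds the call tab[c("no", "yes"), c("no", "yes")] from the list.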
R how to pass NULL for optional parameters to function (e.g. in for loop)
You can use a list instead:
formula <- list("~ my_constraint", NULL)
# for (i in formula) print(i)
#[1] "~ my_constraint"
#NULL
If your function accepts NULL for that argument, you should also do:
ordination_list <- list()
for (current_formula in formula) {
  tmp <- lapply(sample_subset_list,
                ordinate,
                method = "CCA",
                formula = if (is.null(current_formula)) NULL else as.formula(current_formula))
  ordination_list[[length(ordination_list) + 1]] <- tmp
}
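The reason a list is needed rather than a plain vector can be checked directly: c() silently drops NULL, while list() keeps it as an element that a for loop will visit. The variable names below are made up for the sketch.

```r
as_vector <- c("~ my_constraint", NULL)
as_list   <- list("~ my_constraint", NULL)

length(as_vector)  # the NULL has disappeared
length(as_list)    # the NULL is kept as a second element

# Looping over the list visits the NULL element as well.
seen <- character(0)
for (f in as_list) {
  seen <- c(seen, if (is.null(f)) "NULL" else f)
}
seen
```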
Subsetting rows by passing an argument to a function
The problem is that you pass the condition as a string rather than as an actual condition, so R can't evaluate it when you want it to.
If you still want to pass it as a string, you need to parse and eval it in the right place, for example:
cond = eval(parse(text=keep_rows))
raw_data = raw_data[cond,]
This should work, I think
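A self-contained sketch of the same approach, under the assumption that the condition string refers to column names of the data frame. In that case the data frame can be supplied as the evaluation environment via envir. The data and the condition are made up.

```r
# Made-up data and condition for illustration.
raw_data  <- data.frame(id = 1:5, score = c(10, 25, 3, 40, 18))
keep_rows <- "score > 15"

# Parse the string and evaluate it with the columns of raw_data in scope
# (assumption: the condition only uses column names and constants).
cond <- eval(parse(text = keep_rows), envir = raw_data)
kept <- raw_data[cond, ]
kept
```

Evaluating strings this way runs whatever code the string contains, so it is only appropriate when the string comes from a trusted source.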