How to Pass "Nothing" as an Argument to '[' for Subsetting

How to pass nothing as an argument to `[` for subsetting?

After some poking around, alist seems to do the trick:

x <- matrix(1:6, nrow=3)
x
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

# 1st row
do.call(`[`, alist(x, 1, ))
[1] 1 4

# 2nd column
do.call(`[`, alist(x, , 2))
[1] 4 5 6

From ?alist:

‘alist’ handles its arguments as if they described function
arguments. So the values are not evaluated, and tagged arguments
with no value are allowed whereas ‘list’ simply ignores them.
‘alist’ is most often used in conjunction with ‘formals’.


The same trick gives a way of dynamically selecting which dimension is extracted. For creating the initial list of empty arguments of the desired length, the original answer linked to two approaches: one by Hadley using bquote, and one using alist.

m <- array(1:24, c(2,3,4))
ndims <- 3
a <- rep(alist(,)[1], ndims)
for (i in seq_len(ndims)) {
  slice <- a
  slice[[i]] <- 1
  print(do.call(`[`, c(list(m), slice)))
}

     [,1] [,2] [,3] [,4]
[1,]    1    7   13   19
[2,]    3    9   15   21
[3,]    5   11   17   23

     [,1] [,2] [,3] [,4]
[1,]    1    7   13   19
[2,]    2    8   14   20

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
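
As a sketch, the loop above can be wrapped in a small helper. The name slice_along and its arguments are hypothetical (they do not come from the original answers); the helper only reuses the alist trick shown above.

```r
# Hypothetical helper: extract index `idx` along dimension `dim_no` of `arr`,
# leaving every other dimension empty (that is, "select everything").
slice_along <- function(arr, dim_no, idx) {
  n <- length(dim(arr))
  # One empty argument per dimension, built the same way as in the loop above.
  idx_args <- rep(alist(, )[1], n)
  idx_args[[dim_no]] <- idx
  do.call(`[`, c(list(arr), idx_args))
}

m <- array(1:24, c(2, 3, 4))
slice_along(m, 2, 1)   # intended to match m[, 1, ]
```

The dimension number is now an ordinary argument, so the caller never has to write the empty positions by hand.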

How to negate a subsetting argument

With Frank's contribution in the comments, here is a working solution:

testfunc <- function(dfrm, varq, factor, gear = unique(dfrm$gear),
                     am = unique(dfrm$am), carb = unique(dfrm$carb)) {
  # Subset the data according to the arguments:
  subsetdfrm <- dfrm[which((dfrm[, "gear"] %in% gear) &
                           (dfrm[, "am"] %in% am) &
                           (dfrm[, "carb"] %in% carb)), ]

  # Grab the groups to be compared: the values passed for the argument
  # named by `factor` (for example, factor = "am" looks up `am`):
  factorbinary <- get(factor)

  # The t-test, run on the subset rather than on the full data frame:
  result <- t.test(subsetdfrm[which(subsetdfrm[[factor]] == factorbinary[1]), varq],
                   subsetdfrm[which(subsetdfrm[[factor]] == factorbinary[2]), varq])
  print(result)
}

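In my original code, instead of dfrm, the function takes a file path that is read into dfrm by read.csv() inside the function body. The default arguments refer to dfrm even though it is only created later, in the body. This is not a problem, because R evaluates default arguments lazily: they are computed at first use, inside the function's environment, by which point dfrm exists.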
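
A default argument can refer to a variable that is only created inside the function body, because R evaluates defaults at first use rather than at call time. A minimal sketch (the function name count_rows and the stand-in data frame are made up for illustration):

```r
# Hypothetical example: `n` defaults to an expression involving `dfrm`,
# which does not exist until the function body creates it.
count_rows <- function(path, n = nrow(dfrm)) {
  dfrm <- data.frame(a = 1:3)  # stand-in for read.csv(path)
  n                            # the default is evaluated here, after dfrm exists
}

count_rows("ignored.csv")
```

If n were used before dfrm is assigned, the call would fail instead, so the order of statements in the body matters.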

R: how to pass a variable into a function to subset data.frame

As @agstudy and @docendodiscimus mentioned in the comments, it is better to use `[` or `[[` instead of `$` when passing a column name into a function.

f <- function(x) {
  # Select the column by name, then keep the rows for each age group
  subset.19 <- dat[, x][dat$age == 19]
  subset.20 <- dat[, x][dat$age == 20]
  t.test(subset.19, subset.20)
}
f("weight")
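
To see why `$` does not work with a column name held in a variable, here is a small sketch. The data frame and the variable name col are invented for the example.

```r
# Invented data, just to illustrate the lookup rules.
dat <- data.frame(age = c(19, 19, 20, 20),
                  weight = c(60, 62, 70, 72))
col <- "weight"

by_dollar <- dat$col     # `$` looks for a column literally named "col"
by_bracket <- dat[[col]] # `[[` uses the value stored in `col`

is.null(by_dollar)
by_bracket
```

`$` does not evaluate its right-hand side, which is why the column name has to go through `[` or `[[` inside a function.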

Why is `[` better than `subset`?

This question was answered well in the comments by @James, who pointed to an excellent explanation by Hadley Wickham of the dangers of subset (and functions like it). Go read it!

It's a somewhat long read, so it may be helpful to record here the example Hadley uses that most directly addresses the question of "what can go wrong?". Suppose we want to subset and then reorder a data frame using the following functions:

scramble <- function(x) x[sample(nrow(x)), ]

subscramble <- function(x, condition) {
  scramble(subset(x, condition))
}

subscramble(mtcars, cyl == 4)

This returns the error:

Error in eval(expr, envir, enclos) : object 'cyl' not found

because R no longer "knows" where to find the object called 'cyl'. He also points out the truly bizarre stuff that can happen if by chance there is an object called 'cyl' in the global environment:

cyl <- 4
subscramble(mtcars, cyl == 4)

cyl <- sample(10, 100, replace = TRUE)
subscramble(mtcars, cyl == 4)

(Run them and see for yourself, it's pretty crazy.)
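
For contrast, here is a sketch of the same idea written with `[`. The name subscramble2 is hypothetical; the point is only that the caller builds the logical vector explicitly, so nothing depends on where an unquoted expression gets evaluated.

```r
scramble <- function(x) x[sample(nrow(x)), ]

# Hypothetical `[`-based variant: `rows` is an ordinary logical vector,
# evaluated by the caller before the function ever sees it.
subscramble2 <- function(x, rows) {
  scramble(x[rows, , drop = FALSE])
}

res <- subscramble2(mtcars, mtcars$cyl == 4)
nrow(res)
```

A stray cyl object in the global environment has no effect here, because mtcars$cyl names the column unambiguously.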

do.call(`[`, ...) function in R

Along with @thelatemail's great comment, you can also get more information from the help page help('['), which reads

indexing by [ is similar to atomic vectors and selects a list of the specified element(s)

and from the help page for do.call we read

do.call constructs and executes a function call from a name or a function and a list of arguments to be passed to it.

The line in question calls the `[` function with the argument list dcargs (so named because they are the do.call arguments). The first element, dcargs[[1]], is the table being indexed; the elements in [[2]] and [[3]] are the row and column indices applied to it.

In short, do.call("[",dcargs) indexes the no and yes rows and the no and yes columns of dcargs[[1]].
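
A small sketch of that equivalence follows. The original table is not shown here, so a plain matrix with "no"/"yes" dimnames stands in for it; that substitution is an assumption made for the example.

```r
# Stand-in for the table in the question (assumed shape: 2 x 2, no/yes labels).
tab <- matrix(1:4, nrow = 2,
              dimnames = list(c("no", "yes"), c("no", "yes")))

# First element: the object to index. Remaining elements: row and column indices.
dcargs <- list(tab, c("no", "yes"), c("no", "yes"))

via_do_call <- do.call("[", dcargs)
direct <- tab[c("no", "yes"), c("no", "yes")]

identical(via_do_call, direct)
```

Building the arguments as a list is what makes it possible to decide the indices at run time.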

R how to pass NULL for optional parameters to function (e.g. in for loop)

You can use a list instead:

formula <- list("~ my_constraint", NULL)

# for (i in formula) print(i)
#[1] "~ my_constraint"
#NULL

If your function accepts NULL for that argument, convert only the non-NULL entries to a formula inside the loop:

ordination_list <- list()
for (current_formula in formula) {
  tmp <- lapply(sample_subset_list,
                ordinate,
                method = "CCA",
                formula = if (is.null(current_formula)) NULL else as.formula(current_formula))
  ordination_list[[length(ordination_list) + 1]] <- tmp
}
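
The reason a list helps, as far as I can tell, is that c() silently drops NULL while list() keeps it as an element. A quick check (the variable names are made up):

```r
with_c <- c("~ my_constraint", NULL)       # NULL disappears
with_list <- list("~ my_constraint", NULL) # NULL is kept as the second element

length(with_c)
length(with_list)
is.null(with_list[[2]])
```

So a for loop over the list visits the NULL entry, whereas a loop over the c() vector would never see it.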

Subsetting rows by passing an argument to a function

The problem is that you pass the condition as a string and not as a real condition, so R can't evaluate it when you want it to.

If you still want to pass it as a string, you need to parse and evaluate it in the right place, for example:

cond = eval(parse(text=keep_rows))
raw_data = raw_data[cond,]

This should work, I think
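
Here is a self-contained sketch of the same idea. The data and the condition string are invented, and evaluating the parsed expression inside the data frame (the envir argument) is my assumption about what "the right place" means when the string uses bare column names.

```r
# Invented data and condition, for illustration only.
raw_data <- data.frame(id = 1:5, score = c(10, 40, 25, 60, 5))
keep_rows <- "score > 20"

# Parse the string into an expression, then evaluate it with the
# data frame's columns visible as variables.
cond <- eval(parse(text = keep_rows), envir = raw_data)
filtered <- raw_data[cond, ]
filtered
```

Keep in mind that eval(parse()) runs whatever code the string contains, so it is only reasonable for strings you control.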


