Returning Anonymous Functions from Lapply - What Is Going Wrong

Returning anonymous functions from lapply - what is going wrong?

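R passes arguments as promises, not as values. A promise is forced when it is first evaluated, not when it is passed, and by that time the index has changed if one uses the code in the question. (Since R 3.2.0, lapply and the other apply-style functions force their arguments themselves, so the problem no longer shows up there, but it can still occur with hand-written function factories.) The code can be written as follows to force the promise at the time the outer anonymous function is called and to make the intent clear to the reader: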

pl <- lapply(1:3, function(y) { force(y); function(x) pow(x,y) } )
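
The effect of lazy evaluation can be reproduced without lapply. The following is a minimal sketch; make_power, make_power_forced and k are names made up for illustration and do not come from the question.

```r
# Closure factory WITHOUT force(): y stays an unevaluated promise
make_power <- function(y) function(x) x^y

# Closure factory WITH force(): y is evaluated as soon as the factory is called
make_power_forced <- function(y) {
  force(y)
  function(x) x^y
}

k <- 2
lazy  <- make_power(k)         # the promise for y is not evaluated yet
eager <- make_power_forced(k)  # y is fixed at 2 here
k <- 3                         # change the variable the promise refers to

lazy_result  <- lazy(2)   # promise forced only now, so y is 3
eager_result <- eager(2)  # y was forced while k was still 2
print(c(lazy = lazy_result, eager = eager_result))
```

Here lazy(2) gives 8 and eager(2) gives 4, because only the forced version captured k before it changed.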

Anonymous function in lapply

Instead of going over the data frame directly you could switch things around and have lapply go over a vector of the column names,

data(iris)

lapply(colnames(iris), function(x) c(class(iris[[x]]), x))

or over an index for the columns, referencing the data frame.

lapply(seq_len(ncol(iris)), function(x) c(class(iris[[x]]), names(iris[x])))

Notice the use of both single and double square brackets.

iris[[n]] extracts the nth element of the list iris (a data frame is just a particular kind of list), returning the column vector itself without the data frame wrapper or the column name, which makes something like mean(iris[[1]]) possible.

iris[n] returns a one-column data frame holding the nth element, with the column name intact, which makes something like names(iris[1]) possible.
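
The difference between the two kinds of brackets can be checked directly. This is a small sketch using the built-in iris data; the variable names col_vector, col_frame, by_name and by_index are chosen here for illustration.

```r
# iris ships with R in the datasets package, so no explicit loading is needed
col_vector <- iris[[5]]  # the column itself: a factor
col_frame  <- iris[5]    # a one-column data frame

is.factor(col_vector)      # TRUE
is.data.frame(col_frame)   # TRUE
names(col_frame)           # "Species"
is.null(names(iris[[1]]))  # TRUE: the bare numeric vector carries no column name

# Iterating over names and iterating over indices give the same result
by_name  <- lapply(colnames(iris), function(x) c(class(iris[[x]]), x))
by_index <- lapply(seq_len(ncol(iris)),
                   function(x) c(class(iris[[x]]), names(iris[x])))
identical(by_name, by_index)  # TRUE
```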

Using anonymous functions with lapply/sapply in R?

You have a few options here, let's start with aggregate - not what you asked for but I want to keep your attention high ;)

aggregate(saleprice ~ car_names, cars1, max)
#  car_names saleprice
#1  Corvette     72000
#2      Golf     22000
#3    Malibu     33000

This returns a data.frame (which you can easily split if you need a list).

aggregate is similar to tapply, which comes next:

tapply(cars1$saleprice, cars1$car_names, FUN = max)
#Corvette     Golf   Malibu 
#   72000    22000    33000

Or try by and which.max

by(cars1, cars1$car_names, FUN = function(x) x[which.max(x$saleprice), ])
#cars1$car_names: Corvette
#  car_names saleprice
#3  Corvette     72000
#-------------------------------
#cars1$car_names: Golf
#  car_names saleprice
#6      Golf     22000
#-------------------------------
#cars1$car_names: Malibu
#  car_names saleprice
#7    Malibu     33000

Finally, you can also use lapply and split (for which by is somewhat of a shorthand):

lapply(split(cars1, cars1$car_names), function(x) x[which.max(x$saleprice), ])
#$Corvette
#  car_names saleprice
#3  Corvette     72000

#$Golf
#  car_names saleprice
#6      Golf     22000

#$Malibu
#  car_names saleprice
#7    Malibu     33000
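
Since the question asks about sapply as well, here is a sketch of the same split-then-apply idea with sapply. The cars1 data frame below is a hypothetical stand-in, because the original data is not shown in this answer; only the column names car_names and saleprice are taken from the output above.

```r
# Hypothetical stand-in for cars1; the real data is not shown in the answer
cars1 <- data.frame(
  car_names = c("Corvette", "Golf", "Malibu", "Corvette", "Golf", "Malibu"),
  saleprice = c(65000, 22000, 28000, 72000, 19000, 33000)
)

# sapply simplifies the per-group results into a named vector
max_price <- sapply(split(cars1$saleprice, cars1$car_names), max)

# An anonymous function can return several values per group;
# sapply then binds the results into a matrix with one column per group
price_range <- sapply(split(cars1$saleprice, cars1$car_names),
                      function(p) c(min = min(p), max = max(p)))

print(max_price)
print(price_range)
```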

error with lapply on anonymous ggplot function

You are missing a parenthesis to close ggtitle(), because names() also requires a closing paren.

if (inputMethodP == "WITHINFILE") {
  par(mfrow = c(5, listPortions))
  plotList <- lapply(RangeStatResultP, function(listPart) {
    ggplot(matrixPart, aes(x = factor(Var2), y = value)) +
      geom_violin()+
      ggtitle(names(listPart)) +
      xlab(listnum) +
      ylab("Coverage") +
      # fun and show.legend replace the deprecated fun.y and show_guide (ggplot2 >= 3.3.0)
      stat_summary(fun = median, geom = "point",
                   position = position_dodge(width = .9), size = 6, shape = 4,
                   show.legend = FALSE)
  })
}

Why does this lapply() function applied to a dataframe not produce the same results as its for-loop equivalent?

To make your lapply code work just replace <- with <<-:

DF_1[row, column] <<- DF_1[row, column] + val

Please see ?assignOps for more info.

However, again, I wouldn't recommend lapply in this case (<<- should be avoided in general).
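
The scoping difference between the two operators can be seen in a small sketch. The data frame df and the result names after_local and after_super are made up for illustration and are not the DF_1 from the question.

```r
df <- data.frame(a = c(0, 0))

# With <-, each call modifies a copy of df that is local to the anonymous function
invisible(lapply(1:2, function(i) df[i, "a"] <- df[i, "a"] + 1))
after_local <- df$a   # unchanged

# With <<-, the assignment targets df in the enclosing environment
invisible(lapply(1:2, function(i) df[i, "a"] <<- df[i, "a"] + 1))
after_super <- df$a   # each element incremented once

print(after_local)
print(after_super)
```

A for loop runs in the calling environment, which is why the plain <- works there but not inside the function passed to lapply.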

Here is a data.table approach:

library(data.table)

DT <- setDT(data.frame(
  ID = c(1,1,1,2,2,2,3,3,3),
  Period = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
  Values = c(5, 10, 15, 50, 2, 4, 3, 6, 9),
  Flags = c("X1","X0","X2","X0","X2","X0", "X2","X1","X1")
))

unique_flags <- unique(DT$Flags)
all_flags <- setDT(expand.grid(list(first_flag = unique_flags, last_flag = unique_flags)))

resultDT <- dcast(
  data = DT[, .(first_flag = first(Flags), last_flag = last(Flags), first_value = first(Values)), by = ID][
  all_flags, on = c("first_flag", "last_flag")],
  last_flag ~ first_flag,
  fun.aggregate = sum,
  value.var = "first_value"
  )

for (col_i in seq_len(ncol(resultDT))){
  set(resultDT, which(is.na(resultDT[[col_i]])), col_i, 0)
}
print(resultDT)

Result:

   last_flag X0 X1 X2
1:        X0 50  0  0
2:        X1  0  0  3
3:        X2  0  5  0

# step by step ------------------------------------------------------------
library(data.table)

DT <- setDT(data.frame(
  ID = c(1,1,1,2,2,2,3,3,3,4,4,4),
  Period = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3),
  Values = c(5, 10, 15, 50, 2, 4, 3, 6, 9, 3, 6, 9),
  Flags = c("X1","X0","X2","X0","X2","X0", "X2","X1","X1", "X2","X1","X1")
))

unique_flags <- unique(DT$Flags)
all_flags <- setDT(expand.grid(list(first_flag = unique_flags, last_flag = unique_flags)))

resultDT <- DT[, .(first_flag = first(Flags), last_flag = last(Flags), first_value = first(Values)), by = ID] # find relevant flags
resultDT <- resultDT[all_flags, on = c("first_flag", "last_flag")] # merge all combinations
resultDT <- dcast(resultDT, last_flag ~ first_flag, fun.aggregate = sum, value.var = "first_value") # dcast
for (col_i in seq_len(ncol(resultDT))){
  set(resultDT, which(is.na(resultDT[[col_i]])), col_i, 0)
}
print(resultDT)

lapply with anonymous function call to svytable results in object 'x' not found

OK, it seems the svytable function is picky and will only look up data in the design object; it doesn't seem to look for x in the enclosing environment. An alternative approach is to build the formula dynamically: instead of passing in the columns of data themselves, we pass in the names of columns from the data.frame. We then plug those names into the formula, where they are resolved by the design object, which points to the original data.frame. Here's a bit of working code using your sample data:

lapply(names(dat)[1:9], function(x) round(prop.table(
    svytable(bquote(~.(as.name(x)) + seg_2), dat_weight),
2),3)*100)

So here we use bquote to build the formula. The .() allows us to plug in expressions, and here we take the character value in x and convert it to a proper name object. Thus it goes from "r3a_9" to r3a_9.
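
The formula construction can be inspected on its own, without the survey package. This is only a sketch of the bquote step; the names f_call, f and f2 are chosen here, and reformulate() is shown as a possible alternative from the stats package rather than something the answer used.

```r
x <- "r3a_9"  # a column name held as a character string

# bquote() substitutes the symbol r3a_9 into the quoted formula
f_call <- bquote(~ .(as.name(x)) + seg_2)

# The result is an unevaluated call; evaluating it gives a formula object
f <- eval(f_call)

# reformulate() builds a comparable one-sided formula from strings
f2 <- reformulate(c(x, "seg_2"))

all.vars(f)   # "r3a_9" "seg_2"
all.vars(f2)  # "r3a_9" "seg_2"
```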

How to use ‘apply’ with function that returns wrong dimensions (R)

Maybe you can get some inspiration from the following example.

The anonymous function calls the hard-coded function dummy and then a second function passed in through its argument fun, here set to log.

dummy <- function(x){
  a <- max(x)
  b <- min(x)
  c(a, b)
}

z <- matrix(c(2,3,4,5,6,7,8,9,10), nrow = 3)
apply(z, 2, dummy)

apply(z, 2, function(x, fun){
    y <- dummy(x)
    y <- fun(y[2])
    y
}, fun = log)
#[1] 0.6931472 1.6094379 2.0794415

Another possibility would be to have itworks take two arguments: the vector x and a function chosen on a case-by-case basis.

itworks <- function(x, fun){
  qq <- quantile(x)
  fun(qq[2])
}

apply(z, 2, itworks, fun = sin)
#[1]  0.5984721 -0.7055403  0.7984871

The point here is that the arguments a function takes can be anything, including other functions. apply has 3 mandatory arguments, X, MARGIN and FUN, plus the dots argument. In the dots argument you can pass whatever you want as long as FUN accepts it.
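
On the dimensions themselves: when FUN returns more than one value, apply stores each result as a column, whichever margin was used. The sketch below uses a made-up matrix m (not the z from above) so that the two dimensions differ and the orientation is visible.

```r
m <- matrix(1:8, nrow = 4)  # 4 rows, 2 columns

# range() returns two values per row; apply() stores each result as a column
by_row <- apply(m, 1, range)
dim(by_row)     # 2 4

# Transpose to get one row of output per row of input
by_row_t <- t(by_row)
dim(by_row_t)   # 4 2

# Extra arguments after FUN travel through the dots to FUN
q90 <- apply(m, 2, quantile, probs = 0.9)
```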

lapply-ing with the $ function

This is documented in ?lapply, in the "Note" section (emphasis mine):

For historical reasons, the calls created by lapply are unevaluated,
and code has been written (e.g. bquote) that relies on this. This
means that the recorded call is always of the form FUN(X[[0L]], ...),
with 0L replaced by the current integer index. This is not
normally a problem, but it can be if FUN uses sys.call or
match.call or if it is a primitive function that makes use of the
call. This means that it is often safer to call primitive functions
with a wrapper, so that e.g. lapply(ll, function(x) is.numeric(x))
is required in R 2.7.1 to ensure that method dispatch for is.numeric
occurs correctly.
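
Following that advice, two ways of extracting an element from each list entry that do not depend on the call lapply records can be sketched as below. The list ll and its contents are hypothetical.

```r
ll <- list(list(a = 1, b = "x"), list(a = 2, b = "y"))  # hypothetical data

# Wrapping $ in an anonymous function avoids relying on the recorded call
via_wrapper <- lapply(ll, function(x) x$a)

# [[ takes the element name as an ordinary argument, so it can be passed directly
via_bracket <- lapply(ll, `[[`, "a")

identical(via_wrapper, via_bracket)  # TRUE
```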
