Lapply with Anonymous Function Call to Svytable Results in Object 'X' Not Found

lapply with anonymous function call to svytable results in object 'x' not found

Ok, well, it seems the svytable function is picky and will only look up data in the design object. It doesn't seem to look for x in the enclosing environment. An alternative approach is to build the formula dynamically. Instead of passing in the columns of data themselves, we pass in the names of columns from the data.frame. We then plug those names into the formula, where they are resolved by the design object, which points to the original data.frame. Here's a bit of working code using your sample data:

lapply(names(dat)[1:9], function(x) round(prop.table(
    svytable(bquote(~.(as.name(x)) + seg_2), dat_weight),
    2), 3) * 100)

So here we use bquote to build the formula. The .() allows us to plug in expressions, and here we take the character value in x and convert it to a proper name object. Thus it goes from "r3a_9" to r3a_9.
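To see what the formula looks like before it reaches svytable, the construction can be tried on its own in base R. This is only a sketch: the column names below are made up for illustration (only seg_2 is taken from the code above), and no survey design is involved.

```r
# Hypothetical column names; only seg_2 comes from the example above.
cols <- c("r3a_1", "r3a_9")

# Build one unevaluated formula expression per column name.
formulas <- lapply(cols, function(x) bquote(~ .(as.name(x)) + seg_2))

# Turn each expression back into text to check the substitution.
shown <- vapply(formulas, function(f) paste(deparse(f), collapse = ""), character(1))
print(shown)
# [1] "~r3a_1 + seg_2" "~r3a_9 + seg_2"
```

Each character string has become a bare variable name inside the formula, which is what lets the design object resolve it.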

lapply to run two anonymous functions simultaneously

The best way would just be to have the function return both results in a list. Honestly, though, when things are getting complicated I find it better to create the function outside of the lapply. So this is what I would probably do:

myfun <- function(x){
  means <- svymean(as.formula(paste0('~interaction(', x, ')')), design, na.rm = TRUE)
  table <- svytable(as.formula(paste0('~interaction(', x, ')')), design)
  results <- list(svymean = means, svytable = table)
  return(results)
}

lapply(vars, myfun)

You obviously could just do this as an anonymous function like...

lapply(vars, function(x){
  means <- svymean(as.formula(paste0('~interaction(', x, ')')), design, na.rm = TRUE)
  table <- svytable(as.formula(paste0('~interaction(', x, ')')), design)
  results <- list(svymean = means, svytable = table)
  return(results)
})

You don't necessarily even need to store the intermediate results

lapply(vars, function(x){
  list(svymean = svymean(as.formula(paste0('~interaction(', x, ')')), design, na.rm = TRUE), svytable = svytable(as.formula(paste0('~interaction(', x, ')')), design))})

But hopefully you'll agree that isn't pretty.
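The shape of the returned object is easier to see with something that runs without a survey design. The sketch below swaps in base R's mean and table for svymean and svytable and uses the built-in mtcars data; the helper name summarise_var and the choice of columns are assumptions made purely for illustration.

```r
# Stand-in for the survey version: return two results per variable in a list.
summarise_var <- function(x) {
  list(mean = mean(mtcars[[x]]), table = table(mtcars[[x]]))
}

vars <- c("cyl", "gear")

# Naming the input vector carries the names over to the result list.
res <- lapply(setNames(vars, vars), summarise_var)

res$cyl$mean    # mean of the cyl column
res$cyl$table   # frequency table of the cyl column
```

Each element of res is itself a list with two components, so both results for a variable can be pulled out by name.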

Sapply with LM returns a Call function that Stargazer can't use. How do I change that?

stargazer has issues displaying the output for a list of models. A "hack" would be to get the data in long format before creating the models.

Since no data was shared, here's a way using the mtcars dataset, keeping only its first 3 columns. In place of your x column I am using disp here.

library(stargazer)

df <- mtcars[1:3]
df1 <- tidyr::pivot_longer(df, cols = -disp)
list_df <- split(df1, df1$name)

lm_model_list <- lapply(list_df, function(x) lm(disp~value, x))

Output -

#For one model
stargazer(lm_model_list$cyl, type = 'text')

===============================================
                        Dependent variable:
                    ---------------------------
                               disp
-----------------------------------------------
value                        62.599***
                              (5.469)

Constant                    -156.609***
                             (35.181)

-----------------------------------------------
Observations                    32
R2                             0.814
Adjusted R2                    0.807
Residual Std. Error      54.385 (df = 30)
F Statistic           130.999*** (df = 1; 30)
===============================================
Note:               *p<0.1; **p<0.05; ***p<0.01

#For list of models
stargazer(lm_model_list, type = 'text')

==========================================================
                                  Dependent variable:
                              ----------------------------
                                          disp
                                   (1)            (2)
----------------------------------------------------------
value                           62.599***      -17.429***
                                 (5.469)        (1.993)

Constant                       -156.609***     580.884***
                                (35.181)        (41.740)

----------------------------------------------------------
Observations                       32             32
R2                                0.814          0.718
Adjusted R2                       0.807          0.709
Residual Std. Error (df = 30)     54.385         66.863
F Statistic (df = 1; 30)        130.999***     76.513***
==========================================================
Note:                          *p<0.1; **p<0.05; ***p<0.01
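If tidyr is not available, the same long-format step can be approximated in base R. The sketch below is an alternative, not the code used above: it relies on stack(), so the predictor column is called values and the grouping column ind rather than value and name. It only fits the models and prints the coefficients; passing the list to stargazer would work the same way as before.

```r
df <- mtcars[1:3]   # mpg, cyl, disp

# Long format: one row per (disp, predictor value, predictor name).
long <- data.frame(disp = rep(df$disp, times = 2), stack(df[c("mpg", "cyl")]))

# One data frame per predictor, then one model per data frame.
list_df <- split(long, long$ind)
lm_model_list <- lapply(list_df, function(d) lm(disp ~ values, data = d))

# Coefficients should agree with the tables above (62.599 for cyl).
round(coef(lm_model_list$cyl), 3)
```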

can lapply not modify variables in a higher scope

I discussed this issue in this related question: "Is R’s apply family more than syntactic sugar". You will notice that if you look at the function signature for for and apply, they have one critical difference: a for loop evaluates an expression, while an apply loop evaluates a function.

If you want to alter things outside the scope of an apply function, then you need to use <<- or assign. Or more to the point, use something like a for loop instead. But you really need to be careful when working with things outside of a function because it can result in unexpected behavior.
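A small sketch of the difference, using a made-up counter variable: a plain assignment inside the function passed to lapply only touches a local copy, <<- reaches the enclosing scope, and a for loop works in the calling scope directly.

```r
counter <- 0

# Plain assignment: creates a local 'counter' inside each call.
invisible(lapply(1:3, function(i) counter <- counter + i))
after_local <- counter      # still 0

# Superassignment: modifies 'counter' in the enclosing scope.
invisible(lapply(1:3, function(i) counter <<- counter + i))
after_super <- counter      # now 6

# A for loop evaluates its body in the current scope.
total <- 0
for (i in 1:3) total <- total + i

print(c(after_local, after_super, total))
# [1] 0 6 6
```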

In my opinion, one of the primary reasons to use an apply function is explicitly because it doesn't alter things outside of it. This is a core concept in functional programming, wherein functions avoid having side effects. This is also a reason why the apply family of functions can be used in parallel processing (and similar functions exist in the various parallel packages such as snow).

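Lastly, the right way to run your code example is to pass the parameters into your function and assign the output back, like so:

mat <- matrix(0, nrow = 10, ncol = 1)
# the function only changes a local copy of mat, so rebuild mat from the
# returned values (unlist() flattens the list that lapply returns)
mat <- matrix(unlist(lapply(1:10, function(i, mat) { mat[i,] <- rnorm(1, mean = i) }, mat = mat)), ncol = 1)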

It is always best to be explicit about a parameter when possible (hence the mat=mat) rather than inferring it.

Table in r to be weighted

Try this

GDAtools::wtable(df$sex, df$age, w = df$wgt)

Output

       0-15 16-29 30-44 45+ NA tot
Female   56    73    60  76  0 265
Male     76    99   106  90  0 371
NA        0     0     0   0  0   0
tot     132   172   166 166  0 636

Update

In case you do not want to install the whole package, here are two essential functions you need:

wtable and dichotom

Source them and you should be able to use wtable without any problem.
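If neither installing nor sourcing is an option, base R's xtabs can also sum weights into a cross-tabulation when the weight column is placed on the left-hand side of the formula. This is a different technique from wtable, shown only as a sketch; the toy data frame below is invented and reuses the column names sex, age and wgt from the call above.

```r
# Invented toy data with the same column names as in the question.
df <- data.frame(
  sex = c("Female", "Male", "Female", "Male", "Female"),
  age = c("0-15", "0-15", "16-29", "16-29", "0-15"),
  wgt = c(1.5, 2, 0.5, 1, 2.5)
)

# Weighted counts: each cell is the sum of wgt for that sex/age combination.
wt <- xtabs(wgt ~ sex + age, data = df)

# Add row and column totals, similar to the "tot" margins above.
addmargins(wt)
```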

Error using dynamic variable specification in R survey function svychisq()

It looks like svychisq doesn't evaluate its first parameter in the same way that svydesign does. The bquote call returns a language object that is not being evaluated into a proper formula. You can call eval yourself to overcome that issue.

svychisq(eval(bquote(~.(as.name(rowvar)) + .(as.name(colvar)) )), dstrat)
# Pearson's X^2: Rao & Scott adjustment
#
# data: svychisq(eval(bquote(~.(as.name(rowvar)) + .(as.name(colvar)))), dstrat)
# F = 77.2769, ndf = 1, ddf = 197, p-value = 7.364e-16

You could also consider building the formula as a string:

svychisq(as.formula(paste("~", rowvar, "+", colvar)), dstrat)
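The difference between the unevaluated bquote result and a real formula can be checked without the survey package. In the sketch below the values of rowvar and colvar are hypothetical placeholders, since the original values are not shown here.

```r
# Hypothetical variable names standing in for the real ones.
rowvar <- "sch.wide"
colvar <- "stype"

# bquote alone gives an unevaluated call, not a formula.
expr <- bquote(~ .(as.name(rowvar)) + .(as.name(colvar)))
inherits(expr, "formula")    # FALSE

# Evaluating the call, or parsing a string, both give a formula object.
f_eval   <- eval(expr)
f_string <- as.formula(paste("~", rowvar, "+", colvar))
inherits(f_eval, "formula")  # TRUE

# Both routes describe the same formula.
identical(deparse(f_eval), deparse(f_string))  # TRUE
```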

Index based assignment with apply in R

You can use matrix indexing, from ?"[":

A third form of indexing is via a numeric matrix with the one column
for each dimension: each row of the index matrix then selects a single
element of the array, and the result is a vector. Negative indices are
not allowed in the index matrix. NA and zero values are allowed: rows
of an index matrix containing a zero are ignored, whereas rows
containing an NA produce an NA in the result.

# construct a matrix representing the index where the value should be one
idx <- with(df, cbind(rep(seq_along(response), lengths(response)), unlist(response)))

idx
#     [,1] [,2]
#[1,]    1    1
#[2,]    2    1
#[3,]    2    2
#[4,]    3    2
#[5,]    3    3

# do the assignment
df[idx] <- 1

df
#  a b c response
#1 1 0 0        1
#2 1 1 0     1, 2
#3 0 1 1     2, 3
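Because the input data is not shown above, here is a self-contained sketch of the same idea on a plain numeric matrix. The response list is reconstructed from the printed output and the object names m and response are assumptions for illustration.

```r
# Indicator matrix to fill in, one column per possible response.
m <- matrix(0, nrow = 3, ncol = 3, dimnames = list(NULL, c("a", "b", "c")))

# Assumed responses per row, matching the output shown above.
response <- list(1, c(1, 2), c(2, 3))

# Two-column index matrix: row number, then column number.
idx <- cbind(rep(seq_along(response), lengths(response)), unlist(response))

# Each row of idx selects exactly one cell of m.
m[idx] <- 1
m
```

Each (row, column) pair in idx marks one cell, so no loop or apply call is needed for the assignment.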

