Use Lapply for Multiple Regression with Formula Changing, Not the Dataset

This should work:

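# Build each formula from the response name; eval() on a string is a no-op, so use as.formula()
fit <- lapply(myvars, function(dvar) lm(as.formula(paste0(dvar, " ~ wt")), data = Boston))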
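
As a self-contained sketch of the same pattern, the snippet below loops over response names while the dataset stays fixed. The built-in mtcars data, the response names, and the use of reformulate in place of paste0 are assumptions made for illustration; they are not the asker's data.

```r
# Hypothetical stand-in for the asker's data: the built-in mtcars dataset
myvars <- c("mpg", "hp", "qsec")

# One model per response; only the left-hand side of the formula changes
fits <- lapply(myvars, function(dvar) {
  lm(reformulate("wt", response = dvar), data = mtcars)
})
names(fits) <- myvars

# Collect the slope on wt from every model
slopes <- sapply(fits, function(m) coef(m)[["wt"]])
print(slopes)
```

Naming the list by response makes it easy to pull out a single model later, for example fits$mpg.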

Combining lapply and lm to run a regression for each date of panel-dataset

This is hard to test without the full data, but the following should work. The split function creates a list of data frames, one for each date, and the map_df function iterates over that list, fits a model to each data frame, and row-binds the per-model results into a single data frame.

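library(dplyr)      # %>%
library(purrr)      # map_df
library(lubridate)  # year

Data %>%
  split(.$Date) %>%
  map_df(~ {
    an.lm <- lm(R ~ DEF + TERM + SIZE + MOMENTUM + Liquidity + Duartion + VOLA + Rating, data = .x)
    the.coefficients <- coef(an.lm)
    # .x holds a single date, so take the year from its first row only
    the.results <- as.data.frame(cbind(year(.x$Date[1]), t(the.coefficients)))
    the.results
  })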
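
The split-then-fit idea can also be sketched in base R. The example below is an illustration only: it assumes the built-in mtcars data, with cyl standing in for the Date grouping column, and a much smaller formula than the one in the question.

```r
# Hypothetical grouping: one data frame per value of cyl (stand-in for Date)
by_group <- split(mtcars, mtcars$cyl)

# Fit the same formula to each piece and stack the coefficients row-wise
coefs <- do.call(rbind, lapply(by_group, function(d) {
  coef(lm(mpg ~ wt, data = d))
}))

# Row names identify the group each row of coefficients belongs to
print(coefs)
```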

Can I fit different regression models using mapply?

You don't need to evaluate a string here. You can pass the formula to lm as a string:

reg <- function(dependent, independent) {
  lm(paste0(dependent, "~", independent), data = iris)
}

Another way to construct the formula is with reformulate:

reg <- function(dependent, independent) {
  lm(reformulate(independent, dependent), data = iris)
}

Now you can call it with Map, which pairs the elements of the two vectors position by position:

Map(reg, dependent, independent)

#$Sepal.Length
#
#Call:
#lm(formula = reformulate(independent, dependent), data = iris)
#
#Coefficients:
#(Intercept)  Sepal.Width
#     6.5262      -0.2234
#
#$Sepal.Width
#
#Call:
#lm(formula = reformulate(independent, dependent), data = iris)
#
#Coefficients:
# (Intercept)  Sepal.Length
#     3.41895      -0.06188
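
The answer does not show how dependent and independent are defined. A minimal runnable sketch follows, assuming they are the two character vectors implied by the printed output; that definition is an inference, not something stated in the original.

```r
# Assumed inputs, inferred from the output shown in the answer
dependent   <- c("Sepal.Length", "Sepal.Width")
independent <- c("Sepal.Width", "Sepal.Length")

reg <- function(dependent, independent) {
  lm(reformulate(independent, dependent), data = iris)
}

# Map pairs the vectors element-wise and names the result after `dependent`
fits <- Map(reg, dependent, independent)

print(round(coef(fits[["Sepal.Length"]]), 4))
```

Because the first argument to Map is a character vector, the returned list is named by it, which is why the output is labelled $Sepal.Length and $Sepal.Width.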

How to choose variable pairs for regression analysis using lapply

Instead of working on the columns of a, you could operate on the names of the variables you want to process:

library(gee)  # provides gee()

set.seed(144)
varnames <- c("id", "a", "b", "c", "a.0", "b.0", "c.0", "d", "e", "f")
a <- as.data.frame(matrix(round(rnorm(40, mean = 5, sd = 3), 1), 4, 10))
colnames(a) <- varnames
# Loop over the response names and pair each one with its ".0" counterpart
model <- lapply(varnames[2:4], function(x) {
  gee(as.formula(paste0(x, " ~ ", x, ".0 + d + e + f")), data = a, id = id)
})
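
Before fitting anything, it can help to build the formulas on their own and inspect them. The sketch below does only that, using reformulate as an assumed alternative to paste0; it does not call gee.

```r
# Response names; each is paired with a predictor carrying the ".0" suffix
vars <- c("a", "b", "c")

formulas <- lapply(vars, function(v) {
  reformulate(c(paste0(v, ".0"), "d", "e", "f"), response = v)
})

# Turn the formulas back into text to check the variable pairing
formula_text <- vapply(formulas, function(f) deparse(f), character(1))
print(formula_text)
```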

Error using lapply to perform regression: variable lengths differ (found for 'x')

This is because glm does not substitute the value of x into the formula; it looks for a variable literally named x, whose length does not match the other variables. Assuming fsdata_dict1 contains the names of the variables you wish to include (as character strings), you can build the formula from each name instead:

glm_func <- function(x) {
  # Paste the variable name into the formula text, then convert it to a formula
  f <- as.formula(paste("binge_dr ~", x, "+ Age_in_Yrs + Race + SSAGA_Income + SSAGA_Educ"))
  return(glm(f, family = binomial, data = fdata_fs, na.action = na.pass))
}

lapply(fsdata_dict1, glm_func)
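
The same fix can be tried on data that ships with R. This is a sketch under stated assumptions: mtcars replaces the asker's data, am is used as the binary response, and the predictor names are chosen only for illustration.

```r
# Hypothetical predictor names; each is pasted into the formula text in turn
predictors <- c("wt", "hp")

fit_one <- function(x) {
  f <- as.formula(paste("am ~", x))
  glm(f, family = binomial, data = mtcars)
}

fits <- lapply(predictors, fit_one)

# Each model's second coefficient is named after the predictor, not "x"
coef_names <- sapply(fits, function(m) names(coef(m))[2])
print(coef_names)
```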

Na.action = exclude error in lapply

You have your parameters in the wrong order. To see the issue better try spacing your code out like this:

varlist <- names(df)[2:504]

models <- lapply(varlist, function(x) {
  lm(
    substitute(i ~ rmrf + smb + hml,
      na.action = na.exclude,
      list(i = as.name(x))
    ),
    data = df)
})

Notice how in the example that works

lm(Var2 ~ rmrf + smb + hml, na.action = na.exclude, data = df)

the na.action parameter is inside the lm call? In the example that doesn't work, it is inside the substitute call, so lm never receives it.
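
A corrected version, with na.action moved out of substitute and into lm, might look like the sketch below. The small simulated data frame is a stand-in for the asker's df and is purely hypothetical.

```r
# Hypothetical data with one missing response value
set.seed(1)
df <- data.frame(rmrf = rnorm(20), smb = rnorm(20), hml = rnorm(20),
                 Var1 = rnorm(20), Var2 = rnorm(20))
df$Var1[3] <- NA
varlist <- c("Var1", "Var2")

models <- lapply(varlist, function(x) {
  lm(
    substitute(i ~ rmrf + smb + hml, list(i = as.name(x))),  # formula only
    na.action = na.exclude,                                  # now an lm argument
    data = df
  )
})

# With na.exclude, residuals keep one slot per row, with NA for the dropped row
res1 <- residuals(models[[1]])
print(length(res1))
```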

robust linear regression with lapply

Your call is missing the data argument. lapply calls FUN with each member of the list as the first argument of FUN, but the first argument of rlm is the formula; data is the second.

The solution is to define an anonymous function.

lin_mod <- lapply(lst1, function(DF) MASS::rlm(formula = var1 ~ var2, data = DF))
summary(lin_mod[[1]])
#
#Call: rlm(formula = var1 ~ var2, data = DF)
#Residuals:
#    Min      1Q  Median      3Q     Max
#-18.707  -5.381   1.768   6.067   7.511
#
#Coefficients:
#            Value   Std. Error t value
#(Intercept) 19.6977  1.0872    18.1179
#var2         0.0092  0.0002    38.2665
#
#Residual standard error: 8.827 on 98 degrees of freedom
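
Since lst1 is not shown in the answer, here is a runnable sketch of the same anonymous-function pattern. It assumes the MASS package is installed and builds a hypothetical list of data frames by splitting mtcars on am.

```r
# Hypothetical list of data frames: mtcars split by transmission type
lst1 <- split(mtcars, mtcars$am)

# The anonymous function routes each list element to rlm's data argument
lin_mod <- lapply(lst1, function(DF) MASS::rlm(mpg ~ wt, data = DF))

# One column of coefficients per data frame
print(sapply(lin_mod, coef))
```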

Easily performing the same regression on different datasets

Alternatively, fit the model once and use update to refit it on a different dataset:

(fit <- lm(mpg ~ wt, data = mtcars))

# Call:
# lm(formula = mpg ~ wt, data = mtcars)
#
# Coefficients:
# (Intercept) wt
# 37.285 -5.344

update(fit, data = mtcars[mtcars$hp < 100, ])

# Call:
# lm(formula = mpg ~ wt, data = mtcars[mtcars$hp < 100, ])
#
# Coefficients:
# (Intercept) wt
# 39.295 -5.379

update(fit, data = mtcars[1:10, ])

# Call:
# lm(formula = mpg ~ wt, data = mtcars[1:10, ])
#
# Coefficients:
# (Intercept) wt
# 33.774 -4.285
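
update also combines with lapply when there are many datasets. The sketch below assumes the datasets are available as a list; splitting mtcars by cyl is an illustrative choice, not part of the original answer.

```r
fit <- lm(mpg ~ wt, data = mtcars)

# Hypothetical list of datasets that share the same columns
subsets <- split(mtcars, mtcars$cyl)

# Refit the same formula on each dataset
refits <- lapply(subsets, function(d) update(fit, data = d))

print(sapply(refits, coef))
```

One caveat of this approach: the stored call in each refit reads data = d, so the printed Call does not identify which subset was used. The list names carry that information instead.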

