Extract the Coefficients for the Best Tuning Parameters of a Glmnet Model in Caret

Extract the coefficients for the best tuning parameters of a glmnet model in caret

After a bit of playing with your code I find it very odd that glmnet train chooses different lambda ranges depending on the seed. Here is an example:

library(caret)
library(glmnet)
set.seed(13)
model.test <- caret::train(asthma ~ age + gender + bmi_p + m_edu + p_edu + f_color, data = x, method = "glmnet", 
                           family = "binomial", trControl = fitControl, tuneGrid = tuneGrid, 
                           metric = "ROC")

c(head(model.test$finalModel$lambda, 5), tail(model.test$finalModel$lambda, 5))
#output
 [1] 3.7796447301 3.4438715094 3.1379274562 2.8591626295 2.6051625017 0.0005483617 0.0004996468 0.0004552595 0.0004148155
[10] 0.0003779645

optimum lambda is:

model.test$finalModel$lambdaOpt
#output
#[1] 0.05

and this works:

coef(model.test$finalModel, model.test$finalModel$lambdaOpt)
#12 x 1 sparse Matrix of class "dgCMatrix"
                        1
(Intercept)   -0.03158974
age            0.03329806
genderX1      -1.24093677
bmi_p          1.65156913
m_eduX1        0.45314106
m_eduX2       -0.09934991
m_eduX3       -0.72360297
p_eduX1       -0.51949828
p_eduX2       -0.80063642
p_eduX3       -2.18231433
f_colorred     0.87618211
f_coloryellow -1.52699254

giving the coefficients at best alpha and lambda

when using this model to predict some y are predicted as X1 and some as X2

 [1] X1 X1 X0 X1 X1 X0 X0 X1 X1 X1 X0 X1 X1 X1 X0 X0 X0 X1 X1 X1 X1 X0 X1 X1
Levels: X0 X1

now with the seed you used

set.seed(1352)
model.test <- caret::train(asthma ~ age + gender + bmi_p + m_edu + p_edu + f_color, data = x, method = "glmnet", 
                           family = "binomial", trControl = fitControl, tuneGrid = tuneGrid, 
                           metric = "ROC")

c(head(model.test$finalModel$lambda, 5), tail(model.test$finalModel$lambda, 5))
#output
 [1] 2.699746e-01 2.459908e-01 2.241377e-01 2.042259e-01 1.860830e-01 3.916870e-05 3.568906e-05 3.251854e-05 2.962968e-05
[10] 2.699746e-05

lambda values are 10 times smaller and this gives empty coefficients since lambdaOpt is not in the range of tested lambda:

coef(model.test$finalModel, model.test$finalModel$lambdaOpt)
#output
12 x 1 sparse Matrix of class "dgCMatrix"
              1
(Intercept)   .
age           .
genderX1      .
bmi_p         .
m_eduX1       .
m_eduX2       .
m_eduX3       .
p_eduX1       .
p_eduX2       .
p_eduX3       .
f_colorred    .
f_coloryellow .

model.test$finalModel$lambdaOpt
#output
0.5

now when predicting upon this model only X0 is predicted (the first level):

predict(model.test, x)
#output
 [1] X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0 X0
Levels: X0 X1

quite odd behavior, probably worth reporting

get coefficients of cross-validated glmnet model in caret

I think you forgot to show the code for sample_df. However, assuming the following you can access it as follows:

library(caret)
x = matrix(rnorm(500), ncol=10)
y = rnorm(100)
sample_df = cbind.data.frame(y,x)
control = trainControl(
  method="LOOCV",
  allowParallel = TRUE,
  number = nrow(sample_df), 
  verboseIter = FALSE,
  returnData = FALSE
)

my_elasticnet <- train(sample_df[2:11], sample_df$y,
                       method = "glmnet",
                       preProc = c("center", "scale"),
                       trControl = control)

my_elasticnet$finalModel$beta

If you just look at names, you'll get everything you need regarding the final model:

> names(my_elasticnet$finalModel)
 [1] "a0"          "beta"        "df"          "dim"         "lambda"      "dev.ratio"   "nulldev"    
 [8] "npasses"     "jerr"        "offset"      "call"        "nobs"        "lambdaOpt"   "xNames"     
[15] "problemType" "tuneValue"   "obsLevels"   "param"

EDIT: In Response to comment

The final model depends what level of alpha and lambda you select. There are 66 such such parameters. If you want to choose the one the machine thinks is best, you can do:

coef(my_elasticnet$finalModel, my_elasticnet$bestTune$lambda)

That will give you just a 11x1 vector that you're looking for.

Trying to extract coefficients from glmnet model returns NULL or type must be either raw or prob error

To get the coefficients from the best model, you can use:

coef(model_glmnet_pca$finalModel, model_glmnet_pca$finalModel$lambdaOpt)

See e.g. this link on using regularised regression models with caret.

r: coefficients from glmnet and caret are different for the same lambda

The reason is the fact the exact lambda you specified was not used by caret. You can check this by:

ridge_full$finalModel$lambda

closest values are 261.28915 and 238.07694.

When you do

coef(ridge_full$finalModel, s = ridge.caret$bestTune$lambda)

where s is 242.0128 the coefficients are interpolated from the coefficients actually calculated.

Wheres when you provide lambda to the glmnet call the model returns exact coefficients for that lambda which differ only slightly from the interpolated ones caret returns.

Why this happens:

when you specify one alpha and one lambda for a fit on all of the data caret will actually fit:

   fit = function(x, y, wts, param, lev, last, classProbs, ...) {
                    numLev <- if(is.character(y) | is.factor(y)) length(levels(y)) else NA

                    theDots <- list(...)

                    if(all(names(theDots) != "family")) {
                      if(!is.na(numLev)) {
                        fam <- ifelse(numLev > 2, "multinomial", "binomial")
                      } else fam <- "gaussian"
                      theDots$family <- fam
                    }

                    ## pass in any model weights
                    if(!is.null(wts)) theDots$weights <- wts

                    if(!(class(x)[1] %in% c("matrix", "sparseMatrix")))
                      x <- Matrix::as.matrix(x)

                    modelArgs <- c(list(x = x,
                                        y = y,
                                        alpha = param$alpha),
                                   theDots)

                    out <- do.call(glmnet::glmnet, modelArgs)
                    if(!is.na(param$lambda[1])) out$lambdaOpt <- param$lambda[1]
                    out
                  }

this was taken from here.

in your example this translates to

fit <- glmnet::glmnet(x, y,
                       alpha = 0)

lambda <- unique(fit$lambda)

these lambda values correspond to ridge_full$finalModel$lambda:

all.equal(lambda, ridge_full$finalModel$lambda)
#output
TRUE

how to repeat hyperparameter tuning (alpha and/or lambda) of glmnet in mlr3

Repeated hyperparameter tuning (alpha and lambda) of glmnet can be done using the SECOND mlr3 approach as stated above.
The coefficients can be extracted with stats::coef and the stored values in the AutoTuner

coef(tune1$model$learner$model, alpha=tune1$tuning_result$alpha,s=tune1$tuning_result$s)
# 9 x 1 sparse Matrix of class "dgCMatrix"
# 1
# (Intercept) -1.6359082102
# age          0.0075541841
# glucose      0.0044351365
# insulin      0.0005821515
# mass         0.0077104934
# pedigree     0.0911233031
# pregnant     0.0164721202
# pressure     0.0007055435
# triceps      0.0056942014
coef(tune2$model$learner$model, alpha=tune2$tuning_result$alpha,s=tune2$tuning_result$s)
# 9 x 1 sparse Matrix of class "dgCMatrix"
# 1
# (Intercept) -1.6359082102
# age          0.0075541841
# glucose      0.0044351365
# insulin      0.0005821515
# mass         0.0077104934
# pedigree     0.0911233031
# pregnant     0.0164721202
# pressure     0.0007055435
# triceps      0.0056942014

Building a nested logistic regression model using caret, glmnet and a (nested) cross-validation

Thanks to @missuse I could solve my questions:

Cross-validation does not help get the most accurate model.
This (resp. my) misunderstanding is discussed beautifully in the blog post: The "Cross-Validation - Train/Predict" misunderstanding

The problem with seed depending variations of predictor's coefficients of glmnet in small datasets can be migitated with repeated cross-validation (ie, "repeatedcv" in caret::trainControl as described in the comments here)

Stacked learners (in my case stacked glmnet and glm) are usually built using out of fold predictions from lower level lerners. This could be done using the mlr3 package as described in this blog post: Tuning a stacked learner. Since this was not an initial question, I opened a new question here.

How to identify the non-zero coefficients in final caret elastic net model -

Since you do not provide any example data I post an example based on the iris built-in dataset, slightly modified to fit better your need (a binomial outcome).

First, modify the dataset

library(caret)
set.seed(5)#just for reproducibility
iris
irisn <- iris[iris$Species!="virginica",]
irisn$Species <- factor(irisn$Species,levels = c("versicolor","setosa"))
str(irisn)
summary(irisn)

fit the model (the caret function for setting controls parameters for train is trainControl, not train_control)

tr_control = trainControl(method="cv",number=10)
model1  <-  caret::train(Species~.,
                      data=irisn,
                      method="glmnet",
                      trControl=tr_control,
                      family = "binomial")

You can extract the coefficients of the final model as you already did:

data.frame(as.matrix(coef(model1$finalModel, model1$bestTune$lambda)))

Also here the model did not reduce any coefficients to 0, but what if we add a random variable that explains nothing about the outcome?

irisn$new1 <- runif(nrow(irisn))
model2  <-  caret::train(Species~.,
                         data=irisn,
                         method="glmnet",
                         trControl=tr_control,
                         family = "binomial")
var <- data.frame(as.matrix(coef(model2$finalModel, model2$bestTune$lambda)))

Here, as you can see, the coefficient of the new variable was turning to 0. You can extract the variable name retained by the model with:

rownames(var)[var$X1!=0]

Finally, the accuracy metrics from the test set can be obtained with

confusionMatrix(predict(model1,test),test$outcome)

Retrieve coefficients from R's train function in caret using forward regression and/or LARS

I'll show you how I do it with an example:

library(data.table)
n <- 1000
x1 <- runif(n,min=-10,max=10)
x2 <- runif(n,min=-10,max=10)
x3 <- runif(n,min=-10,max=10)
x4 <- runif(n,min=-10,max=10)
x5 <- runif(n,min=-10,max=10)
y1 <- 30 + x1 + 4*x2 + x3
synthetic <- data.table(x1=x1,x2=x2,x3=x3,x4=x4,x5=x5,y=y1)
library(caret)
library(lars)
ctrl <- trainControl(method = "cv", savePred=T, number=3)
fractionGrid <- expand.grid (fraction=seq(0,1,(1/(ncol(widedt)-1))))
cvresult <- train(y~.,
                  data=synthetic,
                  method = "lars",
                  trControl = ctrl,
                  metric="RMSE",
                  tuneGrid=fractionGrid,
                  use.Gram=FALSE)
coeffs <- predict.lars(cvresult$finalModel,type="coefficients")
models <- as.data.table(coeffs$coefficients)
winnermodelscoeffs <- models[which(coeffs$fraction==cvresult$bestTune$fraction)]

Extract the Coefficients for the Best Tuning Parameters of a Glmnet Model in Caret