Fit a no-intercept model in caret
As discussed in a linked SO question (https://stackoverflow.com/a/41731117/7613376), this works in caret v6.0.76. The trace-based answer above no longer seems to work after code refactoring in caret:
library(caret)
caret_lmFit <- train(Sepal.Length ~ 0 + Petal.Length + Petal.Width, data = iris,
                     method = "lm",
                     tuneGrid = expand.grid(intercept = FALSE))
> caret_lmFit$finalModel
Call:
lm(formula = .outcome ~ 0 + ., data = dat)
Coefficients:
Petal.Length   Petal.Width
       2.856        -4.479
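As a quick sanity check outside caret, the same no-intercept fit can be sketched with base R's lm(). This is an illustration rather than part of the original answer, and the object name base_fit is made up here. The coefficients should agree with those of caret's finalModel, since both fits drop the intercept.

```r
# Base-R sketch: fit the same two predictors with the intercept removed.
base_fit <- lm(Sepal.Length ~ 0 + Petal.Length + Petal.Width, data = iris)

# Only the two slope coefficients are estimated; there is no "(Intercept)".
names(coef(base_fit))
```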
Logistic Regression in Caret - No Intercept?
There's a vignette on how to set up a custom model for caret. The solution below also shows why the intercept persists:
library(caret)
glm_wo_intercept = getModelInfo("glm",regex=FALSE)[[1]]
If you look at the fit function, there is a line that does:
glm_wo_intercept$fit
....
modelArgs <- c(list(formula = as.formula(".outcome ~ ."), data = dat), theDots)
...
So the intercept is there by default. You can change this line and run caret on this modified model:
glm_wo_intercept$fit = function(x, y, wts, param, lev, last, classProbs, ...) {
  dat <- if(is.data.frame(x)) x else as.data.frame(x)
  dat$.outcome <- y
  if(length(levels(y)) > 2) stop("glm models can only use 2-class outcomes")
  theDots <- list(...)
  if(!any(names(theDots) == "family"))
  {
    theDots$family <- if(is.factor(y)) binomial() else gaussian()
  }
  if(!is.null(wts)) theDots$weights <- wts
  # change the model here
  modelArgs <- c(list(formula = as.formula(".outcome ~ 0+."), data = dat), theDots)
  out <- do.call("glm", modelArgs)
  out$call <- NULL
  out
}
We fit the model:
data = data.frame(y = factor(runif(100) > 0.5), x = rnorm(100))
model <- train(y ~ 0 + x, data = data, method = glm_wo_intercept,
               family = binomial(),
               trControl = trainControl(method = "cv", number = 3))
predict(model,data.frame(x=0),type="prob")
  FALSE TRUE
1   0.5  0.5
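The 0.5/0.5 prediction at x = 0 is exactly what a logistic model without an intercept has to produce: the linear predictor is 0 there, and plogis(0) is 0.5. A minimal base-R sketch of the same check with glm(), without caret, might look like this. The names sim, glm_fit and p0 and the seed are illustrative assumptions, not part of the original answer.

```r
set.seed(1)  # arbitrary seed, only to make the simulated data reproducible
sim <- data.frame(y = factor(runif(100) > 0.5), x = rnorm(100))

# Logistic regression with the intercept removed via `0 +` in the formula.
glm_fit <- glm(y ~ 0 + x, data = sim, family = binomial())

# At x = 0 the linear predictor is 0, so the predicted probability is plogis(0).
p0 <- predict(glm_fit, newdata = data.frame(x = 0), type = "response")
p0  # equals 0.5
```

If the intercept were still in the model, this prediction would generally differ from 0.5, which makes it a convenient test that the modified fit function took effect.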
How to fit a model without an intercept using R tidymodels workflow?
You can use the formula argument to add_model() to override the terms of the model. This is typically used for survival and Bayesian models, so be extra careful that you know what you are doing here, because you are circumventing some of the guardrails of tidymodels by doing this:
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#> method from
#> required_pkgs.model_spec parsnip
mod <- linear_reg()
rec <- recipe(mpg ~ cyl + wt, data = mtcars)
workflow() %>%
  add_recipe(rec) %>%
  add_model(mod, formula = mpg ~ 0 + cyl + wt) %>%
  fit(mtcars)
#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Recipe
#> Model: linear_reg()
#>
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> 0 Recipe Steps
#>
#> ── Model ───────────────────────────────────────────────────────────────────────
#>
#> Call:
#> stats::lm(formula = mpg ~ 0 + cyl + wt, data = data)
#>
#> Coefficients:
#>   cyl     wt
#> 2.187  1.174
Created on 2021-09-01 by the reprex package (v2.0.1)
Using linear regression (lm) in R caret, how do I force the intercept through 0?
You can take advantage of the tuneGrid parameter in caret::train.
regressControl <- trainControl(method = "repeatedcv",
                               number = 4,
                               repeats = 5)
regress <- train(mpg ~ hp,
                 data = mtcars,
                 method = "lm",
                 trControl = regressControl,
                 tuneGrid = expand.grid(intercept = FALSE))
Use getModelInfo("lm", regex = FALSE)[[1]]$param to see all the things you could have tweaked in tuneGrid (in the lm case, the only tuning parameter is the intercept). It's silly that you can't simply rely on formula syntax, but alas.