Error When I Try to Predict Class Probabilities in R - Caret

Error when trying to get class probabilities in R's caret package

There are a couple of issues.
First, this approach requires the class levels of the factor to be valid R variable names,
so the first step is to rename the levels of the carb factor so that they start with a letter:

mtcars$carb <- as.factor(paste0("c",mtcars$carb))
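As a quick check before training, here is a minimal sketch using only base R. make.names() returns a string unchanged only if it is already a syntactically valid name, so comparing the levels against its output shows which ones would cause trouble. The names carb_raw and carb_fixed are just for illustration.

```r
# Levels of carb as they come in mtcars are bare numbers: "1", "2", "3", ...
carb_raw <- factor(mtcars$carb)

# A level is a valid R name only if make.names() leaves it untouched
is_valid_before <- levels(carb_raw) == make.names(levels(carb_raw))

# Same check after prefixing every level with a letter
carb_fixed <- factor(paste0("c", mtcars$carb))
is_valid_after <- levels(carb_fixed) == make.names(levels(carb_fixed))

print(is_valid_before)
print(is_valid_after)
```

If any entry of the second vector were FALSE, the levels would still need renaming before asking for class probabilities.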

Second, the classProbs argument of trainControl() defaults to FALSE.
It needs to be TRUE in your case.

library("caret")

tuneGrid <- expand.grid(mtry = c(10), min.node.size = c(1), splitrule = "extratrees")
rf_model <- train(carb ~ ., data = mtcars, method = "ranger",
                  trControl = trainControl(method = "none", classProbs = TRUE),
                  tuneGrid = tuneGrid)

classprobs <- predict(rf_model, newdata = mtcars, type = "prob")

support vector machine train caret error kernlab class probability calculations failed; returning NAs

In the trainControl() call, you have to set classProbs = TRUE if you want the class probabilities returned.

svmFit <- train(class ~ .,
                data = trainset,
                method = "svmRadial",
                preProc = c("center", "scale"),
                tuneGrid = svmTuneGrid,
                trControl = trainControl(method = "repeatedcv", repeats = 5,
                                         classProbs = TRUE))

predictedClasses <- predict(svmFit, testset)
predictedProbs <- predict(svmFit, newdata = testset, type = "prob")

This gives the probabilities of being in the Bad or Good class for each row of the test dataset:

print(predictedProbs)
        Bad      Good
1 0.2302979 0.7697021
2 0.7135050 0.2864950
3 0.2230889 0.7769111

EDIT

To answer your new question: alphaindex(svmFit$finalModel) gives the positions of the support vectors in your original data set, and coef(svmFit$finalModel) gives the corresponding coefficients.

Error when using predict() function on caret models in R

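You can use the following code. Note that predict() on a caret train object expects the argument newdata, while new_data is the spelling used by parsnip/tidymodels models, so use whichever matches the class of logistic_model.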

log_class_predictions <- predict(logistic_model, new_data = pd_test)

log_predictions <- predict(logistic_model, new_data = pd_test, type = "prob")

How to get the class probabilities AND predictions from caret::predict()?

I am turning my comment into an answer.
Once you have generated your prediction table of probabilities, you don't actually need to run the prediction function a second time to get the classes. You can add the class column by applying a simple which.max function (which runs fast imo). For each row, this assigns the name of the column (one of the three c("setosa", "versicolor", "virginica")) that has the highest probability.

This gives a table with both pieces of information, as requested:

library(dplyr)
predict(knnFit, newdata = iris, type = "prob") %>%
  mutate(class = names(.)[apply(., 1, which.max)])
# a random sample of the resulting table:
####     setosa versicolor virginica      class
#### 18       1  0.0000000 0.0000000     setosa
#### 64       0  0.6666667 0.3333333 versicolor
#### 90       0  1.0000000 0.0000000 versicolor
#### 121      0  0.0000000 1.0000000  virginica

PS: this uses the pipe operator %>% from the magrittr package (also available through dplyr). The dot . stands for the result of the previous step wherever you want to reuse it.
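If you would rather avoid the pipe, the same idea can be sketched in base R. The small probs data frame below is typed in by hand from the four sample rows shown above, purely for illustration; in practice it would be the output of predict(..., type = "prob").

```r
# Hand-typed stand-in for the probability table returned by predict()
probs <- data.frame(
  setosa     = c(1, 0, 0, 0),
  versicolor = c(0, 0.6666667, 1, 0),
  virginica  = c(0, 0.3333333, 0, 1),
  row.names  = c("18", "64", "90", "121")
)

# For each row, find the column with the highest probability
# and look up that column's name
predicted_class <- names(probs)[apply(probs, 1, which.max)]

# Attach the predicted class next to the probabilities
probs$class <- predicted_class
print(probs)
```

Note that which.max picks the first column in case of a tie, so tied rows go to whichever class comes first in the table.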

Predict function from Caret package give an Error

Show us str(train) and str(test). I suspect the outcome variable is numeric, which makes train think that you are doing regression. That should also be apparent from printing model. Make it a factor if you want to do classification.

Max
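A minimal sketch of that check and fix in base R, assuming a made-up data frame train_df with a 0/1 outcome column y (both names, and the labels "neg" and "pos", are hypothetical and not taken from the question):

```r
# Made-up training data with a numeric 0/1 outcome
train_df <- data.frame(y = c(0, 1, 1, 0, 1),
                       x = c(1.2, 3.4, 2.2, 0.7, 2.9))

# str() shows the outcome is stored as a number, which reads as regression
str(train_df)
outcome_was_numeric <- is.numeric(train_df$y)

# Convert the outcome to a factor so it is treated as a class label
train_df$y <- factor(train_df$y, levels = c(0, 1), labels = c("neg", "pos"))
outcome_is_factor <- is.factor(train_df$y)

str(train_df)
```

The same conversion would need to be applied to the test set so that both data frames agree on the outcome type and its levels.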

Error using Caret Package for Random Forest (Regression)

You've specified classProbs=T in trainControl, which indicates class probabilities should be computed for a classification model (where the response variable consists of discrete class labels). However, that argument setting conflicts with your numeric response variable (which indicates a regression model will be trained), resulting in the error message that class probabilities cannot be computed for regression.

Since your description and numeric response variable indicate this is a regression problem, removing classProbs=T (the default setting is classProbs=F) from your code should address the error you're getting.

Caret and KNN in R: predict function gives error

The problem is your y variable. When you ask for class probabilities, train() and/or predict() put them into a data frame with one column per class. If the factor levels are not valid variable names, they are automatically changed (e.g. "0" becomes "X0"). See also this post.

If you change this line in your code it should work:

a[,1] = factor(a[,1], labels = c("no", "yes"))
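To see the renaming and to double-check that the new labels land on the right values, here is a small base-R sketch. The data frame a below is a hypothetical stand-in for the one in the question, with the 0/1 outcome in the first column.

```r
# Hypothetical stand-in for the data frame `a` from the question
a <- data.frame(y = c(0, 1, 1, 0, 1),
                x = c(5.1, 6.3, 5.8, 4.9, 6.0))

# What the invalid level names "0" and "1" get turned into
renamed <- make.names(c("0", "1"))
print(renamed)

# Keep the original coding around so the relabelling can be verified
y_before <- a[, 1]
a[, 1] <- factor(a[, 1], labels = c("no", "yes"))

# Labels are assigned in the order of the sorted original values,
# so 0 should map to "no" and 1 to "yes"
print(table(original = y_before, relabelled = a[, 1]))
```

The cross-table is a cheap way to confirm the mapping before training, since labels are matched to levels by position rather than by name.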

