Calculating Prediction Accuracy of a Tree Using Rpart's Predict Method

Try calculating the confusion matrix first:

confMat <- table(test$class,t_pred)

Now you can calculate the accuracy by dividing the sum of the matrix diagonal (the correct predictions) by the total sum of the matrix:

accuracy <- sum(diag(confMat))/sum(confMat)
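
For anyone who wants to see the whole thing end to end, here is a minimal sketch. The iris data, the 100-row training split, and the names idx, train, fit and accuracy2 are my own assumptions for illustration; only confMat, t_pred and the accuracy formula come from the answer above. Note that t_pred has to hold class labels (type = "class"), not probabilities, for the table to be a square confusion matrix.

```r
library(rpart)  # recommended package, shipped with standard R installations

set.seed(42)                          # reproducible split
idx   <- sample(nrow(iris), 100)      # assumption: 100 rows for training
train <- iris[idx, ]
test  <- iris[-idx, ]

# Fit a classification tree on the training rows
fit <- rpart(Species ~ ., data = train, method = "class")

# type = "class" returns predicted labels instead of class probabilities
t_pred <- predict(fit, newdata = test, type = "class")

# Rows are the true classes, columns the predicted classes
confMat  <- table(actual = test$Species, predicted = t_pred)
accuracy <- sum(diag(confMat)) / sum(confMat)

# Equivalent shortcut that skips the matrix
accuracy2 <- mean(t_pred == test$Species)

print(confMat)
print(accuracy)
```

The diagonal trick only works when both factors have the same levels in the same order, which is the case here because predict() reuses the levels of the response.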

Gibberish Output in RPart plot in R

By default, rpart abbreviates the levels of each factor to single letters of the alphabet, lower case first and then upper case. That gives at most 52 distinct labels (26 + 26), so a factor with more than 52 levels cannot be labelled unambiguously in the printout, and it is not recommended to use such factors as input to an rpart model.

You can avoid the unwanted "gibberish" output, though. If you draw the tree with plot() and text(), pass the "pretty" argument to text() to switch off the single-letter abbreviation. Example:

plot(tree)
text(tree, pretty=1)

Other output functions have their own argument for this. labels(), for instance, has the "minlength" argument, where 0 means no abbreviation:

labels(tree)
labels(tree,minlength=0)
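
To make the difference visible, here is a small sketch. The model (Petal.Length predicted from the Species factor in iris) and the names fit, short_labels and full_labels are assumptions chosen only so that the tree is guaranteed to split on a factor; substitute your own tree.

```r
library(rpart)

# A tree whose only predictor is a factor, so every split is on factor levels
fit <- rpart(Petal.Length ~ Species, data = iris)

# Default (minlength = 1): levels are shown as single letters
short_labels <- labels(fit)

# minlength = 0: levels are shown with their full names
full_labels <- labels(fit, minlength = 0)

print(short_labels)
print(full_labels)

# The same idea when drawing the tree (skipped in non-interactive runs)
if (interactive()) {
  plot(fit)
  text(fit, pretty = 0)
}
```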

I hope that helps.

Strange Behavior for the predict() function

Unfortunately, even though rpart only used the wt variable in its splits, prediction still requires the other variables from the model formula to be present. Use a data set with the same columns:

> predict(model, mtcars[1,])
[1] 0.8571429
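
Here is a rough sketch of what goes wrong and how to get around it. The formula mpg ~ wt + hp is an assumption (the original model is not shown in the question), as are the names partial, res_partial, p_full and p_na. Filling the missing column with NA is one possible workaround, not the only one.

```r
library(rpart)

# Assumed model: two predictors in the formula
model <- rpart(mpg ~ wt + hp, data = mtcars)

# New data that carries only wt: predict() cannot build the model frame
partial <- data.frame(wt = 2.62)
res_partial <- tryCatch(predict(model, partial), error = function(e) e)
if (inherits(res_partial, "error")) {
  print(conditionMessage(res_partial))  # complains about the missing variable
}

# New data with the same columns as the training data works
p_full <- predict(model, mtcars[1, ])
print(p_full)

# Possible workaround: add the missing predictor as NA so the column exists
partial$hp <- NA_real_
p_na <- predict(model, partial)
print(p_na)
```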

Max
