different results using train(), predict() and resamples()
I'm using the caret package to analyse various models and I'm assessing the results using train(), predict() and resamples().
Why are these results in the following example different?
I'm interested in sensitivity (true positives). Why is J48_fit assessed as having a sensitivity of .71, then .81, then .71 again?
The same happens when I run other models - the sensitivity changes depending on the assessment.
NB: I have included two models here to illustrate the resamples() function, which requires at least two models as input, but my main question is about why the results differ depending on which method one uses.
In other words, what is the difference between the outcome of train() (C5.0_fit/J48_fit), predict() and resamples()? What is going on 'behind the scenes' and which result should I trust?
EXAMPLE:
library(caret)
library(C50)  # provides the churn data (churnTrain / churnTest)
data(churn)
Seed <- 10
# Set train options
set.seed(Seed)
Train_options <- trainControl(method = "cv", number = 10,
                              classProbs = TRUE,
                              summaryFunction = twoClassSummary)
# C5.0 model:
set.seed(Seed)
C5.0_fit <- train(churn~., data=churnTrain, method="C5.0", metric="ROC",
                  trControl=Train_options)
# J48 model (the "J48" method additionally requires the RWeka package):
set.seed(Seed)
J48_fit <- train(churn~., data=churnTrain, method="J48", metric="ROC",
                 trControl=Train_options)
# Get results by printing the outcome
print(J48_fit)
# ROC Sens Spec
# Best (sensitivity): 0.87 0.71 0.98
# Get results using predict()
set.seed(Seed)
J48_fit_predict <- predict(J48_fit, churnTrain)
confusionMatrix(J48_fit_predict, churnTrain$churn)
# Reference
# Prediction yes no
# yes 389 14
# no 94 2836
# Sens : 0.81
# Spec : 0.99
# Get results by comparing algorithms with resamples()
set.seed(Seed)
results <- resamples(list(C5.0_fit=C5.0_fit, J48_fit=J48_fit))
summary(results)
# ROC mean
# C5.0_fit 0.92
# J48_fit 0.87
# Sens mean
# C5.0_fit 0.76
# J48_fit 0.71
# Spec mean
# C5.0_fit 0.99
# J48_fit 0.98
By the way, here is a function for getting all three results together:
Get_results <- function(...){
  Args <- list(...)
  Model_names <- as.list(sapply(substitute({...})[-1], deparse))
  message("Model names:")
  print(Model_names)
  # Function for getting the row with the maximum sensitivity
  Max_sens <- function(df, colname = "results"){
    df <- df[[colname]]
    new_df <- df[which.max(df$Sens), ]
    x <- sapply(new_df, is.numeric)
    new_df[, x] <- round(new_df[, x], 2)
    new_df
  }
  # Find max Sens for each model
  message("Max sensitivity from model printout:")
  Max_sens_out <- lapply(Args, Max_sens)
  names(Max_sens_out) <- Model_names
  print(Max_sens_out)
  # Find predict() result for each model
  message("Results using predict():")
  set.seed(Seed)
  Predict_out <- lapply(Args, function(x) predict(x, churnTrain))
  Predict_results <- lapply(Predict_out, function(x) confusionMatrix(x, churnTrain$churn))
  names(Predict_results) <- Model_names
  print(Predict_results)
  # Find resamples() results for each model
  message("Results using resamples():")
  set.seed(Seed)
  results <- resamples(list(...), modelNames = Model_names)
  # names(results) <- Model_names
  summary(results)
}
# Test
Get_results(C5.0_fit, J48_fit)
Many thanks!
ANSWER:
The "best" sensitivity that you printed is the average of model performance across the 10 folds from your cross-validation. You can see the performance for each fold with J48_fit$resample. To confirm, take the mean of the first column, ROC, with mean(J48_fit$resample[, 1]) and you'll get 0.865799, which matches the printed ROC of 0.87.
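A minimal sketch (assuming J48_fit was trained with the 10-fold trainControl settings above, so that $resample holds one ROC/Sens/Spec row per held-out fold):

# Per-fold hold-out performance, one row per CV fold
J48_fit$resample

# Averaging the fold-level columns reproduces the values that print(J48_fit)
# and summary(resamples(...)) report
colMeans(J48_fit$resample[, c("ROC", "Sens", "Spec")])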
When you use predict() on the full dataset you end up with a different result because the data being scored is not what was used in the resamples: you're getting the final model's performance on the whole training set (data it was fitted on), instead of on the 10% held out in each fold, so that estimate is more optimistic.
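A short sketch of the comparison (assuming the J48_fit object from the example above; both values come from standard caret slots):

# Cross-validated sensitivity: mean of the 10 held-out-fold estimates
# (~0.71, matching print(J48_fit) and summary(resamples(...)))
mean(J48_fit$resample$Sens)

# Resubstitution sensitivity: the final model scored on the data it was
# trained on (~0.81, matching the confusionMatrix() output above)
cm <- confusionMatrix(predict(J48_fit, churnTrain), churnTrain$churn)
cm$byClass["Sensitivity"]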