
...

Another common mistake is to run 1-level cross-validation with multiple models and report the correct rate of the best model as the deployed model's estimated generalization correct rate. This correct rate is optimistically biased. In 1-level cross-validation, the test set is used to select the best model, so the winning model is chosen partly because it happened to do well on that particular test set. The test set is therefore no longer independent, and it cannot be used to estimate the correct rate on an unseen dataset. Either use the 2-level cross-validation option or use another independent set to get the accuracy estimate. Below is an example to demonstrate this concept:

In this example, the idea is to partition the data into three sets: a training set, a validation set, and a test set. The models are trained on the training set, the validation set is used to select the best model, and the test set is used to generate an unbiased accuracy estimate.
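The three-way partition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure the software itself runs: the data are pure noise, so every candidate model has a true correct rate of 0.5, and the helper names (`train_stump`, `run_experiment`, `average_over_trials`) are hypothetical. Each candidate is a one-feature rule fitted on the training set. The best candidate is chosen on the validation set, and its correct rate is then measured on the untouched test set.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def train_stump(rows, labels, feature):
    """Fit a one-feature rule (hypothetical toy model): predict 1 when
    x[feature] > 0, or the inverse, whichever does better on the training rows."""
    direct = [int(x[feature] > 0) for x in rows]
    flip = accuracy(direct, labels) < 0.5
    return lambda x: int(x[feature] > 0) ^ int(flip)


def run_experiment(n_samples=300, n_features=30, seed=0):
    rng = random.Random(seed)
    # Assumption: pure-noise data, so no model can truly beat 0.5.
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [rng.randint(0, 1) for _ in range(n_samples)]

    # Partition into training, validation, and test sets (equal thirds here).
    third = n_samples // 3
    X_train, y_train = X[:third], y[:third]
    X_val, y_val = X[third:2 * third], y[third:2 * third]
    X_test, y_test = X[2 * third:], y[2 * third:]

    # Train one candidate model per feature on the training set.
    models = [train_stump(X_train, y_train, j) for j in range(n_features)]

    # Select the best model using the validation set only.
    val_acc = [accuracy([m(x) for x in X_val], y_val) for m in models]
    best = max(range(n_features), key=lambda j: val_acc[j])

    # Estimate the generalization correct rate on the independent test set.
    test_acc = accuracy([models[best](x) for x in X_test], y_test)
    return val_acc[best], test_acc


def average_over_trials(n_trials=100):
    """Average the selected model's validation and test correct rates."""
    results = [run_experiment(seed=s) for s in range(n_trials)]
    mean_val = sum(r[0] for r in results) / n_trials
    mean_test = sum(r[1] for r in results) / n_trials
    return mean_val, mean_test


if __name__ == "__main__":
    mean_val, mean_test = average_over_trials()
    print(f"selected model, validation correct rate: {mean_val:.3f}")
    print(f"selected model, test correct rate:       {mean_test:.3f}")
```

Because the labels are random, the selected model's validation correct rate should come out noticeably above 0.5 while its test correct rate stays close to 0.5. The gap between the two is the optimistic bias that arises when the set used for model selection is also used to report accuracy.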
