16 Dec, 2014

Kaggle Titanic Competition Part X – ROC Curves and AUC

In the last post, we looked at how to generate and interpret learning curves to validate how well our model is performing. Today we'll take a look at another popular diagnostic for evaluating model performance. The Receiver Operating Characteristic (ROC) curve is a chart that illustrates how the true positive rate and false positive rate of a binary classifier vary as the discrimination threshold changes. Did that make any sense? Probably not, but hopefully it will by the time we're finished. An important thing to keep in mind is that ROC is all about [...]
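The idea in the excerpt above can be sketched in a few lines: sweep a discrimination threshold over the classifier's scores and record the true positive rate and false positive rate at each setting. This is a minimal illustrative sketch, not the post's actual code; the labels and scores below are made up for the example.

```python
import numpy as np

def roc_points(y_true, scores, thresholds):
    """Return (FPR, TPR) pairs as the discrimination threshold varies.

    A sample is predicted positive when its score >= threshold.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    points = []
    for t in thresholds:
        pred = scores >= t
        tp = np.sum(pred & (y_true == 1))   # true positives at this threshold
        fp = np.sum(pred & (y_true == 0))   # false positives at this threshold
        tpr = tp / np.sum(y_true == 1)      # true positive rate
        fpr = fp / np.sum(y_true == 0)      # false positive rate
        points.append((float(fpr), float(tpr)))
    return points

# Hypothetical labels and classifier scores, purely for illustration.
y = [0, 0, 1, 1]
s = [0.1, 0.4, 0.35, 0.8]
curve = roc_points(y, s, thresholds=[0.0, 0.5, 1.1])
```

Lowering the threshold moves you up and to the right along the curve (more positives of both kinds); raising it moves you toward the origin. In practice scikit-learn's `roc_curve` and `roc_auc_score` do this for you.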

12 Dec, 2014

Kaggle Titanic Competition Part IX – Bias, Variance, and Learning Curves

In the previous post, we took a look at how we can search for the best set of hyperparameters to provide to our model. Our measure of "best" in this case is minimizing the cross-validated error. We can be reasonably confident that we're doing about as well as we can with the features we've provided and the model we've chosen. But before we can run off and use this model on totally new data with any confidence, we would like to do a little validation to get an idea of how the model will do out in the wild. [...]
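The "minimize the cross-validated error" idea the excerpt refers to can be sketched as a plain grid search: for each candidate hyperparameter setting, estimate the error by k-fold cross-validation and keep the setting with the lowest estimate. This is a generic sketch with made-up data and a toy one-dimensional threshold classifier, not the post's actual model or parameter grid.

```python
import numpy as np

def cv_error(fit, X, y, k=5):
    """Mean misclassification error of `fit` over k cross-validation folds."""
    folds = np.array_split(np.arange(len(y)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        predict = fit(X[train], y[train])          # train on k-1 folds
        errors.append(np.mean(predict(X[test]) != y[test]))  # score held-out fold
    return float(np.mean(errors))

def grid_search(param_grid, make_model, X, y):
    """Return (best CV error, best hyperparameter) over the grid."""
    scored = [(cv_error(make_model(p), X, y), p) for p in param_grid]
    return min(scored)

# Toy setup: the "hyperparameter" is a decision threshold t for the
# rule predict(x) = x > t, on two well-separated Gaussian classes.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(0, 1, 50), rng.normal(3, 1, 50)])
y = np.concatenate([np.zeros(50, dtype=int), np.ones(50, dtype=int)])

def make_model(t):
    def fit(X_tr, y_tr):
        return lambda X_te: (X_te > t).astype(int)
    return fit

best_err, best_t = grid_search([0.0, 1.5, 3.0], make_model, X, y)
```

With classes centered at 0 and 3, the middle threshold wins. For a real model you would typically reach for scikit-learn's `GridSearchCV`, which wraps exactly this loop.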
