Usage category: Hyperparameter Tuning
A validation curve plots the training and validation scores of a model against the values of a single hyperparameter. By looking at the curve, we can identify the ranges of that hyperparameter where the model underfits, overfits or fits just right [Ref: 10 Amazing Machine Learning Visualizations You Should Know in 2023].
A validation curve can only tune one hyperparameter at a time; it cannot be used to tune multiple hyperparameters at once.
When the cross-validation accuracy (test score) begins to decrease while the training accuracy is still improving, the model starts to overfit the training data. We should select the hyperparameter value at that point. In the plot above, that value is max_depth=6.
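As a rough sketch, such a curve can be computed with scikit-learn's validation_curve. The estimator, synthetic dataset and max_depth range below are placeholders, not the exact setup from the original post:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import validation_curve
from sklearn.tree import DecisionTreeClassifier

# Placeholder dataset; substitute your own X and y.
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Range of max_depth values to evaluate.
param_range = np.arange(1, 11)

# Each returned array has shape (n_param_values, n_cv_folds).
train_scores, test_scores = validation_curve(
    DecisionTreeClassifier(random_state=42),
    X,
    y,
    param_name="max_depth",
    param_range=param_range,
    cv=5,
    scoring="accuracy",
)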
Explanation
In a validation curve, the x-axis represents the values of the hyperparameter we want to tune and the y-axis represents the evaluation score (in classification, "accuracy" is the most common choice; in regression, "r2" and "neg_mean_squared_error" are commonly used).
Both training and test scores are plotted. The test score is calculated using cross-validation to minimize the effect of randomness when splitting the data.
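Continuing the sketch above, the per-fold cross-validation scores are averaged before plotting, so each curve shows the mean score for every hyperparameter value:

import matplotlib.pyplot as plt

# Average the per-fold scores for each max_depth value.
train_mean = train_scores.mean(axis=1)
test_mean = test_scores.mean(axis=1)

plt.plot(param_range, train_mean, marker="o", label="Training accuracy")
plt.plot(param_range, test_mean, marker="o", label="Cross-validation accuracy")
plt.xlabel("max_depth")
plt.ylabel("Accuracy")
plt.title("Validation curve")
plt.legend()
plt.show()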
Additional resources
This post is part of my original post published on Medium.
Designed and written by:
2023–09–01