It is critical to understand in which situations a model is overfitting. A common way to detect overfitting in neural networks is to use a validation dataset. The validation set is separate from the test set because it is used only during training. During training, the model's performance on the training set, which is used to fit the model, can be compared against its performance on the validation set, which is not used to fit the model.
Overfitting occurs when the training performance is consistently above the validation performance. For example, let’s say we get the following training performance curve and validation performance curve after training a model for 20 epochs:
The overfitting begins at the black circle, which is when the training accuracy becomes consistently higher than the validation accuracy.
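This check can be automated. The sketch below uses hypothetical per-epoch accuracy values (in practice these would be logged during training) and finds the first epoch from which the training accuracy stays above the validation accuracy for the rest of training; the function name `overfit_epoch` is an assumption for illustration.

```python
# Hypothetical accuracy curves, one value per epoch.
train_acc = [0.60, 0.70, 0.76, 0.80, 0.84, 0.87, 0.90, 0.92, 0.94, 0.95]
val_acc   = [0.58, 0.69, 0.75, 0.80, 0.82, 0.81, 0.80, 0.79, 0.78, 0.77]

def overfit_epoch(train, val):
    """Return the first epoch index from which training accuracy
    is strictly above validation accuracy for every later epoch,
    or None if no such epoch exists."""
    for e in range(len(train)):
        if all(t > v for t, v in zip(train[e:], val[e:])):
            return e
    return None

print(overfit_epoch(train_acc, val_acc))  # → 4
```

Here the validation accuracy peaks around epoch 4 and then declines while the training accuracy keeps rising, so epoch 4 marks the onset of overfitting, the point a black circle would indicate on the plot.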
In general, it is a good idea to keep a held-out validation set to monitor overfitting and to decide when to stop training. A common way to split up the data is to use 80% for training, 10% for validation, and 10% for testing. This is not a hard-and-fast rule: depending on the dataset, it may or may not make more sense to use, for example, 60% for training, 20% for validation, and 20% for testing.
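An 80/10/10 split can be sketched with plain index arithmetic; the dataset size of 1,000 examples here is a made-up placeholder, and the data should be shuffled before splitting so the three sets are drawn from the same distribution.

```python
import random

# Hypothetical dataset of 1,000 example indices.
n = 1000
indices = list(range(n))
random.seed(0)
random.shuffle(indices)  # shuffle before splitting

train_end = int(0.8 * n)  # first 80% → training
val_end = int(0.9 * n)    # next 10% → validation

train_idx = indices[:train_end]
val_idx = indices[train_end:val_end]
test_idx = indices[val_end:]

print(len(train_idx), len(val_idx), len(test_idx))  # → 800 100 100
```

Because the splits are built from slices of one shuffled permutation, the three sets are guaranteed to be disjoint and to cover the whole dataset.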