Overfitting must be avoided
Overfitting is one of the most common issues you will run into when training a deep learning model. Here are some strategies to overcome it (a non-exhaustive list).
In short, you need to improve your network's ability to generalize. Here are some ways to do so:
- Get more training data (expensive? not always possible/useful?)
- Reduce the capacity (degrees of freedom, or complexity) of the network: use a smaller architecture for a smaller problem/dataset. Alternatively, use architectures that are known to generalize well, such as the inverted convolutional pyramid commonly found in image classification.
- Add weight regularization (or other kinds of regularization).
- Add dropout. It acts as a natural regularizer. (The first sketch after this list combines a smaller architecture, L2 weight regularization, and dropout.)
- Data augmentation, to artificially increase the amount of data you have while still keeping it meaningful (see the augmentation sketch below).
- Batch Normalization. As the original paper notes, batch normalization reduces the network's dependence on its weight initialization and also adds a regularization effect (see the batch-norm sketch below).
- Hyperparameter tuning: modify the learning rate, batch size, etc. to find the configuration that works best (a simple sweep is sketched below). As an additional source, I suggest reading “A Disciplined Approach to Neural Network Hyper-Parameters” by Leslie N. Smith.
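To make the first few points concrete, here is a minimal sketch using Keras (one possible framework choice; the layer sizes, the 1e-4 L2 factor, and the 0.5 dropout rate are illustrative assumptions, not tuned values). It combines a deliberately small convolutional architecture with L2 weight regularization and dropout:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

# A deliberately small network: two conv blocks feeding a single dense head.
# kernel_regularizer adds an L2 penalty on the weights to the loss;
# Dropout randomly zeroes activations during training.
model = keras.Sequential([
    layers.Conv2D(32, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4),
                  input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu",
                  kernel_regularizer=regularizers.l2(1e-4)),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dropout(0.5),  # drop 50% of activations at training time only
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```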
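For data augmentation, a sketch using Keras' classic `ImageDataGenerator` (one of several options; newer Keras versions also offer preprocessing layers such as `RandomFlip`). The transform ranges below are illustrative guesses, and `x_train`/`y_train` are assumed to be your existing training arrays:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random flips, shifts and rotations produce new, still-plausible training
# images on the fly instead of collecting more data.
datagen = ImageDataGenerator(
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)

# datagen.flow yields augmented batches that can be passed straight to fit:
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=20)
```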
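For batch normalization, a minimal sketch of one common placement, conv → batch norm → activation (whether the normalization goes before or after the activation is a matter of experimentation; the sizes here are again arbitrary):

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, 3, use_bias=False, input_shape=(32, 32, 3)),
    layers.BatchNormalization(),  # normalize each batch, then rescale with learned parameters
    layers.Activation("relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),
])
```

The `use_bias=False` on the conv layer is optional: batch normalization already applies its own learned shift, so the conv bias is redundant.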
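And for hyperparameter tuning, a naive grid sweep over learning rate and batch size, scored on a validation split. This is only a sketch: `build_model` is a hypothetical helper, the candidate values and the 5-epoch budget are arbitrary, and CIFAR-10 stands in for your own data:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(learning_rate):
    """Hypothetical helper: rebuild and compile a fresh model for each trial."""
    model = keras.Sequential([
        layers.Flatten(input_shape=(32, 32, 3)),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=learning_rate),
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# CIFAR-10 used here only to make the sketch runnable end to end.
(x_train, y_train), _ = keras.datasets.cifar10.load_data()
x_train = x_train.astype("float32") / 255.0

best = None
for lr in [1e-2, 1e-3, 1e-4]:        # candidate learning rates (illustrative)
    for batch_size in [32, 64, 128]:  # candidate batch sizes (illustrative)
        model = build_model(lr)
        history = model.fit(x_train, y_train,
                            batch_size=batch_size,
                            epochs=5,
                            validation_split=0.2,
                            verbose=0)
        val_acc = history.history["val_accuracy"][-1]
        if best is None or val_acc > best[0]:
            best = (val_acc, lr, batch_size)

print("best (val_acc, lr, batch_size):", best)
```

In practice you would reach for a proper search tool (random search, Keras Tuner, Optuna, or the learning-rate range test from Smith's paper) rather than an exhaustive grid, but the idea is the same: compare configurations on held-out data, not on the training set.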