What happens when we continue stacking deeper layers on a “plain” convolutional neural network?

http://cs231n.stanford.edu/slides/2017/cs231n_2017_lecture9.pdf

The deeper model performs worse, but overfitting is not the cause: its training error is higher too, not just its test error!
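
One way to see why this is not overfitting is to compare training error as well as test error. An overfit model would fit the training set better while generalizing worse; a model that is worse on both is struggling to optimize. The following is a minimal sketch of that check, not anything from the lecture: the function name `diagnose_depth_gap` and the error values in the usage lines are hypothetical, chosen only for illustration.

```python
def diagnose_depth_gap(shallow_train_err, shallow_test_err,
                       deep_train_err, deep_test_err):
    """Give a rough label for why a deeper model does worse than a shallower one.

    All arguments are error rates in [0, 1]. This is a simplified
    heuristic for illustration, not a rigorous test.
    """
    if deep_test_err <= shallow_test_err:
        return "deeper model is not worse"
    if deep_train_err < shallow_train_err:
        # Fits the training data better but generalizes worse.
        return "overfitting"
    # Worse (or no better) even on the training data.
    return "optimization difficulty"


# Hypothetical error rates, made up for this example.
print(diagnose_depth_gap(0.10, 0.12, 0.14, 0.16))
print(diagnose_depth_gap(0.10, 0.12, 0.02, 0.20))
```

With these made-up numbers, the first call describes the situation on the slide: the deeper network is worse on the training set too, so the gap cannot be attributed to overfitting.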