Deep Learning Generalization, Over-Parameterization, Extrapolation, and Decision Boundaries

Roozbeh Yousefzadeh, Yale University
Computing Abstraction

Deep neural networks have achieved great success, most notably in learning to classify images. Yet the phenomenon of learning images is not well understood, and the generalization of deep networks is considered a mystery. Recent studies have explained the generalization of deep networks within the framework of interpolation. In this talk, we will see that the task of classifying images requires extrapolation capability, and that interpolation by itself is not adequate to understand deep networks. We study image classification datasets in the pixel space, in the internal representations of images learned throughout the layers of trained networks, and in the low-dimensional feature space that one can derive using wavelets/shearlets. We show that in all these spaces, image classification remains an extrapolation task, to a moderate (yet considerable) degree outside the convex hull of the training set. From the mathematical perspective, a deep learning image classifier is a function that partitions its domain and assigns a class to each partition. Partitions are defined by decision boundaries, and so is the model. Therefore, the extensions of decision boundaries outside the convex hull of the training set are crucial to the model's generalization. From this perspective, over-parameterization is a necessary condition for the ability to control the extensions of decision boundaries, which offers a novel way of explaining why deep networks need to be over-parameterized. I will also present a homotopy algorithm for computing points on the decision boundaries of deep networks, and finally, I will explain how we can leverage the decision boundaries to audit and debug ML models used in social applications.
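To make the interpolation/extrapolation distinction concrete, here is a minimal sketch of measuring how far a query point lies from the convex hull of a training set. It is not the method used in the talk: it assumes a plain Frank-Wolfe iteration with exact line search, and the function name `distance_to_convex_hull`, its parameters, and the toy 2-D data are hypothetical choices made only for illustration. A distance of zero means the query can be written as a convex combination of training points (interpolation); a positive distance means the model must extrapolate.

```python
import math


def distance_to_convex_hull(points, query, max_iter=1000, tol=1e-10):
    """Euclidean distance from `query` to the convex hull of `points`.

    Illustrative sketch (an assumption, not the talk's algorithm):
    Frank-Wolfe with exact line search on 0.5 * ||p - query||^2,
    where p ranges over the convex hull of `points`.
    """
    p = list(points[0])  # start at a training point, which is inside the hull
    for _ in range(max_iter):
        residual = [pj - qj for pj, qj in zip(p, query)]
        # Linear minimization step: the training point most aligned with -residual.
        vertex = min(points, key=lambda x: sum(xj * rj for xj, rj in zip(x, residual)))
        direction = [vj - pj for vj, pj in zip(vertex, p)]
        # Frank-Wolfe duality gap; it is non-negative and zero at the optimum.
        gap = -sum(rj * dj for rj, dj in zip(residual, direction))
        if gap <= tol:
            break
        # Exact line search along `direction`, clipped so p stays inside the hull.
        step = min(1.0, gap / sum(dj * dj for dj in direction))
        p = [pj + step * dj for pj, dj in zip(p, direction)]
    return math.sqrt(sum((pj - qj) ** 2 for pj, qj in zip(p, query)))


# Toy "training set": the four corners of the unit square.
train = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
inside = distance_to_convex_hull(train, (0.5, 0.5))   # interpolation case
outside = distance_to_convex_hull(train, (2.0, 0.5))  # extrapolation case
print(inside, outside)
```

In high-dimensional spaces such as pixel space, one would normally pose this as a quadratic or linear program and hand it to a dedicated solver; the loop above is only meant to show what "outside the convex hull" means operationally.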


Bluejeans Link: