Towards a statistical understanding of deep neural networks: beyond the neural tangent kernel theory

Topic: Learning theory; deep neural networks
Format: Hybrid
Location: DSDS, NUS, S16 07-107
Speaker: Lin Qian (THU)
Time (GMT+8)

Abstract

There are two major approaches to explaining the generalization ability of neural networks: the Hölder theory, which ignores the dynamical properties of neural networks, and the neural tangent kernel theory, which is developed for wide neural networks. We will briefly review recent results from these theories and some of their interesting implications, and discuss the challenges they face together with possible solutions, such as the recent 'one-step' analysis of the dynamical properties of neural networks. If time permits, we will also introduce the 'adaptive kernel theory', a potential theory for explaining the effectiveness of neural networks.