Deep learning has achieved remarkable success across a wide range of domains, significantly expanding the frontiers of what is achievable in artificial intelligence. Yet, despite these advances, critical challenges remain, most notably ensuring robustness to small input perturbations and generalization to out-of-distribution data. These challenges underscore the need to understand the fundamental principles that govern robustness and generalization. Among the available theoretical tools, Lipschitz continuity plays a pivotal role: it quantifies the worst-case sensitivity of a network's outputs to small input perturbations, and thereby governs fundamental properties of neural networks related to robustness and generalization. While its importance is widely acknowledged, prior research has predominantly focused on empirical regularization approaches based on Lipschitz constraints, leaving the underlying principles less explored. This thesis seeks to advance a principled understanding of Lipschitz continuity in neural networks within the machine learning paradigm, examined from two complementary perspectives: an internal perspective, focusing on the temporal evolution of Lipschitz continuity in neural networks during training (i.e., training dynamics); and an external perspective, investigating how Lipschitz continuity modulates the behavior of neural networks with respect to features in the input data, particularly its role in governing frequency signal propagation (i.e., modulation of frequency signal propagation).
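The worst-case sensitivity mentioned above admits a standard formal statement; as a minimal sketch (with $\|\cdot\|$ denoting any chosen norm on the input and output spaces), Lipschitz continuity of a network $f$ can be written as:

```latex
% f : \mathbb{R}^n \to \mathbb{R}^m is L-Lipschitz if, for all inputs x, x',
%   \| f(x) - f(x') \| \le L \, \| x - x' \|.
% The smallest such L is the Lipschitz constant of f:
\[
  \mathrm{Lip}(f) \;=\; \sup_{x \neq x'} \frac{\| f(x) - f(x') \|}{\| x - x' \|},
\]
% so any input perturbation \delta changes the output by at most
% \mathrm{Lip}(f) \, \| \delta \|, which is the worst-case sensitivity
% referred to in the text.
```

This bound is what links Lipschitz continuity to robustness: a small $\mathrm{Lip}(f)$ directly limits how much an adversarial or random input perturbation can move the network's output.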