Deep neural networks (DNNs) achieve remarkable performance on a wide range of tasks, yet their mathematical analysis remains fragmented: stability and generalization are typically studied in disparate frameworks and on a case-by-case basis. Architecturally, DNNs rely on the recursive application of parametrized functions, a mechanism that can be unstable and difficult to train, making stability a primary concern. Even when training succeeds, there are few rigorous results on how well such models generalize beyond the observed data, especially in the generative setting. In this work, we leverage the theory of stochastic Iterated Function Systems (IFS) and show that two important deep architectures can be viewed as, or canonically associated with, place-dependent IFS. This connection allows us to import results from random dynamical systems to (i) establish the existence and uniqueness of invariant measures under suitable contractivity assumptions, and (ii) derive a Wasserstein generalization bound for generative modeling. The bound naturally leads to a new training objective that directly controls the collage-type approximation error between the data distribution and its image under the learned transfer operator. We illustrate the theory on a controlled 2D example and empirically evaluate the proposed objective on standard image datasets (MNIST, CelebA, CIFAR-10).
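To make the collage mechanism concrete, we recall the standard collage-type estimate that motivates such an objective (a direct consequence of the Banach fixed-point argument; the paper's actual bound may differ in its constants and assumptions). If the transfer operator $T$ is a $c$-contraction in the $1$-Wasserstein metric $W_1$, with $c < 1$ and invariant measure $\mu_*$, then for any probability measure $\nu$, such as the data distribution,
\[
W_1(\mu_*, \nu) \;\le\; \frac{1}{1-c}\, W_1\big(\nu, T\nu\big),
\]
since $W_1(\mu_*, \nu) \le W_1(T\mu_*, T\nu) + W_1(T\nu, \nu) \le c\, W_1(\mu_*, \nu) + W_1(\nu, T\nu)$. Minimizing the collage error $W_1(\nu, T\nu)$ over the parameters of $T$ therefore controls how far the model's invariant measure can be from the data, which is the shape of training objective referred to above.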
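As a minimal illustration of the dynamical-systems viewpoint, the following Python sketch iterates a place-dependent stochastic IFS on $\mathbb{R}^2$ built from two affine contractions (a generic chaos-game simulation under assumed contractivity, not the paper's controlled 2D example). Because each map is a strict contraction and the selection probabilities vary continuously with the state, the chain admits a unique invariant measure, which the empirical trajectory approximates.

```python
# Illustrative sketch: a place-dependent stochastic IFS on R^2 with two
# affine contractions. The empirical distribution of the trajectory
# approximates the unique invariant measure (chaos-game simulation).
import numpy as np

rng = np.random.default_rng(0)

# Two affine contractions f_i(x) = A_i @ x + b_i with operator norm
# ||A_i|| < 1 (0.5 and sqrt(0.4^2 + 0.2^2) ~= 0.45, respectively).
maps = [
    (np.array([[0.5, 0.0], [0.0, 0.5]]), np.array([0.0, 0.0])),
    (np.array([[0.4, -0.2], [0.2, 0.4]]), np.array([1.0, 0.5])),
]

def step(x):
    # Place-dependent choice: select the second map with probability
    # p = sigmoid(x[0]), so the mixing depends continuously on the state.
    p = 1.0 / (1.0 + np.exp(-x[0]))
    A, b = maps[int(rng.random() < p)]
    return A @ x + b

x = np.zeros(2)
samples = []
for t in range(20_000):
    x = step(x)
    if t > 1_000:          # discard burn-in before collecting samples
        samples.append(x.copy())

samples = np.array(samples)
print("empirical mean of the invariant measure ~", samples.mean(axis=0))
```

Replacing the fixed affine maps with trained, parametrized layers (while preserving contractivity) yields the kind of deep architecture the abstract associates with place-dependent IFS.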