Understanding the latent spaces learned by deep learning models is crucial to exploring how they represent and generate complex data. Autoencoders (AEs) have played a key role in representation learning, with numerous regularization techniques and training principles developed not only to enhance their ability to learn compact and robust representations, but also to reveal how different architectures influence the structure and smoothness of the lower-dimensional non-linear manifold. We characterize the structure of the latent spaces learned by different autoencoders, including convolutional autoencoders (CAEs), denoising autoencoders (DAEs), and variational autoencoders (VAEs), and how they change with perturbations in the input. By characterizing the matrix manifolds corresponding to the latent spaces, we provide an explanation for the well-known observation that the latent spaces of the CAE and DAE form non-smooth manifolds, while that of the VAE forms a smooth manifold. We also map the points of the matrix manifold to a Hilbert space using distance-preserving transforms and provide an alternate view in terms of the subspaces generated in the Hilbert space as a function of the distortion in the input. The results show that the latent manifolds of the CAE and DAE are stratified, with each stratum being a smooth product manifold, while the manifold of the VAE is a smooth product manifold of two symmetric positive definite matrices and a symmetric positive semi-definite matrix.
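As an illustration of the kind of distance-preserving transform referred to above, the sketch below assumes that the covariance matrix of the latent codes is treated as a point on the manifold of symmetric positive definite (SPD) matrices and embedded into a Hilbert space via the log-Euclidean map (the matrix logarithm). This is a minimal, hypothetical example, not the paper's implementation: the function names, the covariance construction, and the toy data are illustrative assumptions.

```python
# Minimal sketch (illustrative only): embed SPD latent-covariance matrices into a
# Hilbert space with the log-Euclidean map, a standard distance-preserving transform
# for the log-Euclidean metric on SPD matrices.
import numpy as np
from scipy.linalg import logm

def latent_covariance(z, eps=1e-6):
    """Covariance of latent codes z (n_samples x latent_dim); eps keeps it strictly SPD."""
    c = np.cov(z, rowvar=False)
    return c + eps * np.eye(c.shape[0])

def log_euclidean_embedding(spd):
    """Matrix logarithm maps an SPD matrix into the flat space of symmetric matrices."""
    return logm(spd).real

def log_euclidean_distance(a, b):
    """Frobenius distance between embeddings equals the log-Euclidean distance on the manifold."""
    return np.linalg.norm(log_euclidean_embedding(a) - log_euclidean_embedding(b), "fro")

# Toy usage: compare the latent geometry of clean vs. perturbed inputs.
rng = np.random.default_rng(0)
z_clean = rng.normal(size=(500, 16))                 # stand-in for encoder outputs
z_noisy = z_clean + 0.1 * rng.normal(size=z_clean.shape)  # perturbed latents
d = log_euclidean_distance(latent_covariance(z_clean), latent_covariance(z_noisy))
print(f"log-Euclidean distance between latent covariances: {d:.4f}")
```

Under this assumed setup, tracking such distances as the input distortion grows gives one concrete way to probe how smoothly the latent manifold deforms.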