Why do autoencoders work?

Deep neural network autoencoders are routinely used for model reduction. They make it possible to recognize the intrinsic dimension of data that lie in a $k$-dimensional subset $K$ of an input Euclidean space $\R^n$. The underlying idea is to obtain both an encoding layer that maps $\R^n$ into $\R^k$ (called the bottleneck layer or the space of latent variables) and a decoding layer that maps $\R^k$ back into $\R^n$, in such a way that the input data from the set $K$ are recovered when the two maps are composed. This is achieved by adjusting parameters (weights) in the network to minimize the discrepancy between the input and the reconstructed output. Since neural networks (with continuous activation functions) compute continuous maps, the existence of a network that achieves perfect reconstruction would imply that $K$ is homeomorphic to a $k$-dimensional subset of $\R^k$, so there are clearly topological obstructions to finding such a network. In practice, however, the technique is found to ``work'' well, which raises the question of whether this effectiveness can be explained. We show that, up to small errors, the method is indeed guaranteed to work. The argument appeals to certain facts from differential geometry. A computational example is included to illustrate the ideas.
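The topological obstruction can be made concrete in the simplest case, with $K$ the unit circle in $\R^2$ and $k = 1$. The sketch below is an illustration under stated assumptions, not the computational example from the paper: `example_encoder` is a hypothetical tanh network with made-up weights, and the function names are chosen here for illustration. For any continuous encoder $E$, the gap $g(\theta) = E(p(\theta)) - E(-p(\theta))$ satisfies $g(\pi) = -g(0)$, so by the intermediate value theorem some pair of antipodal points receives the same code. Any decoder then sends both points to one output, and by the triangle inequality at least one of them is reconstructed with error at least 1.

```python
import math


def example_encoder(point):
    """Hypothetical continuous encoder R^2 -> R (tiny tanh network, made-up weights)."""
    x, y = point
    return math.tanh(1.3 * x - 0.7 * y + 0.2) + 0.5 * math.tanh(-0.4 * x + 0.9 * y - 0.1)


def antipodal_gap(encoder, theta):
    """g(theta) = E(p) - E(-p) for p = (cos theta, sin theta) on the unit circle."""
    p = (math.cos(theta), math.sin(theta))
    return encoder(p) - encoder((-p[0], -p[1]))


def find_antipodal_collision(encoder, tol=1e-12):
    """Bisect on [0, pi] for a zero of g; one exists because g(pi) = -g(0)."""
    lo, hi = 0.0, math.pi
    g_lo = antipodal_gap(encoder, lo)
    if g_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = antipodal_gap(encoder, mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            # Same sign as the left end: the sign change is in the right half.
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def worst_case_error(decoded, theta):
    """Larger of the distances from one decoded point to p(theta) and to -p(theta)."""
    px, py = math.cos(theta), math.sin(theta)
    qx, qy = decoded
    return max(math.hypot(qx - px, qy - py), math.hypot(qx + px, qy + py))


if __name__ == "__main__":
    theta = find_antipodal_collision(example_encoder)
    print("colliding angle:", theta)
    print("code gap at that angle:", antipodal_gap(example_encoder, theta))
    # Whatever single point a decoder returns for the shared code,
    # one of the two antipodal inputs is at distance >= 1 from it.
    print("error if decoder returns the origin:", worst_case_error((0.0, 0.0), theta))
```

The bound holds for every continuous encoder and every decoder, so it rules out perfect reconstruction on the whole circle; it is compatible with accurate reconstruction on most of the circle, which is the sense of "up to small errors" in the abstract.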

