Benign Autoencoders - 专知论文

Recent progress in Generative Artificial Intelligence (AI) relies on efficient data representations, often featuring encoder-decoder architectures. We formalize the mathematical problem of finding the optimal encoder-decoder pair and characterize its solution, which we name the "benign autoencoder" (BAE). We prove that BAE projects data onto a manifold whose dimension is the optimal compressibility dimension of the generative problem. We highlight surprising connections between BAE and several recent developments in AI, such as conditional GANs, context encoders, stable diffusion, stacked autoencoders, and the learning capabilities of generative models. As an illustration, we show how BAE can find optimal, low-dimensional latent representations that improve the performance of a discriminator under a distribution shift. By compressing "malignant" data dimensions, BAE leads to smoother and more stable gradients.

翻译：生成式人工智能（AI）的最新进展依赖于高效的数据表示，通常采用编码器-解码器架构。我们形式化了寻找最优编码器-解码器对的数学问题，并刻画了其解的特征，将其命名为“良性自编码器”（BAE）。我们证明BAE将数据投影到某个流形上，该流形的维数为生成问题的最优可压缩维数。我们揭示了BAE与AI中若干最新进展之间的惊人联系，例如条件生成对抗网络（GAN）、上下文编码器、稳定扩散、堆叠自编码器以及生成模型的学习能力。作为示例，我们展示了BAE如何找到最优的低维潜在表示，从而在分布偏移下提升判别器的性能。通过压缩“恶性”数据维度，BAE得以产生更平滑、更稳定的梯度。

相关内容

自编码器

关注 141

自动编码器是一种人工神经网络，用于以无监督的方式学习有效的数据编码。自动编码器的目的是通过训练网络忽略信号“噪声”来学习一组数据的表示（编码），通常用于降维。与简化方面一起，学习了重构方面，在此，自动编码器尝试从简化编码中生成尽可能接近其原始输入的表示形式，从而得到其名称。基本模型存在几种变体，其目的是迫使学习的输入表示形式具有有用的属性。自动编码器可有效地解决许多应用问题，从面部识别到获取单词的语义。

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日