Recent progress in Generative Artificial Intelligence (AI) relies on efficient data representations, often featuring encoder-decoder architectures. We formalize the mathematical problem of finding the optimal encoder-decoder pair and characterize its solution, which we name the "benign autoencoder" (BAE). We prove that BAE projects data onto a manifold whose dimension is the optimal compressibility dimension of the generative problem. We highlight surprising connections between BAE and several recent developments in AI, such as conditional GANs, context encoders, stable diffusion, stacked autoencoders, and the learning capabilities of generative models. As an illustration, we show how BAE can find optimal, low-dimensional latent representations that improve the performance of a discriminator under a distribution shift. By compressing "malignant" data dimensions, BAE leads to smoother and more stable gradients.