Recent progress in Generative Artificial Intelligence (AI) relies on efficient data representations, often featuring encoder-decoder architectures. We formalize the mathematical problem of finding the optimal encoder-decoder pair and characterize its solution, which we name the "benign autoencoder" (BAE). We prove that BAE projects data onto a manifold whose dimension is the optimal compressibility dimension of the generative problem. We highlight surprising connections between BAE and several recent developments in AI, such as conditional GANs, context encoders, stable diffusion, stacked autoencoders, and the learning capabilities of generative models. As an illustration, we show how BAE can find optimal, low-dimensional latent representations that improve the performance of a discriminator under a distribution shift. By compressing "malignant" data dimensions, BAE leads to smoother and more stable gradients.
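As a rough illustration of projecting data onto a low-dimensional set with an encoder-decoder pair, the sketch below uses the linear special case, where the reconstruction that minimizes squared error at latent dimension k is the PCA projection. This is not the BAE construction itself: the function names, the synthetic data, and the choice k = 2 are assumptions made purely for illustration.

```python
import numpy as np


def fit_linear_autoencoder(X, k):
    """Fit a rank-k linear encoder-decoder pair by PCA (illustrative only)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    W = Vt[:k].T  # shape (d, k); columns are orthonormal
    return mean, W


def encode(X, mean, W):
    """Map data of shape (n, d) to latent codes of shape (n, k)."""
    return (X - mean) @ W


def decode(Z, mean, W):
    """Map latent codes of shape (n, k) back to data space (n, d)."""
    return Z @ W.T + mean


def make_data(n=500, d=10, k=2, noise=0.01, seed=0):
    """Hypothetical data: a k-dimensional signal embedded in d dimensions plus small noise."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n, k))
    basis = rng.normal(size=(k, d))
    return latent @ basis + noise * rng.normal(size=(n, d))


X = make_data()
mean, W = fit_linear_autoencoder(X, k=2)
Z = encode(X, mean, W)
X_hat = decode(Z, mean, W)

# Relative reconstruction error; small when k matches the signal dimension.
rel_err = np.linalg.norm(X - X_hat) / np.linalg.norm(X - mean)
print(f"latent shape: {Z.shape}, relative error: {rel_err:.4f}")
```

In this toy setting the eight discarded directions carry only noise, so compressing them away loses almost nothing. The abstract's claim concerns a more general, nonlinear notion of optimal compressibility that this linear sketch does not capture.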