Generative models typically rely on either simple latent priors (e.g., Variational Autoencoders, VAEs), which are efficient but limited, or highly expressive iterative samplers (e.g., Diffusion and Energy-based Models), which are costly and opaque. We introduce the Kolmogorov-Arnold Energy Model (KAEM) to bridge this trade-off and provide new opportunities for latent-space interpretability. Based on a novel adaptation of the Kolmogorov-Arnold Representation Theorem, KAEM imposes a univariate latent structure on the prior, enabling exact inference via the inverse transform method. With a low-dimensional latent space and appropriate inductive biases, importance sampling becomes a tractable, unbiased, and efficient posterior inference method. For settings where this fails, we propose a population-based strategy that decomposes the posterior into a sequence of annealed distributions, a new remedy for poor mixing in Energy-based Models. We compare KAEM against VAEs, the neural latent EBM architecture, and a denoising diffusion probabilistic model. Across SVHN, CIFAR10, and CelebA, KAEM attains the best Fréchet Inception Distance among latent-prior models, while sampling in a single forward pass and exposing an interpretable prior built from 1D densities.
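To illustrate why a univariate prior structure permits sampling via the inverse transform method, the following is a minimal Python sketch, not the KAEM implementation. It assumes a single latent component with an unnormalized log-density on a bounded interval, tabulates the CDF by trapezoidal quadrature, and inverts it by linear interpolation, so the result is exact only up to the grid resolution. The names `build_cdf`, `sample`, and `toy_log_density`, the interval [-1, 1], and the grid size are hypothetical choices made for this example.

```python
import bisect
import math
import random


def build_cdf(log_density, lo, hi, n_grid=512):
    """Tabulate the CDF of p(z) proportional to exp(log_density(z)) on [lo, hi]."""
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    log_w = [log_density(z) for z in grid]
    shift = max(log_w)  # subtract the maximum before exponentiating, for stability
    dens = [math.exp(v - shift) for v in log_w]
    cdf = [0.0]
    for i in range(1, n_grid):
        # Trapezoidal rule: accumulate the area of each grid cell.
        cdf.append(cdf[-1] + 0.5 * (dens[i - 1] + dens[i]) * step)
    total = cdf[-1]
    return grid, [c / total for c in cdf]


def sample(grid, cdf, rng):
    """Draw u ~ U(0, 1) and invert the tabulated CDF by linear interpolation."""
    u = rng.random()
    j = bisect.bisect_right(cdf, u)
    j = min(max(j, 1), len(cdf) - 1)  # keep the bracketing cell inside the grid
    c0, c1 = cdf[j - 1], cdf[j]
    t = 0.0 if c1 == c0 else (u - c0) / (c1 - c0)
    return grid[j - 1] + t * (grid[j] - grid[j - 1])


def toy_log_density(z):
    # Hypothetical bimodal log-density with modes near z = -0.5 and z = 0.5.
    return -8.0 * (z * z - 0.25) ** 2


if __name__ == "__main__":
    rng = random.Random(0)
    grid, cdf = build_cdf(toy_log_density, -1.0, 1.0)
    draws = [sample(grid, cdf, rng) for _ in range(5000)]
    print("sample mean:", sum(draws) / len(draws))
```

Each draw costs one uniform random number and one binary search, with no iterative chain to mix. That is the property a one-dimensional density offers and a general multivariate energy-based prior does not.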