Variational inference (VI) is a core engine of modern AI, enabling scalable approximate Bayesian learning and uncertainty-aware training of large probabilistic and generative models. In this paper, we propose Structured Nonparametric Variational Inference (SN-VI), a framework that uses multivariate splines to model complex dependencies among latent variables in the posterior approximation. Unlike traditional methods that rely on the mean-field assumption, SN-VI preserves dependencies among latent variables, yielding a flexible and accurate approximation of posteriors with arbitrary shapes. We establish theoretical guarantees, including a lower bound for the variational objective and a proof of asymptotic consistency of the posterior estimate. To facilitate practical implementation, we develop an algorithm that automatically identifies dependent latent variables and their dependence structure, without manual specification. Simulation studies validate the effectiveness of SN-VI in approximating posterior distributions with bounded support and complex dependencies. We apply the method to high-dimensional structured data, including computer vision datasets and spatial transcriptomics. In these applications, SN-VI improves generative model performance and uncovers coupled biological signals through the learned dependency structure.
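To make the cost of the mean-field assumption concrete, the following minimal sketch compares a factorized Gaussian approximation with a dependence-preserving one on a correlated bivariate Gaussian target. It is not the SN-VI algorithm: it uses Gaussians rather than splines, and the function names (`mean_field_gaussian`, `kl_gaussian`) and the correlation value are illustrative assumptions. It relies only on the standard result that, for a Gaussian target, the KL(q||p)-optimal mean-field factor has variance equal to the reciprocal of the corresponding diagonal entry of the precision matrix.

```python
import numpy as np

def mean_field_gaussian(cov):
    """KL(q||p)-optimal factorized Gaussian for a zero-mean target N(0, cov).

    Each factor's variance is 1 / precision[i, i]; off-diagonal
    dependence in the target is discarded by construction.
    """
    precision = np.linalg.inv(cov)
    return np.diag(1.0 / np.diag(precision))

def kl_gaussian(cov_q, cov_p):
    """KL( N(0, cov_q) || N(0, cov_p) ) for zero-mean Gaussians."""
    d = cov_p.shape[0]
    trace_term = np.trace(np.linalg.solve(cov_p, cov_q))
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (trace_term - d + logdet_p - logdet_q)

# Illustrative target: two latent variables with correlation rho (assumed value).
rho = 0.9
target_cov = np.array([[1.0, rho], [rho, 1.0]])

mf_cov = mean_field_gaussian(target_cov)
mf_marginal_var = mf_cov[0, 0]  # equals 1 - rho**2, below the true variance of 1

# Gap left by the factorized family versus a family that keeps the dependence.
kl_mean_field = kl_gaussian(mf_cov, target_cov)      # equals -0.5 * log(1 - rho**2)
kl_structured = kl_gaussian(target_cov, target_cov)  # zero up to rounding

print(f"mean-field marginal variance: {mf_marginal_var:.4f}")
print(f"KL, mean-field:  {kl_mean_field:.4f}")
print(f"KL, structured:  {kl_structured:.4f}")
```

In this toy setting the factorized approximation underestimates each marginal variance and leaves a KL gap that grows as the correlation approaches one, whereas a family that can represent the dependence closes the gap entirely.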