Modern data analysis increasingly requires flexible conditional inference P(X_B | X_A), where (X_A, X_B) is an arbitrary partition of the observed variables X. Existing conditional inference methods lack this flexibility: they are tied to a fixed conditioning structure and cannot perform new conditional inferences once trained. To address this, we propose a Bayesian generative modeling (BGM) approach that supports arbitrary conditional inference without retraining. BGM learns a generative model of X through an iterative Bayesian updating algorithm in which model parameters and latent variables are updated until convergence. Once trained, the single learned model yields any conditional distribution without retraining. Empirically, BGM achieves superior predictive performance with well-calibrated predictive intervals, demonstrating that one learned model can serve as a universal engine for conditional prediction with uncertainty quantification. We provide theoretical guarantees for the convergence of the stochastic iterative algorithm, statistical consistency, and conditional-risk bounds. The proposed BGM framework leverages the expressive power of modern AI models to capture complex relationships among variables while adhering to Bayesian principles, making it a promising framework for a wide range of applications in modern data science. The code for BGM is freely available at https://github.com/liuq-lab/bayesgm.
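To make the train-once, condition-anywhere idea concrete, here is a minimal, self-contained sketch in plain numpy. It is not the bayesgm implementation: as a stand-in for BGM's generative model, it fits a linear-Gaussian latent-variable model (factor analysis) with an EM-style loop that alternates latent-variable and parameter updates until convergence, then reads off P(X_B | X_A) for any partition from the implied joint Gaussian. The helper names fit_factor_model and conditional are hypothetical, introduced only for illustration.

```python
# Illustrative sketch of "train once, condition on any partition".
# NOT the authors' BGM algorithm: a factor-analysis stand-in fit by EM,
# whose implied joint Gaussian admits any conditional P(X_B | X_A).
import numpy as np

def fit_factor_model(X, k=2, n_iter=200):
    """EM for x ~ N(mu + W z, Psi), z ~ N(0, I_k), Psi diagonal."""
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(d, k))
    psi = Xc.var(axis=0)          # diagonal noise variances
    S = Xc.T @ Xc / n             # sample covariance
    for _ in range(n_iter):
        # E-step: posterior of latents given data (latent update)
        G = np.linalg.inv(np.eye(k) + W.T @ (W / psi[:, None]))
        B = G @ W.T / psi[None, :]        # E[z|x] = B (x - mu)
        Ezz = G + B @ S @ B.T             # E[z z^T] averaged over data
        # M-step: update model parameters given latent posteriors
        W = S @ B.T @ np.linalg.inv(Ezz)
        psi = np.diag(S - W @ B @ S)
    return mu, W, psi

def conditional(mu, W, psi, idx_a, idx_b, x_a):
    """P(X_B | X_A = x_a) under the implied joint N(mu, W W^T + Psi)."""
    Sigma = W @ W.T + np.diag(psi)
    Saa = Sigma[np.ix_(idx_a, idx_a)]
    Sba = Sigma[np.ix_(idx_b, idx_a)]
    Sbb = Sigma[np.ix_(idx_b, idx_b)]
    K = Sba @ np.linalg.inv(Saa)
    mean_b = mu[idx_b] + K @ (x_a - mu[idx_a])
    cov_b = Sbb - K @ Sba.T
    return mean_b, cov_b

# Train once on 5-dimensional data, then query two different partitions
# without retraining -- the property the abstract highlights.
rng = np.random.default_rng(1)
z = rng.normal(size=(1000, 2))
X = z @ rng.normal(size=(2, 5)) + 0.1 * rng.normal(size=(1000, 5))
mu, W, psi = fit_factor_model(X)
m1, _ = conditional(mu, W, psi, idx_a=[0, 1], idx_b=[2, 3, 4], x_a=X[0, :2])
m2, _ = conditional(mu, W, psi, idx_a=[4], idx_b=[0, 1], x_a=X[0, 4:])
print(m1, m2)
```

The key property mirrors the abstract's claim: the model is fit once, and both conditional queries at the end reuse the same fitted parameters with different (idx_a, idx_b) partitions.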