Modern data analysis increasingly requires flexible conditional inference P(X_B | X_A), where (X_A, X_B) is an arbitrary partition of the observed variable X. Existing approaches are either restricted to a fixed conditioning structure or depend strongly on the distribution of conditioning masks used during training. To address these limitations, we introduce Bayesian generative modeling (BGM), a unified framework for arbitrary conditional inference. BGM learns a generative model of X via a stochastic iterative Bayesian updating algorithm in which model parameters and latent variables are updated until convergence. Once trained, any conditional distribution can be obtained without retraining. Empirically, BGM achieves superior predictive performance while providing posterior predictive intervals, demonstrating that a single learned model can serve as a universal engine for conditional prediction with principled uncertainty quantification. We provide theoretical guarantees for the convergence of the stochastic iterative algorithm, statistical consistency, and conditional risk bounds. The proposed BGM framework leverages modern AI to capture complex relationships among variables while adhering to Bayesian principles, offering a promising approach for a wide range of applications in modern data science. Code for BGM is available at https://github.com/liuq-lab/bayesgm. Documentation for BGM is available at https://bayesgm.readthedocs.io.
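The alternating "update parameters and latent variables until convergence" scheme can be illustrated on a toy model. The sketch below is not the BGM implementation (see the repository above for that); it is a minimal Gibbs-style stochastic updating loop for a hypothetical one-parameter latent-variable model x_i = theta * z_i + eps_i with z_i ~ N(0, 1), where all variable names and modeling choices are assumptions for illustration.

```python
import numpy as np

# Illustrative sketch only, NOT the BGM algorithm: a Gibbs-style stochastic
# iterative updating loop for a toy model x_i = theta * z_i + eps_i,
# with latent z_i ~ N(0, 1) and noise eps_i ~ N(0, sigma2).
rng = np.random.default_rng(0)

theta_true, sigma2, n = 2.0, 0.25, 500
z_true = rng.normal(size=n)
x = theta_true * z_true + np.sqrt(sigma2) * rng.normal(size=n)

theta = 0.5  # initial parameter guess
for _ in range(300):
    # Latent update: the posterior of z_i given x_i and theta is Gaussian
    # with precision theta^2/sigma2 + 1; sample z from it.
    prec = theta**2 / sigma2 + 1.0
    post_mean = (theta / sigma2) * x / prec
    z = post_mean + rng.normal(size=n) / np.sqrt(prec)
    # Parameter update: posterior mean of theta under a flat prior,
    # i.e. the least-squares solution given the sampled latents.
    theta = np.sum(z * x) / np.sum(z**2)

print(theta)  # fluctuates near theta_true once the chain has mixed
```

Alternating these two conditional updates is the same structural idea as the stochastic iterative Bayesian updating described in the abstract, scaled down to a single scalar parameter so each conditional posterior is available in closed form.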