A broad class of models that routinely appear in several fields can be expressed as partially or fully discretized Gaussian linear regressions. Besides including basic Gaussian response settings, this class also encompasses probit, multinomial probit and tobit regression, among others, thereby yielding one of the most widely implemented families of models in applications. The relevance of such representations has stimulated decades of research in the Bayesian field, mostly motivated by the fact that, unlike for Gaussian linear regression, the posterior distribution induced by such models does not seem to belong to a known class under the commonly assumed Gaussian priors for the coefficients. This has motivated several solutions for posterior inference relying on sampling-based strategies or on deterministic approximations that, however, still experience computational and accuracy issues, especially in high dimensions. This article aims to review, unify and extend recent advances in Bayesian inference and computation for this class of models. To this end, we prove that the likelihoods induced by these formulations share a common analytical structure that implies conjugacy with a broad class of distributions, namely the unified skew-normals (SUN), which generalize Gaussians to skewed contexts. This result unifies and extends recent conjugacy properties for specific models within the class analyzed, and opens avenues for improved posterior inference, under a broader class of formulations and priors, via novel closed-form expressions, i.i.d. samplers from the exact SUN posteriors, and more accurate and scalable approximations from variational Bayes (VB) and expectation propagation (EP). Such advantages are illustrated in simulations and are expected to facilitate the routine use of these core Bayesian models, while providing a novel framework to study theoretical properties and develop future extensions.