Multivariate Gaussian distributions have Gaussian conditional distributions, which makes conditioning easy: it boils down to implementing analytical formulae for conditional means and covariances. For more general distributions, however, conditional distributions may not be available in analytical form and may require demanding, approximate numerical approaches. Primarily motivated by probabilistic imputation problems, we review and discuss families of multivariate distributions that do enjoy analytical conditioning, and provide a few counter-examples. By proving that trans-dimensional stability under conditioning extends to mixtures and transformations, we demonstrate that a broader class of multivariate distributions inherits easy conditioning properties. Building on this insight, we develop a generative method to estimate conditional distributions from data by first fitting a flexible joint distribution using copulas and then performing analytical conditioning in a latent space. In our applications, we specifically opt for Gaussian Mixture Copula Models (GMCM) and compare various fitting strategies. Through simulations and real-world data experiments, we showcase the efficacy of our method in conditional density estimation and data imputation tasks. We also touch upon links to Gaussian process modelling and how stability under mixtures and transformations carries over to easy conditioning of non-Gaussian processes.
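
For reference, the analytical formulae for the Gaussian case mentioned above can be written out as follows. This is the standard textbook result, stated as a sketch in notation that is assumed here rather than taken from the abstract: the vector is partitioned as $X = (X_1, X_2)$, with mean blocks $\mu_1, \mu_2$ and covariance blocks $\Sigma_{11}, \Sigma_{12}, \Sigma_{21}, \Sigma_{22}$, and $\Sigma_{22}$ is assumed invertible.

```latex
% Joint distribution of the partitioned Gaussian vector (notation assumed)
\begin{pmatrix} X_1 \\ X_2 \end{pmatrix}
\sim \mathcal{N}\!\left(
  \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},
  \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}
\right)

% Conditional distribution of X_1 given an observed value X_2 = x_2
X_1 \mid X_2 = x_2 \;\sim\; \mathcal{N}\!\left(\mu_{1|2},\, \Sigma_{1|2}\right),
\qquad
\mu_{1|2} = \mu_1 + \Sigma_{12}\,\Sigma_{22}^{-1}\,(x_2 - \mu_2),
\qquad
\Sigma_{1|2} = \Sigma_{11} - \Sigma_{12}\,\Sigma_{22}^{-1}\,\Sigma_{21}
```

Note that the conditional covariance $\Sigma_{1|2}$ does not depend on the observed value $x_2$; only the conditional mean does.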