Handling outliers is a fundamental challenge in multivariate data analysis because outliers can distort correlation and conditional independence structures. Although robust Bayesian inference has been studied extensively in univariate settings, theoretical results ensuring posterior robustness in multivariate models are scarce. We propose a novel scale mixture of multivariate normals, called correlation-intact sandwich mixtures, in which the scale parameters are real-valued and follow an unfolded log-Pareto distribution. Our theoretical results on posterior robustness in multivariate settings show that a symmetric, super heavy-tailed distribution for the scale parameters is essential for achieving posterior robustness against element-wise contamination. Posterior inference for the proposed model is feasible using an efficient Gibbs sampling algorithm that we develop. The superiority of the proposed method is further illustrated in simulation and empirical studies using graphical models and multivariate regression in the presence of complex outlier structures.