Reconciling Bayesian and frequentist approaches to robustness against outliers

Heavy-tailed models are used in Bayesian analyses to gain robustness against outliers. In frequentist analyses, M-estimators are often employed. In this paper, the two approaches are tentatively reconciled by viewing M-estimators as maximum likelihood estimators of heavy-tailed models. From this perspective, a fundamental difference emerges: frequentists, unlike Bayesians, do not require these heavy-tailed models to be proper. For instance, a popular robust estimator in linear regression, Tukey's biweight M-estimator, does not correspond to a proper heavy-tailed model. Thus, a Bayesian practitioner does not have access to the same range of tools as a frequentist practitioner. It is shown through two real-data linear regression analyses that the former may consequently obtain estimation results that differ significantly from those of the latter, the difference being due to a more pronounced influence of the outliers in the former case. It is highlighted that a way to give these practitioners access to the same range of tools is for the Bayesian to adopt the generalized Bayesian framework of Bissiri et al. (2016), which allows the use of improper models (Jewson and Rossell, 2022), in combination with proper prior distributions yielding proper generalized posterior distributions. A complete reconciliation of the Bayesian and frequentist approaches to robustness is then achieved. An extensive theoretical study of the generalized Bayesian counterpart of Tukey's biweight M-estimator is provided, which includes a robustness characterization result and a Bernstein--von Mises result, the latter allowing the generalized posterior distribution to be calibrated for meaningful uncertainty quantification. After adopting the generalized Bayesian framework, the Bayesian practitioner obtains results similar to those of the frequentist practitioner in the aforementioned examples.
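To illustrate why Tukey's biweight M-estimator does not correspond to a proper model, the following minimal sketch evaluates the standard biweight loss and numerically integrates exp(-loss) over growing intervals. The function names, the tuning constant c = 4.685 (a conventional default, not taken from the abstract), and the trapezoidal integration are assumptions made for illustration only. Because the loss is constant beyond |r| > c, the candidate "density" exp(-loss) stays at a positive level in the tails, so its integral grows without bound instead of converging.

```python
import numpy as np

def tukey_rho(r, c=4.685):
    """Tukey's biweight loss: smooth near zero, constant (c**2 / 6) for |r| > c.

    c = 4.685 is a conventional tuning constant, assumed here for illustration.
    """
    r = np.asarray(r, dtype=float)
    inside = np.abs(r) <= c
    # Outside [-c, c], set u = 1 so that (1 - u**2)**3 = 0 and the loss is flat.
    u = np.where(inside, r / c, 1.0)
    return (c**2 / 6.0) * (1.0 - (1.0 - u**2) ** 3)

def unnormalized_mass(half_width, c=4.685, n=200_001):
    """Trapezoidal approximation of the integral of exp(-rho(r)) over [-half_width, half_width]."""
    r = np.linspace(-half_width, half_width, n)
    f = np.exp(-tukey_rho(r, c))
    dr = r[1] - r[0]
    return float(dr * (f.sum() - 0.5 * (f[0] + f[-1])))

if __name__ == "__main__":
    # The mass keeps increasing roughly linearly with the interval width,
    # which is what "improper" means for this candidate model.
    for half_width in (10.0, 100.0, 1000.0):
        print(half_width, unnormalized_mass(half_width))
```

In a generalized Bayesian analysis of the kind described above, such a loss would replace the negative log-likelihood, and propriety of the resulting posterior would come from the prior rather than from the model.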
