Concentration of discrepancy-based approximate Bayesian computation via Rademacher complexity

There has been increasing interest in summary-free versions of approximate Bayesian computation (ABC), which replace distances among summaries with discrepancies between the empirical distribution of the observed data and that of the synthetic samples generated under the proposed parameter values. The success of these solutions has motivated theoretical studies on the limiting properties of the induced posteriors. However, current results (i) are often tailored to a specific discrepancy, (ii) require, either explicitly or implicitly, regularity conditions on the data generating process and the assumed statistical model, and (iii) yield bounds depending on sequences of control functions that are not made explicit. As such, there is a lack of a theoretical framework that (i) is unified, (ii) facilitates the derivation of limiting properties that hold uniformly, and (iii) relies on verifiable assumptions that provide concentration bounds clarifying which factors govern the limiting behavior of the ABC posterior. We address this gap via a novel theoretical framework that introduces the concept of Rademacher complexity in the analysis of the limiting properties of discrepancy-based ABC posteriors. This yields a unified theory that relies on constructive arguments and provides more informative asymptotic results and uniform concentration bounds, even in settings not covered by current studies. These advancements are obtained by relating the properties of summary-free ABC posteriors to the behavior of the Rademacher complexity associated with the chosen discrepancy within the family of integral probability semimetrics. This family extends summary-based ABC, and includes the Wasserstein distance and maximum mean discrepancy (MMD), among others. As clarified through a focus on the MMD case and via illustrative simulations, this perspective yields an improved understanding of summary-free ABC.
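To make the mechanism concrete, the following is a minimal sketch of discrepancy-based rejection ABC using MMD, not the procedure studied in the paper. Everything specific here is an assumption chosen for illustration: the Gaussian kernel and its bandwidth, the N(0, 5^2) prior, the N(theta, 1) model, the threshold, and the function names `mmd_squared` and `rejection_abc_mmd`.

```python
import numpy as np


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD with a Gaussian kernel."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)

    def gram(a, b):
        # Pairwise squared Euclidean distances, then the Gaussian kernel.
        sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
        return np.exp(-sq / (2.0 * bandwidth ** 2))

    return gram(x, x).mean() + gram(y, y).mean() - 2.0 * gram(x, y).mean()


def rejection_abc_mmd(observed, n_proposals, threshold, rng):
    """Keep prior draws whose synthetic sample lies within `threshold` in MMD."""
    accepted = []
    for _ in range(n_proposals):
        # Hypothetical prior: theta ~ N(0, 5^2).
        theta = rng.normal(0.0, 5.0)
        # Hypothetical model: synthetic data ~ N(theta, 1), same size as observed.
        synthetic = rng.normal(theta, 1.0, size=len(observed))
        # Compare empirical distributions directly; no summary statistics involved.
        discrepancy = np.sqrt(max(mmd_squared(observed, synthetic), 0.0))
        if discrepancy <= threshold:
            accepted.append(theta)
    return np.array(accepted)


rng = np.random.default_rng(0)
observed = rng.normal(2.0, 1.0, size=100)
posterior_draws = rejection_abc_mmd(observed, n_proposals=2000, threshold=0.15, rng=rng)
```

The accepted values in `posterior_draws` form a sample from the ABC posterior induced by this particular discrepancy and threshold. Swapping `mmd_squared` for another integral probability semimetric, such as a Wasserstein distance, changes the discrepancy while leaving the accept/reject logic unchanged.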
