Simulation-based inference (SBI) methods such as approximate Bayesian computation (ABC), synthetic likelihood, and neural posterior estimation (NPE) rely on simulating statistics to infer the parameters of models with intractable likelihoods. However, such methods are known to yield untrustworthy and misleading inference outcomes under model misspecification, which hinders their widespread applicability. In this work, we propose the first general approach to handling model misspecification that works across different classes of SBI methods. Leveraging the fact that the choice of statistics determines the degree of misspecification in SBI, we introduce a regularized loss function that penalizes statistics that increase the mismatch between the data and the model. Taking NPE and ABC as use cases, we demonstrate the superior performance of our method on high-dimensional time-series models that are artificially misspecified. We also apply our method to real data from the field of radio propagation, where the model is known to be misspecified. We show empirically that the method yields robust inference in misspecified scenarios while remaining accurate when the model is well-specified.
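To make the idea of a regularized loss concrete, the following is a minimal sketch and not the implementation used in this work. It assumes the mismatch between data and model is measured by a squared maximum mean discrepancy (MMD) with a Gaussian kernel between summary statistics of simulated and observed data; the abstract does not specify the discrepancy, so that choice is an assumption. The names `gaussian_kernel`, `mmd_squared`, `regularized_loss`, and the weight `lam` are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a (n, d) and b (m, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    return (
        gaussian_kernel(x, x, bandwidth).mean()
        + gaussian_kernel(y, y, bandwidth).mean()
        - 2.0 * gaussian_kernel(x, y, bandwidth).mean()
    )


def regularized_loss(inference_loss, sim_stats, obs_stats, lam=1.0):
    """Inference loss plus a penalty on the data-model mismatch in statistic space.

    inference_loss: scalar loss of the SBI method (assumed to be computed elsewhere,
                    e.g. a negative log posterior density for NPE).
    sim_stats:      (n, d) statistics of simulated data.
    obs_stats:      (m, d) statistics of observed data.
    lam:            hypothetical regularization weight.
    """
    return inference_loss + lam * mmd_squared(sim_stats, obs_stats)


# Toy usage: statistics that place the observed data far from the simulations
# incur a larger penalty than statistics under which the two agree.
rng = np.random.default_rng(0)
sim = rng.normal(size=(200, 2))
obs_matched = rng.normal(size=(50, 2))
obs_shifted = obs_matched + 3.0

loss_matched = regularized_loss(1.0, sim, obs_matched)
loss_shifted = regularized_loss(1.0, sim, obs_shifted)
```

In a full pipeline the statistics would come from a learnable summary network, and the penalty would be minimized jointly with the inference loss so that the network favors statistics under which the model can reproduce the observed data.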