As the amount and complexity of available data increase, the need for robust statistical learning becomes more pressing. To improve resilience against model misspecification, generalized posterior inference raises the likelihood term to the power of a learning rate, which controls the dispersion of the posterior distribution. This study proposes a computationally efficient strategy for selecting an appropriate learning rate. The approach builds on the generalized posterior calibration (GPC) algorithm, which selects a learning rate so that credible regions attain nominal frequentist coverage. Because GPC evaluates the coverage probability on bootstrap samples, it is computationally expensive: the posterior must be simulated again for each bootstrap sample. To address this limitation, the study proposes an algorithm that combines elements of GPC with the sequential Monte Carlo (SMC) sampler. By exploiting the similarity between the learning rate in generalized posterior inference and the inverse temperature in SMC sampling, the proposed algorithm calibrates the posterior distribution at a reduced computational cost. For demonstration, the proposed algorithm was applied to several statistical learning models and shown to be significantly faster than the original GPC.
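
The correspondence between the learning rate and the inverse temperature can be sketched in symbols. The notation below is assumed for illustration and is not taken from the paper: $\eta$ is the learning rate, $L_n(\theta)$ the likelihood of the observed data, $\pi(\theta)$ the prior, and $\phi_t$ the inverse temperature at SMC step $t$.

```latex
% Generalized posterior with learning rate \eta (assumed notation)
\pi_\eta(\theta \mid x_{1:n}) \;\propto\; L_n(\theta)^{\eta}\,\pi(\theta),
\qquad \eta > 0 .

% Tempered sequence targeted by a likelihood-tempering SMC sampler
\pi_t(\theta) \;\propto\; L_n(\theta)^{\phi_t}\,\pi(\theta),
\qquad 0 = \phi_0 < \phi_1 < \dots < \phi_T .
```

Setting $\eta = 1$ recovers the standard Bayesian posterior, while $\eta < 1$ flattens the likelihood and widens the posterior. Each intermediate SMC target $\pi_t$ has the same form as a generalized posterior with learning rate $\phi_t$, which is the similarity the abstract refers to. How the proposed algorithm exploits this in detail is not stated here.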