Despite being trained on balanced datasets, existing AI-generated image detectors often exhibit systematic bias at test time, frequently misclassifying fake images as real. We hypothesize that this behavior stems from distributional shift in fake samples and implicit priors learned during training. Specifically, models tend to overfit to superficial artifacts that do not generalize well across different generation methods, leading to a misaligned decision threshold when faced with test-time distribution shift. To address this, we propose a theoretically grounded post-hoc calibration framework based on Bayesian decision theory. In particular, we introduce a learnable scalar correction to the model's logits, optimized on a small validation set from the target distribution while keeping the backbone frozen. This parametric adjustment compensates for distributional shift in model output, realigning the decision boundary even without requiring ground-truth labels. Experiments on challenging benchmarks show that our approach significantly improves robustness without retraining, offering a lightweight and principled solution for reliable and adaptive AI-generated image detection in the open world. Code is available at https://github.com/muliyangm/AIGI-Det-Calib.
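The core idea of the calibration step can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it assumes a labeled validation set (one supervised variant of the calibration the abstract describes), and learns a single additive scalar `delta` on the frozen detector's logits by gradient descent on binary cross-entropy. The detector logits, labels, and hyperparameters are all hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def calibrate_bias(logits, labels, lr=0.1, steps=500):
    """Learn a scalar additive correction `delta` on frozen logits.

    Minimizes binary cross-entropy on a small validation set; the
    backbone is untouched -- only the decision boundary shifts.
    """
    delta = 0.0
    n = len(logits)
    for _ in range(steps):
        # dBCE/ddelta = mean over samples of (sigmoid(z + delta) - y)
        grad = sum(sigmoid(z + delta) - y for z, y in zip(logits, labels)) / n
        delta -= lr * grad
    return delta


# Hypothetical frozen-detector logits on a target-domain validation set,
# biased toward "real" (negative): several fakes (y=1) score below zero.
val_logits = [-2.0, -1.5, -0.5, 0.3, -1.0, 0.8]
val_labels = [0, 0, 1, 1, 1, 0]  # 1 = fake, 0 = real

delta = calibrate_bias(val_logits, val_labels)
corrected = [z + delta for z in val_logits]
```

Because the validation fakes here receive systematically low logits, the learned `delta` is positive, shifting the effective threshold so that more borderline fakes cross the decision boundary; at test time the frozen detector is used as-is with `delta` added to its output.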