Accurately predicting the probabilities of user feedback, such as clicks and conversions, is critical for ad ranking and bidding. However, predicted probabilities often deviate from the true likelihoods because of data distribution shift and intrinsic model biases. Calibration addresses this issue by post-processing model predictions, and field-aware calibration adjusts the model output separately for different values of a feature field to satisfy fine-grained advertising demands. Unfortunately, the samples observed for certain field values can be too few to support confident calibration, which may amplify bias and cause online disturbance. In this paper, we propose a confidence-aware multi-field calibration method that adaptively adjusts the calibration intensity according to confidence levels derived from sample statistics. It also calibrates jointly over multiple feature fields, with awareness of their importance, to mitigate the data sparsity of any single field. Extensive offline and online experiments demonstrate the superiority of our method in boosting advertising performance and reducing prediction deviations.
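To make the idea of confidence-dependent calibration intensity concrete, the following is a minimal Python sketch. It is not the method of the paper, whose formulas are not given in this abstract. It assumes, purely for illustration, that confidence is expressed through a Wilson score interval on the observed feedback rate of each field value, that the mean prediction is moved only as far as the nearest interval bound, and that per-field correction factors are fused by an importance-weighted geometric mean. The function names, the `z` parameter, and the weights are all hypothetical.

```python
import math


def wilson_interval(clicks, impressions, z=1.96):
    """Wilson score interval for a binomial rate (assumed confidence measure)."""
    if impressions == 0:
        # No observations: the rate could be anything in [0, 1].
        return 0.0, 1.0
    p_hat = clicks / impressions
    denom = 1.0 + z * z / impressions
    center = (p_hat + z * z / (2 * impressions)) / denom
    half = z * math.sqrt(
        p_hat * (1 - p_hat) / impressions + z * z / (4 * impressions ** 2)
    ) / denom
    return max(0.0, center - half), min(1.0, center + half)


def field_factor(clicks, impressions, mean_pred, z=1.96):
    """Multiplicative correction for one field value.

    Hypothetical rule: if the mean prediction already lies inside the
    confidence interval, leave it alone (factor 1.0); otherwise move it
    only to the nearest bound. Sparse data gives a wide interval and
    therefore a weak or absent correction.
    """
    if mean_pred <= 0.0:
        raise ValueError("mean_pred must be positive")
    lo, hi = wilson_interval(clicks, impressions, z)
    target = min(max(mean_pred, lo), hi)
    return target / mean_pred


def calibrate(pred, factors, weights):
    """Fuse per-field factors with an importance-weighted geometric mean.

    The weights stand in for field importance and are assumptions here.
    """
    total = sum(weights)
    log_factor = sum(w * math.log(f) for f, w in zip(factors, weights)) / total
    return min(1.0, pred * math.exp(log_factor))


# Usage: the model predicts 0.05 on average while the observed rate is 0.10.
sparse = field_factor(clicks=1, impressions=10, mean_pred=0.05)
dense = field_factor(clicks=1000, impressions=10000, mean_pred=0.05)
fused = calibrate(0.05, [dense, sparse], [1.0, 1.0])
```

With only 10 impressions the interval is wide enough to contain the prediction, so the sparse field value leaves it untouched. With 10,000 impressions the interval is narrow and the prediction is pulled up toward the observed rate. The fused result lies between the two, which illustrates how a second field can temper or support the correction suggested by the first.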