Calibration is defined as the ratio of the average predicted click rate to the true click rate. Optimizing calibration is essential to many online advertising recommendation systems because it directly affects downstream bids in ad auctions and the amounts charged to advertisers. Despite its importance, calibration optimization often suffers from a problem called "maximization bias": the maximum of the predicted values overestimates the true maximum. The bias arises because calibration is computed on the set selected by the prediction model itself. It persists even when predictions are unbiased on every data point, and it worsens under covariate shift between the training and test sets. To mitigate this problem, we quantify maximization bias theoretically and propose a variance-adjusting debiasing (VAD) meta-algorithm. The algorithm is efficient, robust, and practical: it mitigates maximization bias under covariate shift without incurring additional online serving costs or compromising ranking performance. We demonstrate its effectiveness using a state-of-the-art recommendation neural network model on a large-scale real-world dataset.
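To make the maximization bias concrete, the following is a minimal simulation sketch and not the VAD algorithm itself. It assumes a synthetic setup that does not come from the paper: true click rates drawn uniformly from a hypothetical range, predictions formed by adding zero-mean Gaussian noise (so each prediction is unbiased for its item), and calibration measured against the known true rates instead of observed clicks. The names `calibration` and `simulate` and all parameter values are illustrative assumptions.

```python
import numpy as np


def calibration(pred, true_rate):
    """Ratio of the average predicted click rate to the average true click rate."""
    return pred.mean() / true_rate.mean()


def simulate(n_items=100_000, top_k=1_000, noise_std=0.02, seed=0):
    """Compare calibration on all items with calibration on the model-selected set.

    All values here are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)

    # Assumed ground truth: click rates between 1% and 10%.
    true_ctr = rng.uniform(0.01, 0.10, size=n_items)

    # Zero-mean noise keeps every individual prediction unbiased.
    # Values are left unclipped (some may be negative) so that the
    # noise stays exactly zero-mean in this toy setting.
    pred_ctr = true_ctr + rng.normal(0.0, noise_std, size=n_items)

    # Calibration over the whole population.
    overall = calibration(pred_ctr, true_ctr)

    # Calibration over the top-k items ranked by the model's own predictions.
    selected = np.argsort(pred_ctr)[-top_k:]
    on_selected = calibration(pred_ctr[selected], true_ctr[selected])

    return overall, on_selected


overall, on_selected = simulate()
print(f"calibration on all items:      {overall:.3f}")
print(f"calibration on selected top-k: {on_selected:.3f}")
```

In this toy setting, calibration over all items stays close to 1, while calibration over the selected set exceeds 1, because items whose noise happened to be positive are more likely to be selected. Increasing `noise_std` enlarges the gap, which is consistent with the idea of a debiasing step that adjusts for prediction variance.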