Loss functions steer the optimization direction of recommendation models and are critical to model performance, but have received relatively little attention in recent recommendation research. Among various losses, we find that Softmax loss (SL) stands out for achieving not only remarkable accuracy but also better robustness and fairness. Nevertheless, the current literature lacks a comprehensive explanation for the efficacy of SL. To address this research gap, we conduct theoretical analyses of SL and uncover three insights, including: 1) Optimizing SL is equivalent to performing Distributionally Robust Optimization (DRO) on the negative data, thereby learning against perturbations of the negative distribution and yielding robustness to noisy negatives. 2) Compared with other loss functions, SL implicitly penalizes the prediction variance, resulting in a smaller gap between predicted values and thus producing fairer results. Building on these insights, we further propose a novel loss function, Bilateral SoftMax Loss (BSL), that extends the advantage of SL to both the positive and negative sides. BSL augments SL by applying to positive examples the same Log-Expectation-Exp structure that is used for negatives, making the model robust to noisy positives as well. Remarkably, BSL is simple and easy to implement, requiring just one additional line of code compared to SL. Experiments on four real-world datasets and three representative backbones demonstrate the effectiveness of our proposal. The code is available at https://github.com/junkangwu/BSL
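The Log-Expectation-Exp structure mentioned in the abstract can be illustrated with a minimal sketch. This is one plausible reading and not the authors' implementation: the function names, the temperature parameters `tau`, `tau_pos`, and `tau_neg`, and the exact scaling and sign conventions are assumptions made for illustration. The SL variant shown matches the usual sampled softmax loss only up to a positive scale factor and an additive constant.

```python
import math


def log_mean_exp(scores, tau):
    """Return tau * log(mean(exp(s / tau))), computed stably.

    This is the Log-Expectation-Exp operator, with the expectation
    approximated by an average over the sampled scores.
    """
    scaled = [s / tau for s in scores]
    m = max(scaled)  # subtract the max before exponentiating to avoid overflow
    mean_exp = sum(math.exp(x - m) for x in scaled) / len(scaled)
    return tau * (m + math.log(mean_exp))


def softmax_loss(pos_score, neg_scores, tau=0.1):
    """SL-style loss (assumed form): one positive score against the
    Log-Expectation-Exp of the sampled negative scores."""
    return -pos_score + log_mean_exp(neg_scores, tau)


def bilateral_softmax_loss(pos_scores, neg_scores, tau_pos=0.1, tau_neg=0.1):
    """BSL-style loss (assumed form): the same operator is applied to the
    positive side as well, so low-scoring positives carry less weight."""
    return -log_mean_exp(pos_scores, tau_pos) + log_mean_exp(neg_scores, tau_neg)


if __name__ == "__main__":
    positives = [0.9, 0.8, -0.5]  # hypothetical scores; the last mimics a noisy positive
    negatives = [0.2, -0.1, 0.4]  # hypothetical sampled negative scores
    print(softmax_loss(positives[0], negatives))
    print(bilateral_softmax_loss(positives, negatives))
```

In this sketch, passing a single positive score to `bilateral_softmax_loss` gives the same value as `softmax_loss`, which reflects the idea that the bilateral form only changes how positives are aggregated.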