Negative sampling is essential for implicit collaborative filtering because it supplies the negative training signals needed to achieve good performance. We show experimentally that all existing negative sampling methods share a common limitation: they can only select negative samples of a fixed hardness level, which leads to the false positive problem (FPP) and the false negative problem (FNP). We then propose a new paradigm called adaptive hardness negative sampling (AHNS) and discuss its three key criteria. By adaptively selecting negative samples of appropriate hardness during training, AHNS can effectively mitigate the impact of FPP and FNP. Next, we present a concrete instantiation of AHNS called AHNS_{p<0}, and demonstrate theoretically that AHNS_{p<0} satisfies the three criteria of AHNS and achieves a larger lower bound on normalized discounted cumulative gain (NDCG). We also note that existing negative sampling methods can be regarded as more relaxed cases of AHNS. Finally, comprehensive experiments show that AHNS_{p<0} consistently and substantially outperforms several state-of-the-art competitors on multiple datasets.
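To make the contrast in the abstract concrete, the following is a minimal sketch of fixed-hardness selection versus hardness that adapts to the positive item's predicted score. It is not the AHNS_{p<0} rule, which the abstract does not specify. The function names, the candidate scores, and in particular `example_target` (a made-up mapping from positive score to desired negative score) are assumptions chosen purely for illustration.

```python
from typing import Callable, Sequence


def hardest_negative(cand_scores: Sequence[float]) -> int:
    """Fixed-hardness baseline: always pick the top-scored candidate negative."""
    return max(range(len(cand_scores)), key=lambda k: cand_scores[k])


def adaptive_negative(
    pos_score: float,
    cand_scores: Sequence[float],
    target_fn: Callable[[float], float],
) -> int:
    """Pick the candidate whose predicted score is closest to a target
    that depends on the positive item's predicted score."""
    target = target_fn(pos_score)
    return min(range(len(cand_scores)), key=lambda k: abs(cand_scores[k] - target))


def example_target(pos_score: float) -> float:
    """Hypothetical target rule (an assumption, not taken from the paper)."""
    return 0.5 * pos_score


# Predicted scores of four candidate negatives for one user (made-up values).
cand = [0.9, 0.4, 0.1, 0.6]

fixed_choice = hardest_negative(cand)
adaptive_high = adaptive_negative(0.8, cand, example_target)
adaptive_low = adaptive_negative(0.2, cand, example_target)
```

The fixed-hardness baseline returns the same candidate regardless of the positive item, whereas the adaptive variant returns different candidates as the positive score changes. Which direction the hardness should move, and by how much, is determined by the criteria the paper discusses and is deliberately left to `target_fn` here.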