We introduce the Efficient Title Reranker (ETR) via Broadcasting Query Encoder, a novel title reranking technique that is 20x-40x faster than a vanilla passage reranker. One challenge in training ETR, however, is instability. Analyzing the issue, we found that some very difficult ground truths may act as noisy labels and cause accuracy to drop, while some extreme values in the model's probability output cause NaN. To address both issues, we introduce the Sigmoid Trick, a novel technique that reduces the gradient update in both cases, resulting in better retrieval efficacy. Experiments demonstrate the effectiveness of ETR and the Sigmoid Trick: we achieve four state-of-the-art positions on the KILT knowledge benchmark.
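The abstract does not give the formula for the Sigmoid Trick, so the following is only a minimal sketch of one way a sigmoid can damp gradient updates for very hard or extreme examples. It is not the authors' implementation. The function names, the choice of binary cross-entropy as the base loss, and the choice to apply the sigmoid directly to the per-example loss are all assumptions made for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bce_loss(z: float) -> float:
    """Cross-entropy -log(sigmoid(z)) for a positive label, computed
    stably from the logit z so that it never evaluates log(0)."""
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


def plain_grad(z: float) -> float:
    """d/dz of -log(sigmoid(z)), which equals sigmoid(z) - 1."""
    return sigmoid(z) - 1.0


def squashed_grad(z: float) -> float:
    """d/dz of sigmoid(bce_loss(z)) (ASSUMED form, not from the paper).

    By the chain rule the plain gradient is multiplied by
    s * (1 - s) with s = sigmoid(loss), which tends to 0 as the
    loss grows, so very hard or extreme examples contribute little.
    """
    s = sigmoid(bce_loss(z))
    return s * (1.0 - s) * plain_grad(z)


if __name__ == "__main__":
    # Logit of the ground-truth title: easy, hard, and extreme cases.
    for z in (5.0, -20.0, -800.0):
        print(z, plain_grad(z), squashed_grad(z))
```

In this sketch a ground truth with a very negative logit keeps a plain gradient of magnitude close to 1, while the squashed gradient shrinks toward 0 and stays finite even for extreme logits. Whether the actual Sigmoid Trick takes this form should be checked against the full paper.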