Large Language Models (LLMs) excel at understanding the semantic relationships between queries and documents, even for lengthy and complex long-tail queries. These queries are challenging for feedback-based ranking due to sparse user engagement and limited feedback, which makes LLMs' ranking ability highly valuable. However, the large size and slow inference of LLMs necessitate the development of smaller, more efficient models (sLLMs). Recently, integrating ranking label generation into distillation techniques has become crucial, but existing methods underutilize LLMs' capabilities and are cumbersome. Our research, RRADistill: Re-Ranking Ability Distillation, proposes an efficient label generation pipeline and novel sLLM training methods for both encoder and decoder models. We introduce an encoder-based method that uses a Term Control Layer to capture term-matching signals, and a decoder-based model with a ranking layer for enhanced understanding. A/B testing on a Korean-based search platform validates the effectiveness of our approach in improving re-ranking for long-tail queries.
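To make the Term Control Layer idea concrete, below is a minimal, hypothetical PyTorch sketch of how an encoder re-ranker might inject query-document exact-match signals into token representations. The module names, the gating scheme, and all hyperparameters are illustrative assumptions; the paper's actual architecture is not specified in this abstract.

```python
# Hypothetical sketch (not the authors' implementation): an encoder re-ranker
# with a "term control layer" that gates token embeddings using a per-token
# flag marking whether a document token also occurs in the query.
import torch
import torch.nn as nn

class TermControlLayer(nn.Module):
    """Scales token embeddings with a learned gate driven by an exact-match flag."""
    def __init__(self, hidden: int):
        super().__init__()
        self.gate = nn.Linear(hidden + 1, hidden)  # +1 for the match indicator

    def forward(self, tok_emb: torch.Tensor, match_flags: torch.Tensor) -> torch.Tensor:
        # tok_emb: (batch, seq_len, hidden); match_flags: (batch, seq_len, 1)
        g = torch.sigmoid(self.gate(torch.cat([tok_emb, match_flags], dim=-1)))
        return tok_emb * g  # emphasize or suppress tokens based on term matching

class EncoderReRanker(nn.Module):
    def __init__(self, vocab: int = 30000, hidden: int = 256, layers: int = 2):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.term_control = TermControlLayer(hidden)
        enc_layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=layers)
        self.score = nn.Linear(hidden, 1)  # pointwise relevance head

    def forward(self, ids: torch.Tensor, match_flags: torch.Tensor) -> torch.Tensor:
        h = self.term_control(self.emb(ids), match_flags)
        h = self.encoder(h)
        return self.score(h[:, 0]).squeeze(-1)  # score from the first token

# Toy usage: score one query-document pair.
model = EncoderReRanker()
ids = torch.randint(0, 30000, (1, 16))
flags = torch.zeros(1, 16, 1)
flags[0, 3] = 1.0  # pretend the 4th document token matches a query term
print(model(ids, flags))
```

The design choice sketched here, feeding an explicit lexical-overlap signal alongside the contextual embeddings, is one plausible way an encoder could recover the term-matching cues that pure semantic matching tends to blur.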