Embedding-based retrieval aims to learn a shared semantic representation space for queries and items, enabling efficient and effective item retrieval through approximate nearest neighbor (ANN) algorithms. In current industrial practice, retrieval systems typically retrieve a fixed number of items for each query. However, this fixed-size retrieval often results in insufficient recall for head queries and low precision for tail queries. This limitation largely stems from the dominance of frequentist approaches in loss function design, which do not account for how the distribution of relevant items varies across queries. In this paper, we propose a novel \textbf{p}robabilistic \textbf{E}mbedding-\textbf{B}ased \textbf{R}etrieval (\textbf{pEBR}) framework. Our method models the item distribution conditioned on each query, enabling a dynamic cosine similarity threshold derived from the cumulative distribution function (CDF) of the probabilistic model. Experimental results demonstrate that pEBR significantly improves both retrieval precision and recall. Furthermore, ablation studies reveal that the probabilistic formulation effectively captures the inherent differences between head and tail queries.
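To illustrate the idea of a CDF-derived dynamic threshold, the following is a minimal sketch, not the paper's actual method: it assumes the per-query distribution of relevant-item cosine similarities can be approximated by a fitted normal distribution, and sets the cutoff so that a target fraction of that probability mass lies above it. The function name, the normality assumption, and the sample similarities are all illustrative.

```python
from statistics import NormalDist


def dynamic_threshold(pos_sims, mass=0.95):
    """Hypothetical sketch: fit a normal distribution to a query's
    relevant-item cosine similarities and return the cutoff below
    which only (1 - mass) of the fitted probability mass falls."""
    dist = NormalDist.from_samples(pos_sims)
    # CDF(threshold) = 1 - mass  =>  threshold = inv_cdf(1 - mass)
    return dist.inv_cdf(1.0 - mass)


# A head query with broadly spread relevant items gets a looser cutoff,
# while a tail query with tightly clustered ones gets a stricter cutoff,
# unlike a single fixed-size (or fixed-threshold) retrieval rule.
head = dynamic_threshold([0.55, 0.60, 0.70, 0.80, 0.90])
tail = dynamic_threshold([0.82, 0.84, 0.85, 0.86, 0.88])
assert tail > head
```

At serving time, such a per-query threshold would replace a fixed top-K cutoff: items whose cosine similarity to the query exceeds the threshold are returned, so the retrieved set size adapts to each query.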