While entity-oriented neural IR models have advanced significantly, they often overlook a key nuance: the varying degrees of influence individual entities within a document have on its overall relevance. Addressing this gap, we present DREQ, an entity-oriented dense document re-ranking model. Uniquely, we emphasize the query-relevant entities within a document's representation while simultaneously attenuating the less relevant ones, thus obtaining a query-specific entity-centric document representation. We then combine this entity-centric document representation with the text-centric representation of the document to obtain a "hybrid" representation of the document. We learn a relevance score for the document using this hybrid representation. Using four large-scale benchmarks, we show that DREQ outperforms state-of-the-art neural and non-neural re-ranking methods, highlighting the effectiveness of our entity-oriented representation approach.
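The pipeline described above (weight entities by query relevance, pool them into an entity-centric document vector, combine it with the text vector, score the result) can be illustrated with a minimal sketch. The abstract does not say how entity weights are computed, how the two representations are combined, or what form the scoring function takes. The sketch therefore assumes softmax-normalized dot products for the weights, simple concatenation for the hybrid representation, and a linear scorer. All function names, vectors, and parameters are hypothetical and are not taken from DREQ itself.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entity_centric_representation(query_vec, entity_vecs):
    """Pool entity embeddings, weighting each by its similarity to the query.

    Assumption: weights are a softmax over query-entity dot products, so
    query-relevant entities are emphasized and the rest are attenuated.
    """
    weights = softmax([dot(query_vec, e) for e in entity_vecs])
    dim = len(query_vec)
    pooled = [
        sum(w * e[i] for w, e in zip(weights, entity_vecs)) for i in range(dim)
    ]
    return pooled, weights


def hybrid_representation(text_vec, entity_doc_vec):
    """Assumption: the hybrid representation is a plain concatenation."""
    return list(text_vec) + list(entity_doc_vec)


def relevance_score(hybrid_vec, w, b=0.0):
    """Assumption: a linear scorer; in practice w and b would be learned."""
    return dot(hybrid_vec, w) + b


# Toy inputs (hypothetical 2-dimensional embeddings).
query_vec = [1.0, 0.0]
entity_vecs = [[2.0, 0.0], [0.0, 2.0], [-1.0, 0.0]]
text_vec = [0.5, 0.5]

entity_doc_vec, weights = entity_centric_representation(query_vec, entity_vecs)
hybrid = hybrid_representation(text_vec, entity_doc_vec)
score = relevance_score(hybrid, [0.25, 0.25, 0.25, 0.25])
```

In this toy setup the first entity points in the same direction as the query, so it receives the largest weight and dominates the pooled entity vector. The same document would yield a different entity-centric vector for a different query, which is what makes the representation query-specific.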