In this paper, we propose a novel approach for considering multiple dimensions of relevance beyond topicality in cross-encoder re-ranking. On the one hand, current multidimensional retrieval models often rely on naïve solutions at the re-ranking stage to aggregate multiple relevance scores into a single overall score. On the other hand, cross-encoder re-rankers are effective at capturing topicality but are not designed to account for other relevance dimensions in a straightforward way. To overcome these issues, we propose enhancing the candidate documents, which are retrieved by a first-stage lexical retrieval model, with "relevance statements" related to additional dimensions of relevance, and then re-ranking the enhanced documents with cross-encoders. In particular, we consider here one additional relevance dimension beyond topicality: credibility. We test the effectiveness of our solution in the context of the Consumer Health Search task, using publicly available datasets. Our results show that the proposed approach outperforms both aggregation-based and cross-encoder re-rankers with statistical significance.
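As a rough illustration of the pipeline described above, the following minimal sketch prepends a credibility statement to each first-stage candidate and re-ranks the enhanced texts with a pluggable pair scorer. It is not the paper's implementation: the statement wording, the `enhance` and `rerank` helpers, the assumption that a binary credibility label is already available for each candidate, and the `toy_scorer` stand-in for a cross-encoder are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical statement templates; the exact wording is an assumption.
CREDIBILITY_STATEMENTS = {
    True: "This document is credible.",
    False: "This document is not credible.",
}

# A scorer maps (query, passage) pairs to one relevance score per pair.
Scorer = Callable[[Sequence[Tuple[str, str]]], Sequence[float]]


def enhance(doc_text: str, is_credible: bool) -> str:
    """Prepend a credibility 'relevance statement' to the document text."""
    return f"{CREDIBILITY_STATEMENTS[is_credible]} {doc_text}"


def rerank(
    query: str,
    candidates: List[Tuple[str, str, bool]],  # (doc_id, text, is_credible)
    score_pairs: Scorer,
) -> List[Tuple[str, float]]:
    """Score each (query, enhanced document) pair and sort by descending score."""
    pairs = [(query, enhance(text, credible)) for _, text, credible in candidates]
    scores = score_pairs(pairs)
    doc_ids = [doc_id for doc_id, _, _ in candidates]
    ranked = sorted(zip(doc_ids, scores), key=lambda item: item[1], reverse=True)
    return [(doc_id, float(score)) for doc_id, score in ranked]


def toy_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Stand-in for a cross-encoder (assumption): counts query terms found in
    the passage and subtracts a fixed penalty when the passage opens with the
    negative credibility statement."""
    scores = []
    for query, passage in pairs:
        query_terms = set(query.lower().split())
        passage_terms = set(passage.lower().split())
        score = float(len(query_terms & passage_terms))
        if passage.startswith(CREDIBILITY_STATEMENTS[False]):
            score -= 2.0
        scores.append(score)
    return scores


# Candidates as they might come from a first-stage lexical retriever.
candidates = [
    ("d1", "vitamin c cures the common cold", False),
    ("d2", "vitamin c does not cure the common cold", True),
]
ranking = rerank("does vitamin c cure the common cold", candidates, toy_scorer)
print(ranking)
```

In practice, `score_pairs` would be a trained cross-encoder rather than the toy function; for instance, the `predict` method of a `CrossEncoder` model from the sentence-transformers library accepts a list of text pairs and could plausibly be passed in its place. Keeping the scorer pluggable separates the document-enhancement step from the choice of re-ranking model.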