Beyond Relevance: Evaluate and Improve Retrievers on Perspective Awareness

The task of Information Retrieval (IR) requires a system to identify relevant documents based on users' information needs. In real-world scenarios, retrievers are expected not only to rely on the semantic relevance between documents and queries but also to recognize the nuanced intents or perspectives behind a user query. For example, when asked to verify a claim, a retrieval system is expected to identify evidence from both supporting and contradicting perspectives, so that the downstream system can make a fair judgment call. In this work, we study whether retrievers can recognize and respond to different perspectives of the queries: beyond finding relevant documents for a claim, can retrievers distinguish supporting from opposing documents? We reformulate and extend six existing tasks to create a retrieval benchmark in which diverse perspectives are described in free-form text, alongside root, neutral queries. We show that the retrievers covered in our experiments have limited awareness of subtly different perspectives in queries and can also be biased toward certain perspectives. Motivated by this observation, we further explore the potential to leverage geometric features of the retriever representation space to improve the perspective awareness of retrievers in a zero-shot manner. We demonstrate the efficiency and effectiveness of our projection-based methods on the same set of tasks. Further analysis also shows how perspective awareness improves performance on various downstream tasks, with 4.2% higher accuracy on AmbigQA and 29.9% higher correlation with designated viewpoints on essay writing, compared to non-perspective-aware baselines.
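
The abstract does not spell out how the projection-based methods work, so the following is only an illustrative sketch of one way a geometric, zero-shot adjustment could look, not the authors' actual method. It assumes that a perspective query embedding can be split into a component along the root (neutral) query and a residual, and that scoring documents on the residual emphasizes the perspective. The function names (`remove_component`, `perspective_scores`) and the toy three-dimensional vectors are hypothetical; a real retriever would supply the embeddings.

```python
import numpy as np


def unit(v):
    """Scale v to unit length (assumes v is not the zero vector)."""
    return v / np.linalg.norm(v)


def remove_component(v, direction):
    """Return v minus its projection onto `direction` (vector rejection)."""
    d = unit(direction)
    return v - np.dot(v, d) * d


def cosine(a, b):
    """Plain cosine similarity between two vectors."""
    return float(np.dot(unit(a), unit(b)))


def perspective_scores(root_q, persp_q, docs):
    """Cosine scores after projecting the root-query direction out of
    both the perspective query and each document embedding.

    Assumption: what remains after the projection mostly reflects the
    perspective rather than the shared topic.
    """
    q_res = remove_component(persp_q, root_q)
    return [cosine(q_res, remove_component(d, root_q)) for d in docs]


# Toy embeddings (made up for illustration): axis 0 is the shared topic,
# axis 1 is the stance (positive = supporting, negative = opposing).
root_query = np.array([1.0, 0.0, 0.0])
support_query = np.array([1.0, 0.2, 0.0])
docs = [
    np.array([1.0, 0.3, 0.1]),   # supporting document
    np.array([1.0, -0.3, 0.1]),  # opposing document
]

plain = [cosine(support_query, d) for d in docs]
projected = perspective_scores(root_query, support_query, docs)

print("plain cosine:    ", plain)
print("after projection:", projected)
```

In this toy setup the plain cosine scores of the supporting and opposing documents are close because the shared topic dominates both, while the projected scores separate them by sign. Whether that holds for real retriever embeddings is an empirical question that the paper's experiments address.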
