LongEval-Retrieval: French-English Dynamic Test Collection for Continuous Web Search Evaluation

LongEval-Retrieval is a Web document retrieval benchmark that focuses on continuous retrieval evaluation. The test collection is intended for studying the temporal persistence of Information Retrieval systems and will serve as the test collection in the Longitudinal Evaluation of Model Performance Track (LongEval) at CLEF 2023. The benchmark simulates an evolving information system environment, such as the one a Web search engine operates in, where the document collection, the query distribution, and relevance all change continuously, while still following the Cranfield paradigm for offline evaluation. To this end, we introduce the concept of a dynamic test collection, which is composed of successive sub-collections, each representing the state of an information system at a given time step. In LongEval-Retrieval, each sub-collection contains a set of queries, a set of documents, and soft relevance assessments built from click models. The data comes from Qwant, a privacy-preserving Web search engine that primarily focuses on the French market. LongEval-Retrieval also provides a 'mirror' collection: it is first constructed in French, to benefit from the majority of Qwant's traffic, and then translated to English. This paper presents the creation process of LongEval-Retrieval and provides baseline runs and analysis.
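
The idea of a dynamic test collection can be illustrated with a small sketch. The code below is not the LongEval-Retrieval data format or its official evaluation protocol; the names `SubCollection`, `mean_gain_at_k`, and `persistence` are hypothetical, and the metric (average soft relevance of the top-k results) is a simple stand-in chosen for illustration. It assumes a ranker is any function that takes a query text and a document dictionary and returns a ranked list of document ids.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a ranker maps (query text, documents) to a ranked list of doc ids.
Ranker = Callable[[str, Dict[str, str]], List[str]]


@dataclass
class SubCollection:
    """State of the information system at one time step (hypothetical layout)."""
    time_step: str
    queries: Dict[str, str]             # query id -> query text
    documents: Dict[str, str]           # document id -> document text
    qrels: Dict[str, Dict[str, float]]  # query id -> {document id: soft relevance}


def mean_gain_at_k(ranker: Ranker, sub: SubCollection, k: int = 10) -> float:
    """Average soft relevance of the top-k results, averaged over all queries."""
    scores = []
    for query_id, text in sub.queries.items():
        ranked = ranker(text, sub.documents)[:k]
        judged = sub.qrels.get(query_id, {})
        # Unjudged documents are treated as non-relevant (gain 0.0).
        gain = sum(judged.get(doc_id, 0.0) for doc_id in ranked)
        scores.append(gain / k)
    return sum(scores) / len(scores) if scores else 0.0


def persistence(ranker: Ranker, collection: List[SubCollection],
                k: int = 10) -> Dict[str, float]:
    """Score change at each time step relative to the first sub-collection."""
    scores = {sub.time_step: mean_gain_at_k(ranker, sub, k) for sub in collection}
    baseline = scores[collection[0].time_step]
    return {step: score - baseline for step, score in scores.items()}
```

Running the same ranker over every sub-collection and comparing against the first time step gives one simple view of temporal persistence: a value near zero means the system's effectiveness held up as the documents, queries, and relevance shifted.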
