Although the retrieval effectiveness of one query is independent of that of any other, query performance prediction (QPP) systems have conventionally been evaluated by measuring rank correlation over an entire set of queries. This listwise approach has several disadvantages, notably that it does not support the common requirement of assessing QPP quality for individual queries. In this paper, we propose a pointwise QPP evaluation framework that assesses the quality of a QPP system on individual queries by measuring the deviation between each prediction and the corresponding true value, and then aggregating these deviations over a set of queries. Our experiments demonstrate that this new approach leads to smaller variances in QPP evaluations across a range of different target metrics and retrieval models.
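
The contrast between listwise and pointwise evaluation can be illustrated with a minimal sketch. It is not the paper's exact measure: it assumes the predictor's scores are already on the same [0, 1] scale as the target metric (for example, average precision), uses plain absolute error as the per-query deviation, and aggregates by the mean. The function names and the toy numbers are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from statistics import mean


def kendall_tau_a(xs, ys):
    """Listwise view: one rank-correlation score for the whole query set (tau-a, ties count as neither)."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


def pointwise_errors(predicted, actual):
    """Pointwise view: one deviation per query (assumed here to be absolute error)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return [abs(p - a) for p, a in zip(predicted, actual)]


def aggregate_errors(errors):
    """Aggregate per-query deviations over a query set (assumed here to be the mean)."""
    return mean(errors)


# Toy data (hypothetical): predicted scores and true AP values for four queries.
predicted = [0.30, 0.55, 0.10, 0.80]
actual = [0.25, 0.60, 0.20, 0.70]

errors = pointwise_errors(predicted, actual)
print("Kendall tau (listwise):", kendall_tau_a(predicted, actual))
print("Per-query errors (pointwise):", errors)
print("Mean error over the set:", aggregate_errors(errors))
```

In this toy case the predicted and true values are ranked identically, so the listwise score cannot distinguish between queries, whereas the per-query errors show which individual predictions deviate most from the true values.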