Information retrieval (IR) evaluation measures are cornerstones for determining the suitability and task performance efficiency of retrieval systems. Their metric and scale properties make it possible to compare one system against another and to establish differences or similarities. Based on the representational theory of measurement, this paper determines these properties by exploiting the information contained in a retrieval measure itself. It establishes the intrinsic framework of a retrieval measure, which applies in the common scenario where the domain set is not explicitly specified. A method is provided to determine the metric and scale properties of any retrieval measure, requiring knowledge of only some of its attained values. The method establishes three main categories of retrieval measures according to their intrinsic properties. Some common user-oriented and system-oriented evaluation measures are classified according to the presented taxonomy.