Technology-Assisted Review (TAR) stopping rules aim to reduce the cost of manually assessing documents for relevance by minimising the number of documents that must be examined to reach a desired level of recall. This paper extends an effective stopping rule with information derived from a text classifier that can be trained without any additional annotation. Experiments on multiple data sets (CLEF e-Health, TREC Total Recall, TREC Legal and RCV1) show that the proposed approach consistently improves performance and outperforms several alternative methods.