Screening documents is a tedious and time-consuming aspect of high-recall retrieval tasks, such as compiling a systematic literature review, where the goal is to identify all relevant documents for a topic. To help streamline this process, many Technology-Assisted Review (TAR) methods leverage active learning techniques to reduce the number of documents requiring review. BERT-based models have shown high effectiveness in text classification, leading to interest in their potential use in TAR workflows. In this paper, we investigate recent work that examined the impact of further pre-training epochs on the effectiveness and efficiency of a BERT-based active learning pipeline. We first report that we could replicate the original experiments on two specific TAR datasets, confirming some of the findings: importantly, that further pre-training is critical to high effectiveness, but requires attention in terms of selecting the correct training epoch. We then investigate the generalisability of the pipeline on a different TAR task, that of medical systematic reviews. In this context, we show that there is no need for further pre-training if a domain-specific BERT backbone is used within the active learning pipeline. This finding provides practical implications for using the studied active learning pipeline within domain-specific TAR tasks.