Natural language processing (NLP) methods have been broadly applied to clinical tasks, and machine learning and deep learning approaches have been used to improve the performance of clinical NLP. However, these approaches require sufficiently large training datasets, and trained models have been shown to transfer poorly across sites. These issues have motivated data collection and integration across institutions to build accurate and portable models. However, pooling data in this way can introduce a form of bias called confounding by provenance: when source-specific data distributions at deployment differ from those seen in training, model performance may suffer. To address this issue, we evaluate the utility of backdoor adjustment for text classification on a multi-site dataset of clinical notes annotated for mentions of substance abuse, using an evaluation framework devised to measure robustness to distributional shifts. Our results indicate that backdoor adjustment can effectively mitigate confounding shift.
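As a rough illustration of the idea, backdoor adjustment in its textbook form replaces p(y | x) with the sum over z of p(y | x, z) p(z), where z here stands for the provenance (source site) of a note. The sketch below is not the implementation evaluated in this work: the function names, the toy weights, and the site prior are all hypothetical, and the conditional model is a hand-written logistic function standing in for a trained classifier that takes the site as an extra input.

```python
import math
from typing import Callable, Sequence


def backdoor_adjusted_proba(
    cond_proba: Callable[[Sequence[float], int], float],
    x: Sequence[float],
    site_prior: Sequence[float],
) -> float:
    """Return sum_z p(y=1 | x, z) * p(z), marginalizing over the site z."""
    return sum(p_z * cond_proba(x, z) for z, p_z in enumerate(site_prior))


# Hypothetical stand-in for a trained classifier: a logistic model with
# weights on two text features plus a per-site offset (the confounder).
TEXT_WEIGHTS = [1.5, -0.5]   # assumed values, for illustration only
SITE_OFFSETS = [-1.0, 1.0]   # assumed values, one offset per site


def toy_cond_proba(x: Sequence[float], z: int) -> float:
    """p(y=1 | x, z) under the toy logistic model."""
    logit = sum(w * v for w, v in zip(TEXT_WEIGHTS, x)) + SITE_OFFSETS[z]
    return 1.0 / (1.0 + math.exp(-logit))


# Assumed marginal distribution over sites, e.g. estimated from training data.
site_prior = [0.7, 0.3]
note_features = [1.0, 0.0]

adjusted = backdoor_adjusted_proba(toy_cond_proba, note_features, site_prior)
print(f"adjusted p(y=1 | x) = {adjusted:.3f}")
```

Because the site is summed out using a fixed prior, the prediction for a note no longer depends on which site it came from, which is the property that makes the adjusted model less sensitive to changes in the site-label relationship at deployment.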