Objective: The n2c2/UW SDOH Challenge explores the extraction of social determinants of health (SDOH) information from clinical notes. Its objectives include advancing natural language processing (NLP) information extraction techniques for SDOH and for clinical information more broadly. This paper presents the shared task, data, participating teams, performance results, and considerations for future work. Materials and Methods: The task used the Social History Annotated Corpus (SHAC), which consists of clinical text with detailed event-based annotations for SDOH events such as alcohol, drug, tobacco, employment, and living situation. Each SDOH event is characterized through attributes related to status, extent, and temporality. The task included three subtasks related to information extraction (Subtask A), generalizability (Subtask B), and learning transfer (Subtask C). In addressing this task, participants used a range of techniques, including rules, knowledge bases, n-grams, word embeddings, and pretrained language models (LMs). Results: A total of 15 teams participated, and the top teams used pretrained deep learning LMs. The top team across all subtasks used a sequence-to-sequence approach, achieving 0.901 F1 for Subtask A, 0.774 F1 for Subtask B, and 0.889 F1 for Subtask C. Conclusions: As in many NLP tasks and domains, pretrained LMs yielded the best performance, including for generalizability and learning transfer. An error analysis indicates that extraction performance varies by SDOH, with lower performance for conditions that increase health risks (risk factors), such as substance use and homelessness, and higher performance for conditions that reduce health risks (protective factors), such as substance abstinence and living with family.