This paper reports the performance evaluation of a system for adverse drug event normalization, developed by the Data Science for Digital Health (DS4DH) group for shared task 5 of the Social Media Mining for Health Applications (SMM4H) 2023 workshop. Shared task 5 targeted the normalization of adverse drug event mentions on Twitter to standard concepts of the Medical Dictionary for Regulatory Activities (MedDRA) terminology. Our system uses a two-stage approach: a fine-tuned BERT model for entity recognition, followed by zero-shot normalization using sentence transformers and reciprocal-rank fusion. The approach achieved a precision of 44.9%, a recall of 40.5%, and an F1-score of 42.6%. It exceeded the median performance in shared task 5 by 10% and was the highest-scoring system among all participants. These results support the effectiveness of our approach and its potential for adverse drug event normalization in social media text mining.
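The normalization stage relies on reciprocal-rank fusion to combine several candidate rankings into one. The following is a minimal sketch of that fusion step, not the system's actual implementation: it assumes each sentence-transformer model has already produced a ranked list of candidate MedDRA terms for a mention, and it uses the commonly cited smoothing constant k = 60. The function name, the example rankings, and the concept names are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of candidates into a single ranking.

    Each candidate receives 1 / (k + rank) from every list in which it
    appears (rank starts at 1); the contributions are summed.
    k = 60 is an assumed default, not a value taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, concept in enumerate(ranking, start=1):
            scores[concept] += 1.0 / (k + rank)
    # Sort by descending fused score; break ties alphabetically.
    return sorted(scores, key=lambda c: (-scores[c], c))


# Hypothetical rankings from three encoders for the mention "can't sleep".
ranking_a = ["Insomnia", "Somnolence", "Fatigue"]
ranking_b = ["Somnolence", "Insomnia", "Headache"]
ranking_c = ["Insomnia", "Fatigue", "Somnolence"]

fused = reciprocal_rank_fusion([ranking_a, ranking_b, ranking_c])
print(fused)
```

Because the fusion only uses rank positions, it needs no score calibration across models, which is one reason it suits a zero-shot setting where the individual similarity scores are not directly comparable.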