A growing number of people share their status and opinions on social media, which has become a valuable data resource for studying human health and an important focus of healthcare research. This paper outlines the methods we used in the #SMM4H 2023 Shared Tasks, including data preprocessing, continual pre-training, and fine-tuning optimization strategies. For the Named Entity Recognition (NER) task in particular, we use the W2NER model architecture, which effectively improves the model's generalization ability. Our method achieved first place in Task 3. This paper has been peer-reviewed and accepted for presentation at the #SMM4H 2023 Workshop.