Name tagging is a key component of Information Extraction (IE), particularly in scientific domains such as biomedicine and chemistry, where large language models (LLMs), e.g., ChatGPT, fall short. We investigate the applicability of transfer learning for adapting a name tagging model trained in the biomedical domain (the source domain) for use in the chemical domain (the target domain). A common practice for training such a model in a few-shot learning setting is to pretrain the model on the labeled source data and then fine-tune it on a handful of labeled target examples. In our experiments, we observed that such a model is prone to mislabeling source entities, which often appear in the target text, as target entities. To alleviate this problem, we propose a model that transfers knowledge from the source domain to the target domain while projecting the source entities and target entities into separate regions of the feature space. This diminishes the risk of mislabeling the source entities as the target entities. Our model consists of two stages: 1) entity grouping in the source domain, which incorporates knowledge from annotated events to establish relations between entities, and 2) entity discrimination in the target domain, which relies on pseudo labeling and contrastive learning to enhance discrimination between the entities in the two domains. We carry out extensive experiments across three source and three target datasets and demonstrate that our method outperforms the baselines, in some scenarios by 5\% absolute.
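The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of the general idea behind the entity discrimination stage, not the authors' formulation. It assumes a generic supervised contrastive (InfoNCE-style) loss in which entities from the same domain are positives and entities from the other domain are negatives. The function names, the temperature value, and the toy 2-D embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def domain_contrastive_loss(embeddings, domains, temperature=0.1):
    """Generic supervised contrastive loss over entity embeddings.

    Assumption: entities sharing a domain label are positives for each
    other, and entities from the other domain are negatives. This is an
    illustrative stand-in, not the loss used in the paper.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and domains[j] == domains[i]]
        if not positives:
            continue  # an anchor with no same-domain partner contributes nothing
        # Temperature-scaled similarities to every other entity.
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        log_denominator = math.log(sum(math.exp(v) for v in logits.values()))
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(logits[j] - log_denominator for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


domains = ["source", "source", "target", "target"]

# Toy case 1: each domain occupies its own region of the feature space.
separated = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Toy case 2: source and target entities are interleaved.
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]

loss_separated = domain_contrastive_loss(separated, domains)
loss_mixed = domain_contrastive_loss(mixed, domains)
print(loss_separated < loss_mixed)
```

In this toy setting the loss is lower when the two domains sit in separate regions than when they are interleaved, so minimizing it would push source and target entity representations apart, which is the behavior the abstract describes.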