Few-shot named entity recognition (NER) has made remarkable progress in identifying entities in low-resource domains. However, few-shot NER methods still struggle with out-of-domain (OOD) examples because they rely on manually labeled data for the target domain. To address this limitation, recent studies use data augmentation to generalize to an unseen target domain with only a few labeled examples. Two important challenges remain. First, augmentation is limited to the training data, so the generated data overlaps little with OOD examples. Second, knowledge transfer is implicit and insufficient, which severely hinders model generalizability and the integration of knowledge from the source domain. In this paper, we propose a framework, prompt learning with type-related features (PLTR), to address these challenges. To identify useful knowledge in the source domain and enhance knowledge transfer, PLTR automatically extracts entity type-related features (TRFs) based on mutual information criteria. To bridge the gap between training and OOD data, PLTR generates a unique prompt for each unseen example by selecting relevant TRFs. We show that PLTR achieves significant performance improvements on both in-domain and cross-domain datasets. PLTR also facilitates model adaptation and increases the representation similarity between the source and unseen domains.
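The two steps named above (extracting TRFs with a mutual information criterion, then building a per-example prompt from the relevant TRFs) can be illustrated with a minimal sketch. It is not the PLTR implementation: the abstract does not specify the exact scoring or selection rules, so the weighted pointwise mutual information score, the token-overlap selection rule, the `[SEP]` prompt layout, the function names `extract_trfs` and `build_prompt`, and the toy data `TOY_SOURCE` are all assumptions made here for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy source-domain data: (tokens, entity types present in the sentence).
TOY_SOURCE = [
    (["she", "works", "at", "google"], {"ORG"}),
    (["he", "works", "at", "ibm"], {"ORG"}),
    (["she", "lives", "in", "paris"], {"LOC"}),
    (["he", "lives", "in", "rome"], {"LOC"}),
]


def extract_trfs(sentences, top_k=2):
    """Pick top_k tokens per entity type by a PMI-style score.

    Assumed score: p(w, t) * log(p(w, t) / (p(w) * p(t))), estimated from
    sentence-level co-occurrence counts. This is one term of the mutual
    information sum, used here as a stand-in for the paper's criterion.
    """
    n = len(sentences)
    token_count, type_count, joint_count = Counter(), Counter(), Counter()
    for tokens, types in sentences:
        vocab = set(tokens)  # count each token once per sentence
        token_count.update(vocab)
        type_count.update(types)
        for t in types:
            for w in vocab:
                joint_count[(w, t)] += 1

    scored = defaultdict(list)
    for (w, t), c in joint_count.items():
        pmi = math.log((c * n) / (token_count[w] * type_count[t]))
        scored[t].append(((c / n) * pmi, w))

    # Highest score first; ties are broken alphabetically for determinism.
    return {
        t: [w for _, w in sorted(items, key=lambda x: (-x[0], x[1]))[:top_k]]
        for t, items in scored.items()
    }


def build_prompt(tokens, trfs):
    """Append the TRFs of every type that shares a token with the input.

    Assumed selection rule: token overlap. If no type overlaps, fall back
    to listing the TRFs of all types.
    """
    present = set(tokens)
    relevant = {t: fs for t, fs in trfs.items() if present & set(fs)}
    chosen = relevant or trfs
    hints = "; ".join(f"{t}: {', '.join(fs)}" for t, fs in sorted(chosen.items()))
    return f"{' '.join(tokens)} [SEP] {hints}"


if __name__ == "__main__":
    trfs = extract_trfs(TOY_SOURCE)
    print(trfs)
    print(build_prompt(["they", "work", "at", "acme"], trfs))
```

On the toy data, context words such as "at" and "works" score highest for `ORG` because they co-occur with that type in every `ORG` sentence, while one-off entity names score lower. A real system would likely select TRFs by representation similarity rather than exact token overlap; overlap is used here only to keep the sketch dependency-free.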