In Mention-agnostic Biomedical Concept Recognition (MA-BCR), human annotations are scarce, which makes generalization to unseen concepts a central challenge. This work makes two contributions to address this issue systematically. First, we propose an evaluation framework built on hierarchical concept indices and novel metrics for measuring generalization. Second, we explore LLM-based Auto-Labeled Data (ALD) as a scalable resource and build a task-specific pipeline to generate it. Our results show that, although LLM-generated ALD cannot fully substitute for manual annotations, it is a valuable resource for improving generalization: it gives models the broader coverage and structural knowledge they need to better recognize unseen concepts. Code and datasets are available at https://github.com/bio-ie-tool/hi-ald.