Unified sequence labeling, which casts different sequence labeling problems such as named entity recognition, relation extraction, and semantic role labeling in a generalized sequence-to-sequence format, opens up the opportunity to make maximal use of large language model knowledge for structured prediction. Unfortunately, this requires converting the tasks into a specialized augmented format unknown to the base pretrained language models (PLMs), which necessitates fine-tuning to the target format. This significantly limits its usefulness in data-limited settings, where fine-tuning large models cannot properly generalize to the target format. To address this challenge and leverage PLM knowledge effectively, we propose FISH-DIP, a sample-aware dynamic sparse fine-tuning strategy that selectively focuses on a fraction of parameters, informed by feedback from highly regressing examples, during the fine-tuning process. By leveraging the dynamism of sparsity, our approach mitigates the impact of well-learned samples and prioritizes underperforming instances to improve generalization. Across five sequence labeling tasks, we demonstrate that FISH-DIP can smoothly optimize the model in low-resource settings, offering up to 40% performance improvement over full fine-tuning, depending on the target evaluation setting. Moreover, compared to in-context learning and other parameter-efficient fine-tuning approaches, FISH-DIP performs comparably or better, notably in extreme low-resource settings.
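To make the idea of sample-aware dynamic sparse fine-tuning concrete, the following is a minimal sketch on a toy logistic-regression model rather than a PLM. It is not the authors' implementation. The abstract does not specify how parameters are scored, so the sketch assumes importance is the mean squared per-sample gradient (an empirical Fisher-style proxy) computed on the highest-loss examples. All function names and hyperparameters (`n_hard`, `k_params`, `refresh_every`, `lr`) are hypothetical.

```python
import numpy as np

def per_sample_loss_and_grad(w, X, y):
    """Per-sample logistic loss and gradient (toy stand-in for a PLM)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12
    loss = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = (p - y)[:, None] * X  # shape: (n_samples, n_params)
    return loss, grad

def sparse_mask_from_hard_samples(w, X, y, n_hard, k_params):
    """Pick k_params parameters using only the n_hard highest-loss samples."""
    loss, grad = per_sample_loss_and_grad(w, X, y)
    hard = np.argsort(loss)[-n_hard:]            # underperforming samples
    # ASSUMPTION: importance = mean squared gradient on the hard samples.
    importance = (grad[hard] ** 2).mean(axis=0)
    keep = np.argsort(importance)[-k_params:]    # most important parameters
    mask = np.zeros(w.shape, dtype=bool)
    mask[keep] = True
    return mask

def dynamic_sparse_finetune(X, y, steps=200, refresh_every=20,
                            n_hard=8, k_params=3, lr=0.5, seed=0):
    """Gradient descent that updates only masked parameters.

    The mask is recomputed every `refresh_every` steps, so the trainable
    subset follows whichever samples are currently hardest.
    """
    rng = np.random.default_rng(seed)
    w0 = rng.normal(scale=0.01, size=X.shape[1])  # hypothetical "pretrained" weights
    w = w0.copy()
    mask = None
    for step in range(steps):
        if step % refresh_every == 0:
            mask = sparse_mask_from_hard_samples(w, X, y, n_hard, k_params)
        _, grad = per_sample_loss_and_grad(w, X, y)
        w -= lr * mask * grad.mean(axis=0)        # unmasked parameters stay frozen
    return w0, w, mask

# Toy low-resource data set: 64 samples, 10 features.
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)

w0, w, mask = dynamic_sparse_finetune(X, y)
loss_before = per_sample_loss_and_grad(w0, X, y)[0].mean()
loss_after = per_sample_loss_and_grad(w, X, y)[0].mean()
```

Because the mask is refreshed as training proceeds, samples that are already well learned stop influencing which parameters are trainable. Under the assumptions above, this mirrors the prioritization of underperforming instances described in the abstract.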