This work presents MEDVOC, a dynamic vocabulary adaptation strategy for fine-tuning pre-trained language models (PLMs) such as BertSumAbs, BART, and PEGASUS for improved medical text summarization. In contrast to existing domain adaptation approaches in summarization, MEDVOC treats the vocabulary as an optimizable parameter and optimizes the PLM vocabulary based on a fragment score conditioned only on the downstream task's reference summaries. Unlike previous work on vocabulary adaptation, which is limited to classification tasks, optimizing the vocabulary for summarization requires an extremely costly intermediate fine-tuning step on large summarization datasets. To that end, our novel fragment score-based hyperparameter search reduces this fine-tuning time from 450 days to less than 2 days on average. Furthermore, while previous work on vocabulary adaptation is often tied to a single PLM, MEDVOC is designed to be deployable across multiple PLMs (with varying vocabulary sizes, pre-training objectives, and model sizes), bridging the limited vocabulary overlap between the biomedical literature domain and PLMs. MEDVOC outperforms baselines by 15.74% in terms of Rouge-L in the zero-shot setting and shows gains of 17.29% in settings with high out-of-vocabulary (OOV) concentrations. Our human evaluation shows that MEDVOC generates more faithful medical summaries (88% compared to 59% for baselines). We make the codebase publicly available at https://github.com/gb-kgp/MEDVOC.
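The abstract does not define the fragment score, so the following is only an illustrative sketch under a stated assumption: that the score is the average number of subword pieces a tokenizer produces per whitespace-separated word in the reference summaries. The greedy longest-match segmenter, the toy vocabularies, and the names `segment` and `fragment_score` are hypothetical and are not taken from the MEDVOC codebase or from the tokenizers of BertSumAbs, BART, or PEGASUS.

```python
def segment(word, vocab):
    """Greedy longest-prefix-match segmentation of one word into subwords.

    Toy stand-in for a real subword tokenizer (assumption). Falls back to
    single characters when no vocabulary entry matches a prefix.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate until it is in the vocabulary or is one character.
        while end > start + 1 and word[start:end] not in vocab:
            end -= 1
        pieces.append(word[start:end])
        start = end
    return pieces


def fragment_score(texts, vocab):
    """Average number of subword pieces per word (assumed definition)."""
    words = [w for text in texts for w in text.lower().split()]
    if not words:
        return 0.0
    return sum(len(segment(w, vocab)) for w in words) / len(words)


# Hypothetical reference summary and vocabularies, for illustration only.
reference_summaries = ["the patient has hypertension"]
general_vocab = {"the", "patient", "has", "hyper", "tension"}
adapted_vocab = general_vocab | {"hypertension"}

score_before = fragment_score(reference_summaries, general_vocab)  # 1.25
score_after = fragment_score(reference_summaries, adapted_vocab)   # 1.0
```

Under this assumed definition, adding the medical term to the vocabulary lowers the score from 1.25 to 1.0, because "hypertension" is no longer split into two pieces. A lower score means the reference summaries are fragmented less, which is the kind of signal a vocabulary search could use to rank candidate vocabularies without running the full intermediate fine-tuning step for each one.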