Biomedical summarization requires large datasets to train text generation models. We show that while transfer learning offers a viable way to address this challenge, in-domain pre-training does not always offer an advantage in a BioASQ summarization task. We identify a suitable model architecture and use it to show the benefit of general-domain pre-training followed by task-specific fine-tuning in the context of a BioASQ summarization task, leading to a novel three-step fine-tuning approach that works with only a thousand in-domain examples. Our results indicate that a large language model without domain-specific pre-training can have a significant edge in some domain-specific biomedical text generation tasks.