In this system report, we describe the models and methods we used for our participation in the PLABA2023 task on biomedical abstract simplification, part of the TAC 2023 tracks. The system outputs we submitted fall into three categories: 1) domain fine-tuned T5-like models, including Biomedical-T5 and Lay-SciFive; 2) a fine-tuned BART-Large model with controllable attributes (via control tokens), BART-w-CTs; and 3) ChatGPT prompting. We also present the work we carried out for this task on fine-tuning BioGPT. In the official automatic evaluation using SARI scores, BeeManc ranked 2nd among all teams, and our model Lay-SciFive ranked 3rd among all 13 evaluated systems. In the official human evaluation, our model BART-w-CTs ranked 2nd on sentence simplicity (score 92.84) and 3rd on term simplicity (score 82.33) among all 7 evaluated systems; it also achieved a high fluency score of 91.57, compared with the highest score of 93.53. In the second round of submissions, our ChatGPT-prompting system ranked 2nd in several categories, including simplified-term accuracy (score 92.26) and completeness (score 96.58), and achieved a faithfulness score (95.3) very close to that of the re-evaluated PLABA-base-1 (95.73) in the human evaluation. Our code, fine-tuned models, prompts, and data splits from the system development stage will be available at https://github.com/HECTA-UoM/PLABA-MU