Long-text summarization, increasingly essential for efficiently processing large volumes of information, remains challenging for Large Language Models (LLMs) such as the GPT and LLaMA families, owing to the scarcity of open-source training datasets and the demand for fine-grained handling of contextual details. To address this issue, we design a novel zero-shot transfer learning framework, abbreviated as T3, which iteratively trains a baseline LLM on an assistant task for a target task, where the assistant task should possess richer data resources and share structural or semantic similarity with the target task. In practice, we apply T3 to long-text summarization using question answering as the assistant task, and validate its effectiveness on the BBC summary, NarraSum, FairytaleQA, and NLQuAD datasets, achieving improvements of up to nearly 14% in ROUGE, 35% in BLEU, and 16% in Factscore over three baseline LLMs, demonstrating its potential for further assistant-target task combinations.