Fine-tuning Large Language Models (LLMs) incurs considerable training costs, driving the need for data-efficient training with optimised data ordering. Human-inspired strategies offer a solution by organising data based on human learning practices. This study evaluates the fine-tuning efficiency of five human-inspired strategies across four language models, three datasets, and both human- and LLM-labelled data in the context of medical question answering. These strategies achieve a maximum accuracy gain of 1.81% and an average gain of 1.02% across datasets, with interleaved strategies delivering the best average results. However, the best strategy varies across model-dataset combinations, limiting the generalisability of any single strategy's effects. Additionally, LLM-defined question difficulty outperforms human-defined labels in curriculum-based learning, showing the potential of model-generated data as a cost-effective alternative for optimising fine-tuning.