This paper addresses the challenge of efficiently fine-tuning large language models (LLMs) by exploring data efficiency and hyperparameter optimization. We investigate the minimum amount of data required for effective fine-tuning and propose a novel hyperparameter optimization method that leverages early-stage model performance. Our experiments demonstrate that fine-tuning with as few as 200 samples can improve model accuracy from 70\% to 88\% on a product attribute extraction task. We identify a saturation point of approximately 6,500 samples, beyond which additional data yields diminishing returns. Our proposed Bayesian hyperparameter optimization method, which evaluates models at 20\% of the total training time, correlates strongly with final model performance: 4 of the 5 best early-stage models remain in the top 5 at completion. This approach yielded a 2\% accuracy improvement over baseline models on an independent test set. These findings offer actionable insights for practitioners, potentially reducing computational cost and reliance on large datasets while improving the overall performance of fine-tuned LLMs.
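To make the early-evaluation idea concrete, the following is a minimal sketch, not the paper's implementation: Optuna's default TPE sampler stands in for the paper's Bayesian optimizer, each candidate is scored after only 20\% of the training budget, and the best configuration is then retrained in full. \texttt{TOTAL\_STEPS}, the search space, and the synthetic \texttt{train\_and\_evaluate} surrogate are illustrative assumptions, not values from the paper.

\begin{verbatim}
import math
import optuna

TOTAL_STEPS = 5000      # assumed full fine-tuning budget (illustrative)
EARLY_FRACTION = 0.2    # score candidates at 20% of total training time

def train_and_evaluate(lr: float, batch_size: int, steps: int) -> float:
    """Hypothetical stand-in for a fine-tuning run: train for `steps`
    steps and return validation accuracy. A smooth synthetic surrogate
    is used here so the sketch runs end to end (batch_size is ignored
    by this toy surrogate)."""
    peak = 0.88 - 0.5 * abs(math.log10(lr) + 4.5)  # best near lr = 3e-5
    progress = steps / TOTAL_STEPS
    return max(0.0, peak * (0.7 + 0.3 * progress))

def objective(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-6, 1e-3, log=True)
    batch_size = trial.suggest_categorical("batch_size", [8, 16, 32])
    # Score each candidate at the 20% checkpoint only; the paper reports
    # this early score tracks final performance closely.
    return train_and_evaluate(lr, batch_size, int(TOTAL_STEPS * EARLY_FRACTION))

# Optuna's default TPE sampler performs Bayesian optimization.
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=25)

# Retrain the most promising configuration on the full budget.
final_acc = train_and_evaluate(**study.best_params, steps=TOTAL_STEPS)
print(study.best_params, round(final_acc, 3))
\end{verbatim}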