Effective preference tuning is pivotal in aligning chatbot responses with human expectations, enhancing user satisfaction and engagement. Traditional approaches, notably Reinforcement Learning from Human Feedback (RLHF) as employed in advanced models such as GPT-4, have demonstrated considerable success in this domain. However, RLHF methods are often computationally intensive and resource-demanding, limiting their scalability and accessibility for broader applications. To address these challenges, this study introduces LoRA-Lite Ensemble (LoRA-LiteE), a framework that combines Supervised Fine-tuning (SFT) with Low-Rank Adaptation (LoRA) and ensemble learning to aggregate the predictions of lightweight fine-tuned models, striking a balance between performance and computational cost. Using the Chatbot Arena benchmark dataset, we conduct a comprehensive comparative analysis among our LoRA-LiteE model, the corresponding base models at different scales, and GPT-4 trained with RLHF. Our empirical results show that LoRA-LiteE achieves performance comparable to un-finetuned GPT-4 and outperforms single larger-scale models under limited resource constraints. These findings highlight that LoRA-LiteE provides a feasible and efficient methodology for human preference prediction in chatbot systems, enhancing scalability and accessibility, and thereby broadening the applicability of preference-tuned chatbots in resource-constrained environments.
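A minimal sketch of the two ingredients named above (LoRA fine-tuning of lightweight classifiers and aggregation of their predictions), assuming the Hugging Face transformers and peft libraries; the base model names, LoRA hyperparameters, shared tokenizer, and probability-averaging scheme are illustrative assumptions, not the paper's exact configuration.

```python
# Illustrative sketch only: LoRA-adapted lightweight preference classifiers
# whose softmax probabilities are averaged at inference time.
import torch
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, get_peft_model


def build_lora_classifier(base_model_name: str, num_labels: int = 3):
    """Wrap a small base model with LoRA adapters for preference
    classification (e.g. win-A / win-B / tie). Hyperparameters are assumed."""
    model = AutoModelForSequenceClassification.from_pretrained(
        base_model_name, num_labels=num_labels
    )
    lora_cfg = LoraConfig(
        r=16,                                  # low-rank dimension (assumed)
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],   # attention projections; model-dependent
        task_type="SEQ_CLS",
    )
    return get_peft_model(model, lora_cfg)     # only adapter weights are trainable


@torch.no_grad()
def ensemble_predict(models, tokenizer, prompt: str, resp_a: str, resp_b: str):
    """Aggregate member predictions by averaging class probabilities.
    Assumes, for brevity, that all ensemble members share one tokenizer."""
    text = f"Prompt: {prompt}\nResponse A: {resp_a}\nResponse B: {resp_b}"
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    probs = [torch.softmax(m(**inputs).logits, dim=-1) for m in models]
    return torch.stack(probs).mean(dim=0)      # averaged preference probabilities
```

Simple probability averaging is one common way to combine ensemble members; weighted averaging or stacking over the members' logits would slot into the same interface.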