DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales

Zhewei Yao, Reza Yazdani Aminabadi, Olatunji Ruwase, Samyam Rajbhandari, Xiaoxia Wu, Ammar Ahmad Awan, Jeff Rasley, Minjia Zhang, Conglong Li, Connor Holmes, Zhongzhu Zhou, Michael Wyatt, Molly Smith, Lev Kurilenko, Heyang Qin, Masahiro Tanaka, Shuai Che, Shuaiwen Leon Song, Yuxiong He

From arXiv, 14 pages, 7 figures

ChatGPT-like models have revolutionized various applications in artificial intelligence, from summarization and coding to translation, matching or even surpassing human performance. However, the current landscape lacks an accessible, efficient, and cost-effective end-to-end RLHF (Reinforcement Learning with Human Feedback) training pipeline for these powerful models, particularly when training at the scale of billions of parameters. This paper introduces DeepSpeed-Chat, a novel system that democratizes RLHF training, making it accessible to the AI community. DeepSpeed-Chat offers three key capabilities: an easy-to-use training and inference experience for ChatGPT-like models, a DeepSpeed-RLHF pipeline that replicates the training pipeline from InstructGPT, and a robust DeepSpeed-RLHF system that combines various optimizations for training and inference in a unified way. The system delivers unparalleled efficiency and scalability, enabling training of models with hundreds of billions of parameters in record time and at a fraction of the cost. With this development, DeepSpeed-Chat paves the way for broader access to advanced RLHF training, even for data scientists with limited resources, thereby fostering innovation and further development in the field of AI.

