The introduction of ChatGPT has led to a significant increase in the use of Large Language Models (LLMs) for downstream tasks, and with it a growing focus on cost-efficient training and deployment. Low-cost training and deployment of LLMs represent the future development trend. This paper reviews the evolution of LLM training techniques and inference deployment technologies in line with this trend. The discussion of training covers data preprocessing, training architecture, pre-training tasks, parallel training, and model fine-tuning. On the inference side, the paper covers model compression, parallel computation, memory scheduling, and structural optimization. It also explores how LLMs are used and offers insights into their future development.