Large Language Models (LLMs) trained on extensive textual corpora have emerged as leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite their notable performance, these models are prone to limitations such as misunderstanding human instructions, generating potentially biased content, and producing factually incorrect (hallucinated) information. Hence, aligning LLMs with human expectations has become an active area of research. This survey presents a comprehensive overview of these alignment technologies, covering the following aspects. (1) Data collection: methods for effectively collecting high-quality instructions for LLM alignment, including the use of NLP benchmarks, human annotations, and strong LLMs. (2) Training methodologies: a detailed review of the prevailing training methods for LLM alignment, encompassing supervised fine-tuning, both online and offline human preference training, and parameter-efficient training mechanisms. (3) Model evaluation: methods for evaluating the effectiveness of these human-aligned LLMs, presenting a multifaceted approach to their assessment. In conclusion, we collate and distill our findings, shedding light on several promising directions for future research. This survey therefore serves as a resource for anyone invested in understanding and advancing the alignment of LLMs to better suit human-oriented tasks and expectations. A GitHub repository collecting the latest papers is available at https://github.com/GaryYufei/AlignLLMHumanSurvey.