A Comprehensive Overview of Large Language Models

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success has led to a large influx of research contributions, spanning diverse topics such as architectural innovations of the underlying neural networks, context length improvements, model alignment, training datasets, benchmarking, efficiency, and more. With the rapid development of techniques and regular breakthroughs in LLM research, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Given the rapidly growing literature on LLMs, the research community needs a concise yet comprehensive overview of recent developments in the field. This article provides that overview. It not only offers a systematic treatment of the existing literature on a broad range of LLM-related concepts, but also provides comprehensive and detailed summaries of individual existing models, datasets, and major insights. We also align our overview with the emerging outlook of this research direction by accounting for other recent reviews of the broader LLM field. Our self-contained overview discusses relevant background concepts as well as advanced topics at the frontier of this research direction. This review article is intended to serve not only as a systematic survey, but also as a quick, comprehensive reference from which researchers and practitioners can draw insights, through extensive informative summaries of existing works, to advance LLM research.
