Large Language Models (LLMs) have recently demonstrated remarkable capabilities in natural language processing tasks and beyond. This success has led to a large influx of research contributions, spanning diverse topics such as architectural innovations in the underlying neural networks, context length improvements, model alignment, training datasets, benchmarking, efficiency, and more. With techniques developing rapidly and breakthroughs arriving regularly, it has become considerably challenging to perceive the bigger picture of the advances in this direction. Given the rapidly growing body of literature on LLMs, the research community needs a concise yet comprehensive overview of recent developments in the field. This article provides that overview. It not only offers a systematic treatment of the existing literature on a broad range of LLM-related concepts, but also pays special attention to providing comprehensive summaries with extensive details about individual existing models, datasets, and major insights. We also align our overview with the emerging outlook of this research direction by accounting for other recently published reviews of the broader LLM field. Our self-contained overview discusses relevant background concepts and covers advanced topics at the frontier of this research direction. This review article is intended to serve not only as a systematic survey, but also as a quick, comprehensive reference from which researchers and practitioners can draw insights, through extensive informative summaries of existing works, to advance LLM research.