Beyond One-Model-Fits-All: A Survey of Domain Specialization for Large Language Models

Chen Ling, Xujiang Zhao, Jiaying Lu, Chengyuan Deng, Can Zheng, Junxiang Wang, Tanmoy Chowdhury, Yun Li, Hejie Cui, Xuchao Zhang, Tianjiao Zhao, Amit Panalkar, Wei Cheng, Haoyu Wang, Yanchi Liu, Zhengzhang Chen, Haifeng Chen, Chris White, Quanquan Gu, Carl Yang, Liang Zhao

Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. The promise of LLMs as general task solvers has motivated people to extend their use well beyond that of a "chatbot", employing them as assistants to, or even replacements for, domain experts and tools in specific domains such as healthcare, finance, and education. However, directly applying LLMs to sophisticated problems in specific domains meets many hurdles, caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in domain applications). To fill this gap, research and practice on the domain specialization of LLMs have grown explosively in recent years, which in turn calls for a comprehensive and systematic review to better summarize and guide this promising area. In this survey, we first propose a systematic taxonomy that categorizes LLM domain-specialization techniques based on the accessibility of the LLM and summarizes the framework for each subcategory as well as their relations to and differences from each other. We also present a comprehensive taxonomy of critical application domains that can benefit from specialized LLMs, discussing their practical significance and open challenges. Finally, we offer insights into the current research status and future trends in this area.
