This paper presents a systematic review of the infrastructure requirements for deploying Large Language Models (LLMs) on-device within small and medium-sized enterprises (SMEs), examining both hardware and software perspectives. From the hardware viewpoint, we discuss the use of processing units such as GPUs and TPUs, efficient memory and storage solutions, and strategies for effective deployment, addressing the limited computational resources typical of SME settings. From the software perspective, we explore framework compatibility, operating system optimization, and the use of specialized libraries tailored to resource-constrained environments. The review first identifies the challenges unique to SMEs deploying LLMs on-device, then explores the opportunities that hardware innovations and software adaptations offer to overcome these obstacles. This structured review provides practical insights that strengthen the technological resilience of SMEs seeking to integrate LLMs.