Large language models (LLMs) are advanced AI systems trained on extensive textual data, leveraging deep learning techniques to understand and generate human-like language. Today's LLMs, with billions of parameters, are so large that hardly any single computing node can train, fine-tune, or run inference with them. Consequently, several distributed computing techniques have been introduced in the literature to use LLMs effectively. We explore the application of distributed computing techniques to LLMs from two angles.
\begin{itemize}
  \item We study techniques that democratize LLMs, that is, that allow large models to run on consumer-grade computers (a minimal sketch follows this list). Here, we also implement a novel metaheuristics-based modification to an existing system.
  \item We perform a comparative study of three state-of-the-art LLM serving techniques.
\end{itemize}
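To make the first angle concrete, the following is a minimal sketch of one common democratization technique: offloading model weights across GPU memory, CPU RAM, and disk on a single consumer machine. It assumes the Hugging Face \texttt{transformers} and \texttt{accelerate} libraries; the model name is a placeholder, and this illustrates the general idea rather than the specific system modified in this work.
\begin{verbatim}
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; any large causal LM on the Hugging Face Hub works.
model_name = "bigscience/bloom-7b1"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# device_map="auto" lets accelerate split the layers across the available
# GPU, CPU RAM, and (via offload_folder) disk, so a model larger than the
# GPU's memory can still run on a consumer-grade machine.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # half precision halves the memory footprint
    device_map="auto",
    offload_folder="offload",    # layers that fit nowhere else spill to disk
)

inputs = tokenizer("Distributed inference makes it possible to",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
\end{verbatim}
The key design point is that the offloading schedule is decided once at load time from the machine's memory budget; more sophisticated systems (and the metaheuristics-based modification studied here) instead optimize how layers and activations are placed and moved during execution.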