Optimizing Large Language Models (LLMs) remains a significant challenge in AI research, one that is crucial to the field's practical applications and sustainability. Building upon the foundational work of Professor Song Han's lab at MIT, this paper introduces a novel approach to developing Mini-GPTs via contextual pruning. Our methodology prunes the computational architecture of existing LLMs, such as Phi-1.5, retaining core functionality while drastically reducing model size. We apply the technique across diverse and complex datasets, including US law, Medical Q&A, Skyrim dialogue, English-Taiwanese translation, and Economics articles. The results show contextual pruning to be efficient and effective, not merely as a theoretical concept but as a practical tool for developing domain-specific, resource-efficient LLMs. Contextual pruning is therefore a promising method for building such models, and this research is a building block towards future development with more hardware compute, refined fine-tuning, and quantization.