The integration of Large Language Models (LLMs) into autonomous driving systems offers promising enhancements in environmental understanding and decision-making. However, the substantial computational demands of deploying LLMs locally on vehicles render this approach infeasible for real-world automotive applications. To address this challenge, we introduce OWLed, an Outlier-Weighed Layerwise pruning framework for efficient autonomous driving that leverages outlier-weighted layerwise sparsity for model compression. Our method assigns non-uniform sparsity ratios to different layers based on the distribution of outlier features, significantly reducing the model size without the need for fine-tuning. To ensure the compressed model adapts well to autonomous driving tasks, we incorporate driving environment data into both the calibration and pruning processes. Our empirical studies reveal that the encoder component is more sensitive to pruning than the LLM, highlighting its critical role in the system. Experimental results demonstrate that OWLed outperforms existing methods in perception, action prediction, and language understanding while substantially lowering computational requirements. These findings underscore the potential of combining advanced pruning techniques with LLMs to develop efficient and robust autonomous driving systems capable of handling complex scenarios. Code will be made publicly available.
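The core idea of outlier-weighted layerwise sparsity can be illustrated with a minimal sketch: score each layer by the fraction of its weights amplified by outlier activations, then prune layers with more outliers less aggressively while keeping the mean sparsity at a global target. The function names, the outlier threshold `m`, and the deviation bound `lam` below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def outlier_ratio(weight, act_norm, m=5.0):
    """Fraction of weights whose |w| * input-activation norm exceeds
    m times the layer-wide mean of that metric (an outlier score).
    `m` is an assumed hyperparameter for illustration."""
    metric = np.abs(weight) * act_norm  # broadcast per input feature
    return float((metric > m * metric.mean()).mean())

def layerwise_sparsity(outlier_scores, target_sparsity=0.7, lam=0.08):
    """Map per-layer outlier ratios to non-uniform sparsity ratios.

    Layers with more outlier features receive *lower* sparsity (pruned
    less). `lam` bounds each layer's deviation from the global target;
    the final ratios are re-centered so their mean equals the target.
    """
    scores = np.asarray(outlier_scores, dtype=float)
    # Normalize outlier ratios to [0, 1] across layers.
    norm = (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)
    # Deviation in [-lam, +lam]: high outlier ratio -> negative deviation.
    dev = lam * (1.0 - 2.0 * norm)
    sparsity = target_sparsity + dev
    # Re-center so the mean sparsity still matches the global target.
    sparsity += target_sparsity - sparsity.mean()
    return np.clip(sparsity, 0.0, 1.0)
```

Under this sketch, the outlier scores would be computed from calibration data (here, driving environment samples rather than generic text), and each layer is then pruned by magnitude-times-activation metric at its assigned ratio.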