Deep learning-based models lead most driver observation benchmarks thanks to their high accuracy, but they also incur high computational costs. This is a problem because computational resources are often limited in real-world driving scenarios. This paper introduces a lightweight framework for resource-efficient driver activity recognition. The framework enhances 3D MobileNet, a neural architecture optimized for speed in video classification, by incorporating knowledge distillation and model quantization to balance model accuracy and computational efficiency. Knowledge distillation helps maintain accuracy while reducing the model size by leveraging soft labels from a larger teacher model (I3D) rather than relying solely on the ground-truth labels. Model quantization significantly lowers memory and computation demands by representing model weights and activations as lower-precision integers. Extensive experiments on a public dataset for in-vehicle monitoring during autonomous driving demonstrate that the framework achieves a threefold reduction in model size and a 1.4-fold improvement in inference time compared to an already optimized architecture. The code for this study is available at https://github.com/calvintanama/qd-driver-activity-reco.
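The two techniques named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the temperature, the weighting factor `alpha`, the temperature-squared scaling of the soft term (a common convention in the distillation literature), and the per-tensor affine int8 scheme are all assumptions made here for illustration, and every function name is hypothetical. The first part combines a hard-label cross-entropy with a soft-label term computed from teacher logits; the second part maps floating-point values to 8-bit integers and back.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer labels."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, true_class,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-label KL divergence.

    temperature and alpha are illustrative defaults, not values from the paper.
    """
    # Hard-label term: cross-entropy of the student against the ground truth.
    hard = -math.log(softmax(student_logits)[true_class])
    # Soft-label term: KL(teacher || student), both softened by the temperature.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)
    # Assumed convention: scale the soft term by temperature squared.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft


def quantize_int8(values):
    """Affine per-tensor quantization of floats to the int8 range [-128, 127]."""
    lo = min(min(values), 0.0)  # the range must contain 0 so that
    hi = max(max(values), 0.0)  # zero is exactly representable
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all values are zero; any positive scale works
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(v / scale) + zero_point))
                 for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    student = [2.0, 1.0, 0.1]
    teacher = [3.0, 0.5, 0.2]
    print("distillation loss:", distillation_loss(student, teacher, true_class=0))

    weights = [-1.0, 0.0, 0.5, 2.0]
    q, scale, zp = quantize_int8(weights)
    print("int8 values:", q)
    print("reconstructed:", dequantize(q, scale, zp))
```

When the student and teacher logits coincide, the soft term vanishes and only the weighted cross-entropy remains. For quantization, each reconstructed value differs from the original by roughly one quantization step at most, which is the precision given up in exchange for storing one byte per value instead of four.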