Driver Action Recognition (DAR) is crucial in vehicle cabin monitoring systems. In real-world applications, vehicle cabins are commonly equipped with cameras of different modalities. However, multi-modality fusion strategies for the DAR task within car cabins have rarely been studied. In this paper, we propose a novel and efficient multi-modality driver action recognition method based on dual feature shift, named DFS. DFS first integrates complementary features across modalities through modality feature interaction. Meanwhile, DFS propagates neighbouring features within each single modality by shifting features among temporal frames. To learn common patterns and improve model efficiency, DFS shares feature extraction stages among the modalities. Extensive experiments on the Drive&Act dataset verify the effectiveness of the proposed DFS model. The results demonstrate that DFS achieves good performance and improves the efficiency of multi-modality driver action recognition.
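To make the two shift operations concrete, the following is a minimal sketch rather than the DFS implementation. The abstract does not specify which channels are shifted, by how much, or how boundaries are handled, so the sketch assumes a simple scheme: a small group of channels is moved one frame forward or backward in time (zero-filled at the ends), and another group is swapped between two modalities. The function names, the `fold` parameter, and the `rgb`/`ir` feature names are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# features[t][c]: a clip of T frames, each with C feature channels.
Frames = List[List[float]]


def temporal_shift(x: Frames, fold: int) -> Frames:
    """Assumed temporal shift within one modality.

    Channels [0, fold) take their value from the previous frame,
    channels [fold, 2*fold) take their value from the next frame,
    and all remaining channels are left unchanged. Positions that
    have no neighbouring frame are filled with zeros.
    """
    t_len = len(x)
    out = [row[:] for row in x]  # copy so the input is not modified
    for t in range(t_len):
        for c in range(fold):
            out[t][c] = x[t - 1][c] if t > 0 else 0.0
        for c in range(fold, 2 * fold):
            out[t][c] = x[t + 1][c] if t < t_len - 1 else 0.0
    return out


def modality_shift(a: Frames, b: Frames, fold: int) -> Tuple[Frames, Frames]:
    """Assumed modality shift: swap the last `fold` channels of two
    modalities frame by frame, so each stream sees part of the other."""
    out_a = [row[:] for row in a]
    out_b = [row[:] for row in b]
    for t in range(len(a)):
        start = len(a[t]) - fold  # first channel index to exchange
        out_a[t][start:] = b[t][start:]
        out_b[t][start:] = a[t][start:]
    return out_a, out_b


# Hypothetical features: 3 frames, 4 channels, two modalities.
rgb = [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]]
ir = [[-v for v in row] for row in rgb]

rgb_mixed, ir_mixed = modality_shift(rgb, ir, fold=1)
rgb_shifted = temporal_shift(rgb_mixed, fold=1)
```

Both operations only move existing values and add no learnable parameters or multiplications, which is one plausible reading of why a shift-based design can be efficient. In this sketch the exchanged channel (the last one) is disjoint from the temporally shifted channels (the first two), so the two shifts do not overwrite each other.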