In-cabin driver monitoring systems (DMS) must recognize distraction- and drowsiness-related behaviors with low latency under strict constraints on compute, power, and cost. We present a single-camera in-cabin driver behavior recognition system designed for deployment on two low-cost edge platforms: Raspberry Pi 5 (CPU-only) and the Google Coral development board with an Edge Tensor Processing Unit (Edge TPU) accelerator. The proposed pipeline combines (i) a compact per-frame vision model, (ii) a confounder-aware label taxonomy to reduce confusions among visually similar behaviors, and (iii) a temporal decision head that triggers alerts only when predictions are both confident and sustained. The system supports 17 behavior classes. Training and evaluation use licensed datasets plus in-house collection (over 800,000 labeled frames) with driver-disjoint splits, and we further validate the deployed system in live in-vehicle tests. End-to-end performance reaches approximately 16 FPS on Raspberry Pi 5 using 8-bit integer (INT8) inference (per-frame latency <60 ms) and approximately 25 FPS on Coral Edge TPU (end-to-end latency ~40 ms), enabling real-time monitoring and stable alert generation on embedded hardware. Finally, we discuss how reliable in-cabin perception can serve as an upstream signal for human-centered vehicle intelligence, including emerging agentic vehicle concepts.
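The temporal decision head in (iii) can be illustrated with a minimal sketch. The abstract does not give the actual rule, so the code below assumes a simple one: an alert fires once when the same class has been predicted above a confidence threshold for a fixed number of consecutive frames. The class name `TemporalDecisionHead`, the threshold of 0.8, the 12-frame window, and the label `"phone_use"` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TemporalDecisionHead:
    """Fire an alert only when a prediction is both confident and sustained.

    All default values are illustrative assumptions, not figures from the paper.
    """
    conf_threshold: float = 0.8   # assumed minimum per-frame confidence
    min_frames: int = 12          # assumed streak length (about 0.75 s at 16 FPS)
    _label: Optional[str] = field(default=None, init=False)
    _streak: int = field(default=0, init=False)

    def update(self, label: str, confidence: float) -> Optional[str]:
        """Consume one per-frame prediction; return the label if an alert fires."""
        if confidence < self.conf_threshold:
            # A low-confidence frame breaks the streak.
            self._label, self._streak = None, 0
            return None
        if label == self._label:
            self._streak += 1
        else:
            # A different confident class starts a new streak.
            self._label, self._streak = label, 1
        # Fire exactly once, on the frame where the streak reaches min_frames.
        return label if self._streak == self.min_frames else None


if __name__ == "__main__":
    head = TemporalDecisionHead(conf_threshold=0.8, min_frames=3)
    for conf in (0.9, 0.95, 0.5, 0.9, 0.9, 0.9):
        print(head.update("phone_use", conf))
```

In this sketch a single low-confidence frame resets the streak, which favors stable alerts over fast ones. A deployed system might instead use a sliding-window vote or exponential smoothing to tolerate brief dropouts.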