This paper presents a novel approach to hazard analysis in dashcam footage, addressing the detection of driver reactions to hazards, the identification of hazardous objects, and the generation of descriptive captions. We first introduce a method for detecting driver reactions through speed and sound anomaly detection, leveraging unsupervised learning techniques. For hazard detection, we employ a set of heuristic rules as weak classifiers, which are combined using an ensemble method. This ensemble is further refined with differential privacy to mitigate overconfidence, ensuring robustness despite the lack of labeled data. Lastly, we use state-of-the-art vision-language models for hazard captioning, generating descriptive labels for the detected hazards. Our method achieved the highest scores in the Challenge on Out-of-Label in Autonomous Driving, demonstrating its effectiveness across all three tasks. The source code is publicly available at https://github.com/ffyyytt/COOOL_2025.
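The abstract does not detail how differential privacy tempers the ensemble of heuristic weak classifiers. One way to sketch the idea is to average the weak classifiers' scores and perturb the result with Laplace noise (the standard differential-privacy mechanism), so no single heuristic rule can push the combined score to an overconfident extreme. The function names, the choice of the Laplace mechanism, and all parameter values below are illustrative assumptions, not the paper's actual implementation:

```python
import math
import random

def laplace_noise(scale, rng):
    # Sample from Laplace(0, scale) via the inverse CDF of a uniform draw.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def ensemble_with_dp(weak_scores, epsilon=1.0, sensitivity=1.0, seed=0):
    """Average heuristic weak-classifier scores, then add Laplace noise
    so the combined hazard score cannot be driven to an overconfident
    extreme by any single rule. Illustrative sketch only: epsilon and
    sensitivity are hypothetical parameters, not values from the paper."""
    rng = random.Random(seed)
    avg = sum(weak_scores) / len(weak_scores)
    noisy = avg + laplace_noise(sensitivity / epsilon, rng)
    return min(1.0, max(0.0, noisy))  # clip back to a valid score range
```

A larger `epsilon` means less noise (scores stay close to the raw ensemble average), while a smaller `epsilon` injects more noise and thus more strongly dampens overconfident predictions.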