AutoTrust: Benchmarking Trustworthiness in Large Vision Language Models for Autonomous Driving

Shuo Xing, Hongyuan Hua, Xiangbo Gao, Shenzhe Zhu, Renjie Li, Kexin Tian, Xiaopeng Li, Heng Huang, Tianbao Yang, Zhangyang Wang, Yang Zhou, Huaxiu Yao, Zhengzhong Tu

From arXiv; published in TMLR 2025

Recent advances in large vision language models (VLMs) tailored for autonomous driving (AD) have shown strong scene understanding and reasoning capabilities, making them strong candidates for end-to-end driving systems. However, little work has studied the trustworthiness of these driving VLMs (DriveVLMs), a critical factor that directly affects public transportation safety. In this paper, we introduce AutoTrust, a comprehensive trustworthiness benchmark for DriveVLMs that covers five perspectives: trustfulness, safety, robustness, privacy, and fairness. We constructed the largest visual question-answering dataset for investigating trustworthiness issues in driving scenarios, comprising over 10k unique scenes and 18k queries. We evaluated six publicly available VLMs, spanning generalist and specialist models as well as open-source and commercial ones. Our evaluations reveal previously undiscovered vulnerabilities of DriveVLMs to trustworthiness threats. Specifically, we found that general VLMs such as LLaVA-v1.6 and GPT-4o-mini surprisingly outperform specialized models fine-tuned for driving in overall trustworthiness. DriveVLMs such as DriveLM-Agent are particularly prone to disclosing sensitive information. Additionally, both generalist and specialist VLMs remain susceptible to adversarial attacks and struggle to ensure unbiased decision-making across diverse environments and populations. Our findings call for immediate and decisive action to address the trustworthiness of DriveVLMs, an issue of critical importance to public safety and to the welfare of everyone who relies on autonomous transportation systems. We release all code and datasets at https://github.com/taco-group/AutoTrust.
