Despite the superior capabilities of Multimodal Large Language Models (MLLMs) across diverse tasks, they still face significant trustworthiness challenges. Yet, current literature on assessing trustworthy MLLMs remains limited, lacking a holistic evaluation that offers thorough insights into future improvements. In this work, we establish MultiTrust, the first comprehensive and unified benchmark on the trustworthiness of MLLMs across five primary aspects: truthfulness, safety, robustness, fairness, and privacy. Our benchmark employs a rigorous evaluation strategy that addresses both multimodal risks and cross-modal impacts, encompassing 32 diverse tasks with self-curated datasets. Extensive experiments with 21 modern MLLMs reveal previously unexplored trustworthiness issues and risks, highlighting the complexities introduced by multimodality and underscoring the necessity for advanced methodologies to enhance reliability. For instance, typical proprietary models still struggle with perceiving visually confusing images and are vulnerable to multimodal jailbreaking and adversarial attacks; MLLMs are more inclined to disclose privacy in text and to reveal ideological and cultural biases even when paired with irrelevant images at inference, indicating that multimodality amplifies the internal risks of base LLMs. Additionally, we release a scalable toolbox for standardized trustworthiness research, aiming to facilitate future advancements in this important field. Code and resources are publicly available at: https://multi-trust.github.io/.