Unified Multimodal Models (UMMs) offer powerful cross-modality capabilities but introduce new safety risks not observed in single-task models. Despite their emergence, existing safety benchmarks remain fragmented across tasks and modalities, limiting comprehensive evaluation of complex system-level vulnerabilities. To address this gap, we introduce UniSAFE, the first comprehensive benchmark for system-level safety evaluation of UMMs across 7 I/O modality combinations, spanning conventional tasks and novel multimodal-context image-generation settings. UniSAFE is built with a shared-target design that projects common risk scenarios onto task-specific I/O configurations, enabling controlled cross-task comparison of safety failures. Using its 6,802 curated instances, we evaluate 15 state-of-the-art UMMs, both proprietary and open-source. Our results reveal critical vulnerabilities across current UMMs, including elevated safety-violation rates in multi-image composition and multi-turn settings, with image-output tasks consistently more vulnerable than text-output tasks. These findings highlight the need for stronger system-level safety alignment in UMMs. Our code and data are publicly available at https://github.com/segyulee/UniSAFE.