T2VSafetyBench: Evaluating the Safety of Text-to-Video Generative Models

The recent development of Sora has ushered in a new era of text-to-video (T2V) generation, and with it rising concern about security risks. Generated videos may contain illegal or unethical content, yet there is no comprehensive quantitative understanding of their safety, which challenges the reliability and practical deployment of these models. Previous evaluations focus primarily on the quality of video generation. Some evaluations of text-to-image models have considered safety, but they cover fewer aspects and do not address the temporal risks unique to video generation. To bridge this research gap, we introduce T2VSafetyBench, a new benchmark designed for safety-critical assessment of text-to-video models. We define 12 critical aspects of video generation safety and construct a malicious prompt dataset using LLMs and jailbreaking prompt attacks. Our evaluation yields several important findings, including: 1) no single model excels in all aspects, and different models show different strengths; 2) the correlation between GPT-4 assessments and manual reviews is generally high; 3) there is a trade-off between the usability and safety of text-to-video generative models. This indicates that as the field of video generation rapidly advances, safety risks are set to surge, highlighting the urgency of prioritizing video safety. We hope that T2VSafetyBench provides insights for better understanding the safety of video generation in the era of generative AI.

