The recent development of Sora has ushered in a new era of text-to-video (T2V) generation, and with it comes rising concern about security risks. Generated videos may contain illegal or unethical content, yet a comprehensive quantitative understanding of their safety is lacking, which poses a challenge to the reliability and practical deployment of these models. Previous evaluations primarily focus on the quality of video generation. While some evaluations of text-to-image models have considered safety, they cover fewer aspects and do not address the unique temporal risk inherent in video generation. To bridge this research gap, we introduce T2VSafetyBench, a new benchmark designed for conducting safety-critical assessments of text-to-video models. We define 12 critical aspects of video generation safety and construct a malicious prompt dataset using LLMs and jailbreaking prompt attacks. Based on our evaluation results, we draw several important findings, including: 1) no single model excels in all aspects, with different models showing different strengths; 2) the correlation between GPT-4 assessments and manual reviews is generally high; 3) there is a trade-off between the usability and safety of text-to-video generative models. The last finding indicates that as the field of video generation rapidly advances, safety risks are set to surge, highlighting the urgency of prioritizing video safety. We hope that T2VSafetyBench can provide insights for better understanding the safety of video generation in the era of generative AI.