Recent text-to-image (T2I) models have had great success, and many benchmarks have been proposed to evaluate their performance and safety. However, these benchmarks consider only explicit prompts and neglect implicit prompts, which hint at a target without explicitly mentioning it. Such prompts may evade safety constraints and pose potential threats to applications of these models. This position paper highlights how current T2I models handle implicit prompts. We present a benchmark named ImplicitBench and investigate the performance and impacts of implicit prompts on popular T2I models. Specifically, we design and collect more than 2,000 implicit prompts covering three aspects: General Symbols, Celebrity Privacy, and Not-Safe-For-Work (NSFW) Issues, and evaluate the capabilities of six well-known T2I models under these implicit prompts. Experimental results show that (1) T2I models are able to accurately create various target symbols indicated by implicit prompts; (2) implicit prompts bring potential risks of privacy leakage for T2I models; and (3) NSFW constraints in most of the evaluated T2I models can be bypassed with implicit prompts. We call for increased attention to the potential and risks of implicit prompts in the T2I community and for further investigation into their capabilities and impacts, advocating a balanced approach that harnesses their benefits while mitigating their risks.