Moderating Illicit Online Image Promotion for Unsafe User-Generated Content Games Using Large Vision-Language Models

Online user-generated content games (UGCGs) are increasingly popular among children and adolescents for social interaction and creative online entertainment. However, they pose a heightened risk of exposure to explicit content, raising growing concerns about the online safety of young users. Despite these concerns, few studies have addressed illicit image-based promotion of unsafe UGCGs on social media, which can inadvertently attract young users. The problem is hard for two reasons: comprehensive training data for UGCG images is difficult to obtain, and these images differ in nature from traditional unsafe content. In this work, we take the first step towards studying the threat of illicit promotions of unsafe UGCGs. We collect a real-world dataset of 2,924 images that display diverse sexually explicit and violent content used by game creators to promote their UGCGs. Our in-depth studies yield a new understanding of this problem and show the urgent need for automatically flagging illicit UGCG promotions. We additionally build a system, UGCG-Guard, designed to help social media platforms identify images used for illicit UGCG promotions. The system leverages recently introduced large vision-language models (VLMs) and employs a novel conditional prompting strategy for zero-shot domain adaptation, along with chain-of-thought (CoT) reasoning for contextual identification. UGCG-Guard achieves an accuracy of 94% in detecting images used for the illicit promotion of such games in real-world scenarios.
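
As a rough illustration of how conditional prompting and step-by-step reasoning might be combined, the sketch below first asks a domain question and only then asks content questions conditioned on that domain. It is a minimal sketch under stated assumptions, not the UGCG-Guard implementation: the prompt texts, the `moderate_image` and `last_line_is_yes` helpers, and the `ask_vlm(image_path, prompt)` callable are all hypothetical, and the abstract does not specify the actual prompts or model interface.

```python
from typing import Callable, Dict, List

# Hypothetical prompts for illustration only; the abstract does not give the real ones.
DOMAIN_QUESTION = (
    "Is this image a screenshot or render from a user-generated content game? "
    "Answer yes or no."
)
CONTENT_QUESTIONS: Dict[str, str] = {
    "sexual": (
        "Given that the characters are game avatars, do they depict sexually "
        "explicit content? Think step by step, then answer yes or no on the last line."
    ),
    "violent": (
        "Given that the characters are game avatars, do they depict graphic "
        "violence? Think step by step, then answer yes or no on the last line."
    ),
}


def last_line_is_yes(answer: str) -> bool:
    """Return True if the final non-empty line of the answer starts with 'yes'."""
    lines = [ln.strip().lower() for ln in answer.splitlines() if ln.strip()]
    return bool(lines) and lines[-1].startswith("yes")


def moderate_image(image_path: str, ask_vlm: Callable[[str, str], str]) -> dict:
    """Run a conditional prompt chain over one image.

    `ask_vlm` is an assumed interface: it takes an image path and a text
    prompt and returns the model's text answer.
    """
    trace: List[str] = []

    # Step 1: establish the domain condition before asking about content.
    domain_answer = ask_vlm(image_path, DOMAIN_QUESTION)
    trace.append(domain_answer)
    if not last_line_is_yes(domain_answer):
        return {"flagged": False, "categories": [], "trace": trace}

    # Step 2: ask content questions that are conditioned on the game domain.
    categories: List[str] = []
    for name, question in CONTENT_QUESTIONS.items():
        answer = ask_vlm(image_path, question)
        trace.append(answer)  # keep the reasoning text for later review
        if last_line_is_yes(answer):
            categories.append(name)

    return {"flagged": bool(categories), "categories": categories, "trace": trace}
```

In this sketch the model's intermediate answers are kept in `trace`, so a human moderator could inspect why an image was flagged; any real VLM client would need to be wrapped to match the assumed `ask_vlm` signature.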
