Modern image generators produce strikingly realistic images, where only artifacts like distorted hands or warped objects reveal their synthetic origin. Detecting these artifacts is essential: without detection, we cannot benchmark generators or train reward models to improve them. Current detectors fine-tune VLMs on tens of thousands of labeled images, but this is expensive to repeat whenever generators evolve or new artifact types emerge. We show that pretrained VLMs already encode the knowledge needed to detect artifacts: with the right scaffolding, this capability can be unlocked using only a few hundred labeled examples per artifact category. Our system, ArtifactLens, achieves state-of-the-art performance on five human-artifact benchmarks (the first evaluation across multiple such datasets) while requiring orders of magnitude less labeled data. The scaffolding consists of a multi-component architecture with in-context learning and text instruction optimization, with novel improvements to each. Our methods generalize to other artifact types (object morphology, animal anatomy, and entity interactions) and to the distinct task of AIGC detection.