This paper explores in-context learning for image copy detection (ICD), i.e., prompting an ICD model to identify replicated images with new tampering patterns without additional training. The prompts (or contexts) come from a small set of image-replica pairs that reflect the new patterns and are used at inference time. Such in-context ICD has high practical value because it requires no fine-tuning and thus enables a fast reaction to newly emerging, unseen patterns. To accommodate the "seen $\rightarrow$ unseen" generalization scenario, we construct the first large-scale pattern dataset, named AnyPattern, which contains the largest number of tampering patterns ($90$ for training and $10$ for testing) among all existing datasets. We benchmark AnyPattern with popular ICD methods and reveal that existing methods barely generalize to novel tampering patterns. We further propose a simple in-context ICD method named ImageStacker. ImageStacker learns to select the most representative image-replica pairs and employs them as pattern prompts in a stacking manner (rather than the popular concatenation manner). Experimental results show that (1) training with our large-scale dataset substantially benefits pattern generalization ($+26.66\%$ $\mu AP$), (2) the proposed ImageStacker enables effective in-context ICD (a further $+16.75\%$ $\mu AP$), and (3) AnyPattern is what makes in-context ICD possible, i.e., without such a large-scale dataset, in-context learning does not emerge even with ImageStacker. The project (including the proposed dataset AnyPattern and the code for ImageStacker) is publicly available at https://anypattern.github.io under the MIT License.
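The contrast between the stacking and concatenation prompting styles can be illustrated with tensor shapes. This is a minimal, hypothetical sketch (not the authors' implementation): we assume "stacking" feeds the image-replica prompt and the query as one multi-channel input, while "concatenation" appends the prompt's patch tokens along the sequence dimension; all shapes and the patch size are illustrative assumptions.

```python
import torch

# Hypothetical illustration, not the paper's actual code:
# one pattern prompt = an (image, replica) pair demonstrating a tampering pattern.
B, C, H, W = 2, 3, 224, 224
query   = torch.randn(B, C, H, W)   # query images to be checked
image   = torch.randn(B, C, H, W)   # prompt: the original image
replica = torch.randn(B, C, H, W)   # prompt: its tampered replica

# Stacking manner (assumed): channel-wise stack -> a single (B, 9, H, W) input.
stacked = torch.cat([image, replica, query], dim=1)

def patchify(x, p=16):
    # (B, C, H, W) -> (B, num_patches, C*p*p) flat patch tokens, ViT-style.
    B_, C_ = x.size(0), x.size(1)
    patches = x.unfold(2, p, p).unfold(3, p, p)        # (B, C, H/p, W/p, p, p)
    return patches.permute(0, 2, 3, 1, 4, 5).reshape(B_, -1, C_ * p * p)

# Concatenation manner (for contrast): prompt tokens appended to query tokens,
# tripling the sequence length instead of the channel count.
concat = torch.cat([patchify(image), patchify(replica), patchify(query)], dim=1)

print(stacked.shape)  # torch.Size([2, 9, 224, 224])
print(concat.shape)   # torch.Size([2, 588, 768])  -> 3 * 196 patches of dim 768
```

Under these assumptions, stacking keeps the sequence length fixed (the backbone only needs a wider input stem), whereas concatenation triples the token count and hence the attention cost.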