Automatically discovering failures in vision models under real-world settings remains an open challenge. This work demonstrates how off-the-shelf, large-scale, image-to-text and text-to-image models, trained on vast amounts of data, can be leveraged to automatically find such failures. In essence, a conditional text-to-image generative model is used to generate large amounts of synthetic, yet realistic, inputs given a ground-truth label. Misclassified inputs are clustered and a captioning model is used to describe each cluster. Each cluster's description is used in turn to generate more inputs and assess whether specific clusters induce more failures than expected. We use this pipeline to demonstrate that we can effectively interrogate classifiers trained on ImageNet to find specific failure cases and discover spurious correlations. We also show that we can scale the approach to generate adversarial datasets targeting specific classifier architectures. This work serves as a proof-of-concept demonstrating the utility of large-scale generative models to automatically discover bugs in vision models in an open-ended manner. We also describe a number of limitations and pitfalls related to this approach.
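The loop described above (generate, classify, cluster the failures, caption each cluster, regenerate from the caption, compare failure rates) can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: an "image" is a plain dict, `toy_generate`, `toy_classify`, `toy_cluster`, and `toy_caption` are hypothetical stand-ins for the text-to-image model, the classifier under test, the clustering step, and the captioning model, and the `margin` threshold is an arbitrary choice for deciding that a cluster fails "more than expected".

```python
import random
from collections import defaultdict

BACKGROUNDS = ["grass", "snow", "water"]
_rng = random.Random(0)  # fixed seed so the toy run is repeatable


def toy_generate(prompt, n):
    """Hypothetical stand-in for a text-to-image model; an 'image' is a dict."""
    # If the prompt names a background, every sample gets it; otherwise random.
    forced = next((b for b in BACKGROUNDS if b in prompt), None)
    return [{"background": forced or _rng.choice(BACKGROUNDS)} for _ in range(n)]


def toy_classify(image):
    """Hypothetical classifier with a planted spurious correlation on snow."""
    return "wolf" if image["background"] == "snow" else "dog"


def toy_cluster(images):
    """Stand-in for clustering in an embedding space: group by background."""
    groups = defaultdict(list)
    for img in images:
        groups[img["background"]].append(img)
    return list(groups.values())


def toy_caption(images):
    """Stand-in for a captioning model that describes one cluster."""
    return f"in {images[0]['background']}"


def failure_rate(images, label, classify):
    """Fraction of images whose predicted label differs from the true label."""
    wrong = sum(1 for img in images if classify(img) != label)
    return wrong / len(images)


def find_failure_clusters(label, generate, classify, cluster, caption,
                          n_initial=300, n_per_cluster=50, margin=0.1):
    """Return the baseline failure rate and the cluster descriptions whose
    regenerated images fail more often than baseline + margin."""
    base_prompt = f"a photo of a {label}"
    images = generate(base_prompt, n_initial)

    # Step 1: keep only the misclassified inputs.
    failures = [img for img in images if classify(img) != label]
    baseline = len(failures) / len(images)

    findings = []
    # Step 2: cluster the failures and describe each cluster.
    for members in cluster(failures):
        description = caption(members)
        # Step 3: generate fresh inputs from the description and re-measure.
        fresh = generate(f"{base_prompt}, {description}", n_per_cluster)
        rate = failure_rate(fresh, label, classify)
        if rate > baseline + margin:
            findings.append((description, rate))

    # Highest failure rate first.
    return baseline, sorted(findings, key=lambda f: -f[1])


baseline, findings = find_failure_clusters(
    "dog", toy_generate, toy_classify, toy_cluster, toy_caption
)
print(f"baseline failure rate: {baseline:.2f}")
print(findings)
```

In this toy setup the classifier is wrong exactly when the background is snow, so the only description flagged is "in snow", with a failure rate of 1.0 on the regenerated inputs. With real models each stand-in would be replaced by an actual generator, classifier, embedding-based clustering step, and captioner, and the comparison against the baseline would need a proper statistical test rather than a fixed margin.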