Deep learning models are effective, yet brittle: even when carefully trained, their behavior tends to be hard to predict when confronted with out-of-distribution samples. In this work, we propose a simple yet effective solution to predict and describe, via natural language, the potential failure modes of computer vision models. Given a pretrained model and a set of samples, our aim is to find sentences that accurately describe the visual conditions in which the model underperforms. To study this important topic and foster future research on it, we formalize the problem of Language-Based Error Explainability (LBEE) and propose a set of metrics to evaluate and compare different methods for this task. Our proposed solutions operate in a joint vision-and-language embedding space and can characterize, through language descriptions, model failures caused, e.g., by objects unseen during training or by adverse visual conditions. We experiment with different tasks, such as classification in the presence of dataset bias and semantic segmentation in unseen environments, and show that the proposed methodology isolates nontrivial sentences associated with specific error causes. We hope our work will help practitioners better understand the behavior of their models, increasing overall safety and interpretability.
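To make the core idea concrete, the sketch below shows one simple way to rank candidate sentences against a model's failure cases in a joint vision-and-language embedding space. It is not the paper's exact LBEE method: CLIP stands in for the joint embedding space, and the hard/easy sample split, the candidate-sentence list, and the centroid-difference score are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's exact method): given images on
# which a pretrained task model performs poorly ("hard") vs. well ("easy"),
# rank candidate natural-language descriptions of visual conditions by how much
# closer they are, in CLIP space, to the hard samples than to the easy ones.
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def embed_images(images):
    """L2-normalized CLIP embeddings for a list of PIL images."""
    inputs = processor(images=images, return_tensors="pt")
    feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

@torch.no_grad()
def embed_sentences(sentences):
    """L2-normalized CLIP embeddings for a list of candidate descriptions."""
    inputs = processor(text=sentences, return_tensors="pt", padding=True)
    feats = model.get_text_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

def rank_failure_descriptions(hard_images, easy_images, sentences, top_k=5):
    """Score each sentence by its similarity to the centroid of the hard
    (high-error) samples minus its similarity to the easy (low-error) ones,
    and return the top-k sentences as candidate failure-mode descriptions."""
    hard_centroid = embed_images(hard_images).mean(dim=0)
    easy_centroid = embed_images(easy_images).mean(dim=0)
    text = embed_sentences(sentences)
    scores = text @ hard_centroid - text @ easy_centroid
    order = scores.argsort(descending=True)[:top_k]
    return [(sentences[i], scores[i].item()) for i in order]

# Usage (hypothetical data): `hard` and `easy` are lists of PIL images, and
# `candidates` holds descriptions such as "a photo taken at night" or
# "a street scene in heavy fog":
#   print(rank_failure_descriptions(hard, easy, candidates))
```

Sentences that score high under such a scheme are those that describe what the hard samples share and the easy samples lack, which is the kind of language-based error characterization the paper targets.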