DISCOVER: Making Vision Networks Interpretable via Competition and Dissection

Modern deep networks are highly complex, and their inferential outcomes are very hard to interpret. This is a serious obstacle to their transparent deployment in safety-critical or bias-aware applications. This work contributes to post-hoc interpretability, and specifically to Network Dissection. Our goal is to present a framework that makes it easier to discover the individual functionality of each neuron in a network trained on a vision task; discovery is performed by generating textual descriptions. To achieve this objective, we leverage: (i) recent advances in multimodal vision-text models and (ii) network layers founded upon the novel concept of stochastic local competition between linear units. In this setting, only a small subset of a layer's neurons is activated for a given input, leading to extremely high activation sparsity (as few as $\approx 4\%$ of neurons active). Crucially, our proposed method infers (sparse) neuron activation patterns that enable the neurons to activate on, and specialize to, inputs with specific characteristics, diversifying their individual functionality. This capacity greatly increases the potential of dissection processes: human-understandable descriptions are generated only for the very few active neurons, which facilitates direct investigation of the network's decision process. As we experimentally show, our approach: (i) yields vision networks that retain or improve classification performance, and (ii) realizes a principled framework for text-based description and examination of the generated neuronal representations.
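To make the idea of stochastic local competition concrete, the following is a minimal sketch and not the authors' implementation. It assumes that the linear units of a layer are grouped into blocks, that one winner per block is sampled from a softmax over the block's linear responses, and that all losing units output zero. The function name `stochastic_lwta`, the `temperature` parameter, and the block size of 25 are hypothetical choices made for illustration; the block size is picked only so that one winner per block gives 1/25 = 4% active units.

```python
import math
import random


def stochastic_lwta(x, weights, rng, temperature=1.0):
    """Toy forward pass of a stochastic local winner-takes-all layer.

    x        -- input vector (list of floats)
    weights  -- weights[b][u] is the weight vector of unit u in block b
    rng      -- random.Random instance used to sample the winners
    Returns a flat list with at most one non-zero entry per block.
    """
    out = []
    for block in weights:
        # Linear response of every competing unit in this block.
        h = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in block]
        # Softmax over the block (assumed here) gives winning probabilities.
        m = max(h)
        e = [math.exp((v - m) / temperature) for v in h]
        z = sum(e)
        probs = [v / z for v in e]
        # Sample a single winner; every other unit in the block is zeroed.
        winner = rng.choices(range(len(block)), weights=probs, k=1)[0]
        out.extend(v if u == winner else 0.0 for u, v in enumerate(h))
    return out


# Hypothetical sizes: 4 blocks of 25 competing units on an 8-dim input.
rng = random.Random(0)
n_in, n_blocks, units_per_block = 8, 4, 25
weights = [
    [[rng.gauss(0, 1) for _ in range(n_in)] for _ in range(units_per_block)]
    for _ in range(n_blocks)
]
x = [rng.gauss(0, 1) for _ in range(n_in)]
y = stochastic_lwta(x, weights, rng)
active = sum(1 for v in y if v != 0.0)
print(f"{active} of {len(y)} units active")
```

Because only the sampled winners carry signal, a dissection step under this assumption would need to describe at most one unit per block for any given input.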

