Feature visualization has gained substantial popularity, particularly after the influential work by Olah et al. in 2017, which established it as a crucial tool for explainability. However, its widespread adoption has been limited by a reliance on tricks to generate interpretable images and by the corresponding difficulty of scaling it to deeper neural networks. Here, we describe MACO, a simple approach to address these shortcomings. The main idea is to generate images by optimizing the phase spectrum while keeping the magnitude constant, which ensures that the generated explanations lie in the space of natural images. Our approach yields significantly better results (both qualitatively and quantitatively) and unlocks efficient and interpretable feature visualizations for large state-of-the-art neural networks. We also show that our approach provides an attribution mechanism that allows us to augment feature visualizations with spatial importance. We validate our method on a novel benchmark for comparing feature visualization methods, and release MACO visualizations for all classes of the ImageNet dataset at https://serre-lab.github.io/Lens/. Overall, our approach unlocks, for the first time, feature visualizations for large, state-of-the-art deep neural networks without resorting to any parametric prior image model.
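To make the main idea concrete, the following is a minimal NumPy sketch of phase-only optimization under a fixed magnitude spectrum. It is an illustration under stated assumptions, not the authors' implementation: the 1/f magnitude is a stand-in chosen here, the objective is a toy linear score (a weighted sum of pixels) rather than a network activation, and the gradient is written analytically where a real setup would likely use automatic differentiation. The names `fixed_magnitude`, `phase_to_image`, and `optimize_phase` are hypothetical.

```python
import numpy as np


def fixed_magnitude(size):
    """Assumed 1/f magnitude spectrum; a stand-in, not taken from the paper."""
    f = np.fft.fftfreq(size)
    radius = np.sqrt(f[:, None] ** 2 + f[None, :] ** 2)
    # Clamp the radius so the DC term stays finite.
    return 1.0 / np.maximum(radius, 1.0 / size)


def phase_to_image(magnitude, phase):
    """Combine the fixed magnitude with a phase and return to pixel space."""
    return np.fft.ifft2(magnitude * np.exp(1j * phase)).real


def optimize_phase(weights, magnitude, steps=100, lr=0.05, seed=0):
    """Gradient ascent on the phase only, for the toy score sum(weights * image)."""
    rng = np.random.default_rng(seed)
    # Start from the phase of a real noise image so that the initial
    # spectrum is Hermitian-symmetric and the image is real.
    phase = np.angle(np.fft.fft2(rng.standard_normal(weights.shape)))
    w_hat = np.fft.fft2(weights)
    n = weights.size
    for _ in range(steps):
        spectrum = magnitude * np.exp(1j * phase)
        # d/dphase of (1/n) * Re(sum(spectrum * conj(w_hat)))
        grad = -np.imag(spectrum * np.conj(w_hat)) / n
        phase = phase + lr * grad  # the magnitude is never updated
    return phase


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    weights = rng.standard_normal((16, 16))  # toy "feature" to maximize
    magnitude = fixed_magnitude(16)
    phase = optimize_phase(weights, magnitude)
    image = phase_to_image(magnitude, phase)
    print("score:", np.sum(weights * image))
    print("magnitude preserved:",
          np.allclose(np.abs(np.fft.fft2(image)), magnitude, atol=1e-6))
```

Because only the phase is updated, the image's Fourier magnitude stays equal to the chosen spectrum throughout the optimization. This is the constraint the abstract credits with keeping the generated images in the space of natural images.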