In recent years, a large number of deep neural architectures have been developed for image classification. It remains unclear whether these models are similar or different, and which factors contribute to their similarities or differences. To address this question, we aim to design a quantitative and scalable similarity function between neural architectures. We utilize adversarial attack transferability, which carries information about input gradients and decision boundaries, two properties widely used to understand model behavior. Using the proposed similarity function, we conduct a large-scale analysis of 69 state-of-the-art ImageNet classifiers to answer the question. Moreover, model similarity lets us observe architecture-related phenomena: under specific conditions, model diversity can lead to better performance in model ensembles and knowledge distillation. Our results provide insights into why developing diverse neural architectures with distinct components is necessary.
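As a rough illustration of the idea of measuring similarity through attack transferability, the sketch below crafts one-step sign-gradient perturbations on one model and counts how often they also fool another. This is a toy under stated assumptions, not the similarity function proposed in the paper, whose exact form the abstract does not give: the linear binary classifiers, the `fgsm` step, the symmetric averaging in `transfer_similarity`, and all names and data are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Model = Tuple[Vector, float]  # hypothetical linear classifier: (weights, bias)
Sample = Tuple[Vector, int]   # (input, label in {-1, +1})


def predict(model: Model, x: Vector) -> int:
    """Return the predicted label (+1 or -1) of a linear classifier."""
    w, b = model
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1


def sign(v: float) -> int:
    return (v > 0) - (v < 0)


def fgsm(model: Model, x: Vector, y: int, eps: float) -> List[float]:
    """One-step sign-gradient attack on the source model.

    For a linear model the gradient of the margin y * (w.x + b) with
    respect to x is y * w, so stepping against its sign lowers the margin.
    """
    w, _ = model
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]


def fooling_rate(source: Model, target: Model,
                 data: Sequence[Sample], eps: float) -> float:
    """Fraction of examples crafted on `source` that `target` misclassifies."""
    fooled = sum(predict(target, fgsm(source, x, y, eps)) != y for x, y in data)
    return fooled / len(data)


def transfer_similarity(a: Model, b: Model,
                        data: Sequence[Sample], eps: float) -> float:
    """Assumed symmetric score: mean of the two directed fooling rates."""
    return 0.5 * (fooling_rate(a, b, data, eps) + fooling_rate(b, a, data, eps))


# Toy data that all three toy models classify correctly before the attack.
data = [([1.0, 1.0], 1), ([-1.0, -1.0], -1), ([0.5, 2.0], 1), ([-0.5, -2.0], -1)]
model_a = ([1.0, 0.0], 0.0)  # relies on the first feature
model_b = ([1.0, 0.1], 0.0)  # nearly the same decision boundary as model_a
model_c = ([0.0, 1.0], 0.0)  # relies on the second feature only

sim_ab = transfer_similarity(model_a, model_b, data, eps=1.5)
sim_ac = transfer_similarity(model_a, model_c, data, eps=1.5)
print(sim_ab, sim_ac)
```

In this toy setting, attacks transfer fully between `model_a` and `model_b`, whose decision boundaries nearly coincide, and not at all between `model_a` and `model_c`, which depend on different features. A score of this kind needs only forward passes and input gradients, which is one plausible reason a transferability-based measure can scale to many classifiers.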