Convolutional neural networks (CNNs) achieve strong performance, yet the nature and properties of their feature extraction remain poorly understood. In this paper, by analyzing the sensitivity of neural networks to frequencies and scales, we find that they not only exhibit low- and medium-frequency biases but also prefer different frequency bands for different classes, and that object scale influences the preferred bands. These observations lead to the hypothesis that neural networks must learn to extract features at various scales and frequencies. To corroborate this hypothesis, we propose a network architecture based on Gaussian derivatives, which extracts features by constructing a scale space and employing partial derivatives as local feature extraction operators to separate high-frequency information. This manually designed approach to multi-scale feature extraction allows our GSSDNets to achieve accuracy comparable to that of vanilla networks on various datasets.
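To make the idea of Gaussian-derivative feature extraction concrete, the following is a minimal sketch and not the GSSDNets implementation, whose details are not given in this abstract. It assumes a sampled, truncated Gaussian kernel and separable filtering with replicated borders. All function names (`gaussian_kernel_1d`, `gaussian_derivative`, `scale_space_features`) and the choice of scales are hypothetical and serve only as illustration. Pure Python is used to keep the sketch dependency-free; an array library would be the usual choice in practice.

```python
import math


def gaussian_kernel_1d(sigma, order=0, truncate=3.0):
    """Sampled 1D Gaussian (order 0) or its first/second derivative."""
    radius = max(1, int(truncate * sigma + 0.5))
    xs = range(-radius, radius + 1)
    g = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in xs]
    total = sum(g)
    g = [v / total for v in g]  # normalise so the smoothing kernel sums to 1
    if order == 0:
        return g
    if order == 1:
        # d/dx G(x) = -(x / sigma^2) G(x)
        return [-(x / sigma ** 2) * v for x, v in zip(xs, g)]
    if order == 2:
        # d^2/dx^2 G(x) = ((x^2 - sigma^2) / sigma^4) G(x)
        return [((x * x - sigma ** 2) / sigma ** 4) * v for x, v in zip(xs, g)]
    raise ValueError("order must be 0, 1, or 2")


def convolve_rows(image, kernel):
    """Convolve every row of a 2D list with a 1D kernel (replicated borders)."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for i in range(width):
            acc = 0.0
            for j, k in enumerate(kernel):
                # True convolution: the kernel offset (j - radius) is subtracted.
                src = min(max(i - (j - radius), 0), width - 1)
                acc += row[src] * k
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    return [list(col) for col in zip(*image)]


def gaussian_derivative(image, sigma, order_x=0, order_y=0):
    """Separable Gaussian derivative: filter along x (rows), then along y."""
    tmp = convolve_rows(image, gaussian_kernel_1d(sigma, order_x))
    return transpose(
        convolve_rows(transpose(tmp), gaussian_kernel_1d(sigma, order_y))
    )


def scale_space_features(image, sigmas=(1.0, 2.0)):
    """Smoothed image and first-order partial derivatives at each scale."""
    features = {}
    for sigma in sigmas:
        features[sigma] = {
            "smooth": gaussian_derivative(image, sigma),
            "dx": gaussian_derivative(image, sigma, order_x=1),
            "dy": gaussian_derivative(image, sigma, order_y=1),
        }
    return features
```

As a sanity check on this sketch, applying `scale_space_features` to a horizontal intensity ramp should give a `dx` response close to 1 and a `dy` response close to 0 away from the borders, at every scale. Larger values of `sigma` suppress more high-frequency content before differentiation, which is the sense in which the scale space separates frequency bands here.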