In this paper, we prove representation defects of a cascaded convolutional decoder network with respect to its capacity to represent different frequency components of an input sample. We apply the discrete Fourier transform to each channel of the feature map in an intermediate layer of the decoder network. We then extend the 2D circular convolution theorem to represent forward and backward propagation through convolutional layers in the frequency domain. Based on this, we prove three defects in representing feature spectra. First, we prove that the convolution operation, the zero-padding operation, and a set of other settings all make a convolutional decoder network more likely to weaken high-frequency components. Second, we prove that the upsampling operation generates a feature spectrum in which strong signals appear repeatedly at certain frequencies. Third, we prove that if the frequency components in the input sample and the frequency components in the target output for regression differ by a small shift, then the decoder usually cannot be learned effectively.
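As a small numerical illustration of two ingredients mentioned above, the following sketch checks the standard 2D circular convolution theorem and shows how upsampling can replicate a spectrum. It is not the construction used in our proofs. It assumes a single-channel feature map, a true circular convolution (no zero-padding, no kernel flipping conventions of a deep-learning framework), and zero-insertion upsampling as a simple stand-in for the upsampling operation. The names `circular_conv2d`, `circular_conv2d_fft`, and `zero_insert_upsample` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def circular_conv2d(x, k):
    """Direct 2D circular convolution of an (H, W) map with a small kernel."""
    out = np.zeros_like(x, dtype=float)
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            # After the roll, entry [m, n] holds x[(m - i) % H, (n - j) % W].
            out += k[i, j] * np.roll(x, shift=(i, j), axis=(0, 1))
    return out

def circular_conv2d_fft(x, k):
    """Same convolution computed as an element-wise product of 2D DFTs."""
    k_padded = np.zeros_like(x, dtype=float)
    k_padded[:k.shape[0], :k.shape[1]] = k
    return np.fft.ifft2(np.fft.fft2(x) * np.fft.fft2(k_padded)).real

def zero_insert_upsample(x, s=2):
    """Upsample by inserting s - 1 zeros between neighboring entries (assumed scheme)."""
    up = np.zeros((x.shape[0] * s, x.shape[1] * s), dtype=float)
    up[::s, ::s] = x
    return up

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))   # one channel of a hypothetical feature map
k = rng.standard_normal((3, 3))   # one hypothetical convolution kernel

# Convolution theorem: spatial circular convolution equals a product of spectra.
conv_matches = np.allclose(circular_conv2d(x, k), circular_conv2d_fft(x, k))

# Zero-insertion upsampling tiles the original spectrum s x s times, so every
# strong component of the input spectrum reappears at shifted frequencies.
spectrum_up = np.fft.fft2(zero_insert_upsample(x, 2))
tiled_matches = np.allclose(spectrum_up, np.tile(np.fft.fft2(x), (2, 2)))

print(conv_matches, tiled_matches)
```

Under these assumptions, the tiling check makes the second defect concrete: the low-frequency peak of the input spectrum recurs at several locations of the upsampled spectrum. Other upsampling schemes would attenuate these copies to different degrees, which this sketch does not model.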