Stride determines the distance between adjacent filter positions as the filter moves across the input. With a fixed stride, important information in the image may not be captured and therefore cannot contribute to classification. Previous research addressed this with DiffStride, a strided convolution method that learns its own stride value. Max pooling downsampling, in turn, suffers from severe quantization and imposes a constraining lower bound on preserved information. Spectral pooling relaxes this lower bound by truncating the representation in the frequency domain. This research proposes a CNN model that combines learnable-stride downsampling, trained by backpropagation, with spectral pooling. Together, DiffStride and spectral pooling are expected to preserve most of the information contained in the image. We compare the Hybrid Method, a combined implementation of spectral pooling and DiffStride, against the Baseline Method, a DiffStride implementation on ResNet 18. The Hybrid Method improves accuracy over the DiffStride baseline by 0.0094. This shows that the Hybrid Method can preserve most of the information by truncating the representation in the frequency domain while learning the stride through backpropagation.
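To make the frequency-domain truncation described in the abstract concrete, the following is a minimal NumPy sketch of spectral pooling on a single 2-D feature map. It is an illustration only and not the implementation used in this research: the function name `spectral_pool` and the parameters `out_h` and `out_w` are hypothetical, and the output size is fixed by hand, whereas DiffStride would learn the size of the retained frequency window through backpropagation.

```python
import numpy as np

def spectral_pool(x, out_h, out_w):
    """Downsample a real 2-D array by keeping only its lowest frequencies.

    Hypothetical helper for illustration; assumes out_h <= x.shape[0]
    and out_w <= x.shape[1].
    """
    h, w = x.shape
    # Move the zero-frequency (DC) term to the centre of the spectrum.
    spec = np.fft.fftshift(np.fft.fft2(x))
    # Crop a centred window so the DC term lands at index out_h // 2, out_w // 2.
    top = h // 2 - out_h // 2
    left = w // 2 - out_w // 2
    cropped = spec[top:top + out_h, left:left + out_w]
    # Undo the shift and return to the spatial domain.
    out = np.fft.ifft2(np.fft.ifftshift(cropped))
    # Taking the real part is a simplification: for even output sizes the
    # cropped spectrum is not exactly conjugate-symmetric.
    # The scale factor keeps the mean of the output equal to the mean of x.
    return out.real * (out_h * out_w) / (h * w)

# Usage: reduce an 8x8 map to 4x4 (sizes chosen arbitrarily for the example).
feature_map = np.random.default_rng(0).normal(size=(8, 8))
pooled = spectral_pool(feature_map, 4, 4)
```

Unlike max pooling, which can only shrink the map by integer factors, the cropped window here can take any size up to the input size, which is what allows a finer-grained trade-off between resolution and preserved information.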