Dropout is a regularization technique that is widely used in fully connected layers but is less effective in convolutional layers. More structured forms of dropout have therefore been proposed to regularize convolutional networks. A disadvantage of these methods is that the randomness they introduce causes inconsistency between training and inference. In this paper, we apply a mutual learning training strategy to convolutional layer regularization, named R-Block, which forces the outputs of two generated, difference-maximizing sub-models to be consistent with each other. Concretely, for each sample in the training dataset, R-Block minimizes the losses between the output distributions of two sub-models with different drop regions. We design two approaches to construct such sub-models. Our experiments demonstrate that R-Block achieves better performance than existing structured dropout variants. We also demonstrate that our approaches to constructing sub-models outperform others.
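
To make the consistency idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes the loss between output distributions is a symmetric KL divergence and that it is added to a per-sub-model cross-entropy term with a hypothetical weight `alpha`; the abstract specifies neither. The mask construction shown (`complementary_block_masks`) is likewise a hypothetical illustration of "different drop regions" and is not one of the two approaches the paper designs.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def symmetric_kl(logits_a, logits_b):
    """Assumed consistency loss: the average of both KL directions."""
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


def cross_entropy(logits, label):
    """Standard cross-entropy for a single sample with an integer label."""
    return -math.log(softmax(logits)[label] + 1e-12)


def r_block_style_loss(logits_a, logits_b, label, alpha=1.0):
    """Task loss of both sub-models plus the weighted consistency term.

    Combining the terms this way, and the weight `alpha`, are assumptions.
    """
    task = cross_entropy(logits_a, label) + cross_entropy(logits_b, label)
    return task + alpha * symmetric_kl(logits_a, logits_b)


def complementary_block_masks(height, width, block_size, drop_prob, rng):
    """Build two keep-masks (1 = keep, 0 = drop) over one feature map.

    Hypothetical construction: the map is tiled into blocks, and every block
    chosen for dropping is dropped in exactly one of the two masks, so the
    two drop regions never overlap.
    """
    mask_a = [[1] * width for _ in range(height)]
    mask_b = [[1] * width for _ in range(height)]
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            if rng.random() >= drop_prob:
                continue  # this block is kept in both sub-models
            target = mask_a if rng.random() < 0.5 else mask_b
            for r in range(top, min(top + block_size, height)):
                for c in range(left, min(left + block_size, width)):
                    target[r][c] = 0
    return mask_a, mask_b


def apply_mask(feature_map, mask):
    """Zero out the dropped positions of a 2D feature map."""
    return [[v * m for v, m in zip(f_row, m_row)]
            for f_row, m_row in zip(feature_map, mask)]
```

In a training loop, the same input would be passed through the network twice, once per mask, and `r_block_style_loss` would be evaluated on the two resulting logit vectors. The consistency term is zero when both sub-models produce identical distributions and grows as they diverge.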