Recent literature has shown that features obtained from supervised training of CNNs may over-emphasize texture rather than encode high-level information. In self-supervised learning in particular, texture as a low-level cue may provide shortcuts that prevent the network from learning higher-level representations. To address these problems, we propose to use classic methods based on anisotropic diffusion to augment training with texture-suppressed images. This simple method retains important edge information while suppressing texture. We empirically show that our method achieves state-of-the-art results on object detection and image classification across eight diverse datasets, in supervised or self-supervised learning settings, the latter including MoCoV2 and Jigsaw. Our method is particularly effective for transfer learning: we observe improved performance on five standard transfer learning datasets. The large improvements (up to 11.49%) on the Sketch-ImageNet and DTD datasets, together with additional visual analyses using saliency maps, suggest that our approach helps in learning representations that transfer better.
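The abstract does not say which anisotropic diffusion variant is used, so the following is only a minimal sketch of one classic choice, Perona-Malik diffusion with an exponential conduction function, applied to a single-channel image. The function name `perona_malik` and the parameters `n_iter`, `kappa`, and `step` are illustrative assumptions, not the authors' implementation or settings.

```python
import numpy as np


def perona_malik(image, n_iter=20, kappa=0.1, step=0.2):
    """Hypothetical helper: Perona-Malik diffusion on a 2-D float image.

    kappa sets the gradient magnitude treated as an edge; step should be
    at most 0.25 for this 4-neighbour explicit scheme to remain stable.
    """
    u = np.asarray(image, dtype=np.float64).copy()
    for _ in range(n_iter):
        # Replicate border pixels so differences across the boundary are zero.
        p = np.pad(u, 1, mode="edge")
        # Differences to the north, south, west, and east neighbours.
        diffs = (
            p[:-2, 1:-1] - u,
            p[2:, 1:-1] - u,
            p[1:-1, :-2] - u,
            p[1:-1, 2:] - u,
        )
        # Conduction g(d) = exp(-(d / kappa)^2): close to 1 for small
        # differences (texture, noise), close to 0 across strong edges.
        flux = sum(np.exp(-((d / kappa) ** 2)) * d for d in diffs)
        u += step * flux
    return u
```

Under these assumptions, a texture-suppressed training image could be produced by calling `perona_malik` on each channel of an image scaled to [0, 1]. Small intensity variations below `kappa` are smoothed away, while differences well above `kappa` are largely preserved as edges.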