In this paper, we propose "GridMask", a novel data augmentation method. It uses information removal to achieve state-of-the-art results on a variety of computer vision tasks. We first analyze the requirements of information dropping. We then show the limitations of existing information dropping algorithms and propose our structured method, which is simple yet very effective. It is based on deleting regions of the input image. Our extensive experiments show that our method outperforms the latest AutoAugment, which is far more computationally expensive because it uses reinforcement learning to find the best policies. On ImageNet for recognition, COCO2017 for object detection, and Cityscapes for semantic segmentation, our method notably improves performance over the baselines. These experiments demonstrate the effectiveness and generality of the new method.
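To make the idea of structured region deletion concrete, the following is a minimal sketch of a grid-shaped binary mask applied to an image. It is an illustration under stated assumptions, not the authors' implementation: the function names `grid_mask` and `apply_mask` and the parameters `unit`, `keep_ratio`, `offset_y`, and `offset_x` are hypothetical, and the image is a plain nested list rather than a tensor.

```python
def grid_mask(height, width, unit=4, keep_ratio=0.5, offset_y=0, offset_x=0):
    """Return a height x width binary mask: 1 keeps a pixel, 0 drops it.

    Assumption (illustrative only): the image is tiled with unit x unit
    cells, and a square of side `drop` is deleted in each cell.
    """
    if unit <= 0:
        raise ValueError("unit must be positive")
    if not 0.0 <= keep_ratio <= 1.0:
        raise ValueError("keep_ratio must lie in [0, 1]")
    # Side length of the deleted square inside each grid cell.
    drop = unit - int(round(unit * keep_ratio))
    mask = []
    for y in range(height):
        row = []
        for x in range(width):
            in_drop_rows = (y - offset_y) % unit < drop
            in_drop_cols = (x - offset_x) % unit < drop
            # A pixel is removed only where a dropped row band and a
            # dropped column band intersect, giving evenly spaced squares.
            row.append(0 if (in_drop_rows and in_drop_cols) else 1)
        mask.append(row)
    return mask


def apply_mask(image, mask):
    """Multiply the image by the mask element-wise (single channel)."""
    return [
        [pixel * keep for pixel, keep in zip(image_row, mask_row)]
        for image_row, mask_row in zip(image, mask)
    ]


# Toy 8x8 single-channel image with constant intensity 9.
image = [[9] * 8 for _ in range(8)]
mask = grid_mask(8, 8, unit=4, keep_ratio=0.5)
masked = apply_mask(image, mask)
```

In this sketch the deleted squares are evenly spaced, so no single large region is wiped out and no large region is left untouched. In practice the offsets and cell size would typically be randomized per image; that randomization is omitted here for brevity.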