Image classification has improved as training techniques have developed. However, these techniques often require careful parameter tuning to balance the strength of regularization, which limits their potential benefits. In this paper, we propose a novel way to use regularization called Augmenting Sub-model (AugSub). AugSub consists of two models: the main model and the sub-model. While the main model employs conventional training recipes, the sub-model leverages the benefit of additional regularization. AugSub mitigates the adverse effects of this additional regularization through a relaxed loss function similar to a self-distillation loss. We demonstrate the effectiveness of AugSub with three drop techniques: dropout, drop-path, and random masking. Our analysis shows that all three AugSub variants improve performance, with the training loss converging even faster than in regular training. Among the three, AugMask, the random-masking variant, is identified as the most practical method due to its performance and cost efficiency. We further validate AugMask across diverse training recipes, including DeiT-III, ResNet, MAE fine-tuning, and Swin Transformer. The results show that AugMask consistently provides significant performance gains. AugSub thus provides a practical and effective solution for introducing additional regularization under various training recipes. Code is available at \url{https://github.com/naver-ai/augsub}.
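The two-branch objective described in the abstract can be illustrated with a minimal sketch. The abstract only states that the sub-model is trained with a relaxed loss "similar to self-distillation loss"; the concrete form used here (cross-entropy of the sub-model output against the main model's softmax output, averaged with the main model's ordinary cross-entropy using equal 0.5 weights) is an assumption for illustration, as are the function names `softmax`, `cross_entropy`, `one_hot`, and `augsub_loss`. It is written in plain Python on lists of logits so that it runs without a deep learning framework. In an actual training loop the soft target would presumably also be excluded from gradient computation, which this sketch does not model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target_probs):
    """Cross-entropy between a target distribution and softmax(logits)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_z) for z, t in zip(logits, target_probs))


def one_hot(label, num_classes):
    """Hard label as a probability vector."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def augsub_loss(main_logits, sub_logits, label):
    """Hypothetical AugSub-style objective for one sample.

    main_logits: output of the main model (conventional recipe).
    sub_logits:  output of the sub-model, i.e. the same network run with an
                 extra drop technique (dropout, drop-path, or random masking).
    The equal 0.5 weighting is an assumption, not taken from the abstract.
    """
    # Main branch: ordinary cross-entropy against the ground-truth label.
    main_loss = cross_entropy(main_logits, one_hot(label, len(main_logits)))
    # Sub branch: relaxed target, the main model's own prediction,
    # treated here as a fixed soft label.
    soft_target = softmax(main_logits)
    sub_loss = cross_entropy(sub_logits, soft_target)
    total = 0.5 * (main_loss + sub_loss)
    return total, main_loss, sub_loss


if __name__ == "__main__":
    total, main_loss, sub_loss = augsub_loss(
        main_logits=[2.0, 0.5, -1.0],
        sub_logits=[1.5, 0.8, -0.7],
        label=0,
    )
    print(f"main={main_loss:.4f} sub={sub_loss:.4f} total={total:.4f}")
```

Under this assumed form, the sub-model is never asked to fit the hard label directly. Its target is only as sharp as the main model's current prediction, which is one way to read the abstract's claim that the relaxed loss mitigates the adverse effects of strong regularization.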