Overfitting to the source domain is a common issue in gradient-based training of deep neural networks. To compensate for over-parameterization, numerous regularization techniques have been introduced, such as those based on dropout. While these methods achieve significant improvements on classical benchmarks such as ImageNet, their performance diminishes under domain shift in the test set, i.e., when the unseen data comes from a significantly different distribution. In this paper, we move away from the classical approach of constructing the dropout mask by Bernoulli sampling and propose to base the selection on the gradient signal-to-noise ratio (GSNR) of the network's parameters. Specifically, at each training step, parameters with high GSNR are discarded. Furthermore, we alleviate the burden of manually searching for the optimal dropout ratio by leveraging a meta-learning approach. We evaluate our method on standard domain generalization benchmarks and achieve competitive results on classification and face anti-spoofing problems.
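To make the selection rule concrete, the following is a minimal sketch rather than the method as implemented in the paper. It assumes the usual definition of GSNR for a parameter, namely the squared mean of its per-sample gradients divided by their variance, and it assumes a deterministic rule that drops the fraction of parameters with the highest GSNR. The function names `gsnr` and `gsnr_dropout_mask` and the fixed `drop_ratio` argument are hypothetical; the abstract states that the ratio is meta-learned, which is not shown here.

```python
import numpy as np


def gsnr(per_sample_grads, eps=1e-12):
    """GSNR per parameter, assuming GSNR = mean(grad)^2 / var(grad).

    per_sample_grads: array of shape (n_samples, n_params), one gradient
    vector per training sample (an assumed input layout).
    """
    mean = per_sample_grads.mean(axis=0)
    var = per_sample_grads.var(axis=0)
    # eps avoids division by zero when all samples agree exactly.
    return mean ** 2 / (var + eps)


def gsnr_dropout_mask(per_sample_grads, drop_ratio):
    """Boolean keep-mask that discards the highest-GSNR parameters.

    Assumption: a deterministic top-k rule with a fixed drop_ratio,
    used only for illustration.
    """
    scores = gsnr(per_sample_grads)
    n_drop = int(round(drop_ratio * scores.size))
    mask = np.ones(scores.size, dtype=bool)
    if n_drop > 0:
        # argsort is ascending, so the last n_drop indices have the highest GSNR.
        drop_idx = np.argsort(scores)[-n_drop:]
        mask[drop_idx] = False
    return mask


# Toy example: 3 samples, 4 parameters.
grads = np.array([
    [1.0,  1.0, 2.0,  0.1],
    [1.0, -1.0, 1.0, -0.2],
    [1.0,  0.0, 3.0,  0.3],
])
mask = gsnr_dropout_mask(grads, drop_ratio=0.5)
print(mask.tolist())  # [False, True, False, True]
```

In the toy example, the first and third parameters have consistent gradients across samples (high GSNR) and are therefore masked out, while the two parameters with noisy, sign-flipping gradients are kept.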