We introduce Deep Augmentation, an approach to implicit data augmentation that uses dropout or PCA to transform a targeted layer within a neural network, improving performance and generalization. We demonstrate Deep Augmentation through extensive experiments on contrastive learning tasks in NLP, computer vision, and graph learning. We observe substantial performance gains with Transformers, ResNets, and Graph Neural Networks as the underlying models in contrastive learning, but inverse effects on the corresponding supervised problems. Our analysis suggests that Deep Augmentation alleviates co-adaptation between layers, a problem that arises in self-supervised learning, where ground-truth labels are not available. We use this observation to formulate a method for selecting which layer to target; in particular, our experiments reveal that targeting deeper layers with Deep Augmentation outperforms augmenting the input data. The simple, network- and modality-agnostic nature of this approach enables its integration into various machine learning pipelines.
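To make the idea concrete, the following is a minimal sketch, not the authors' implementation: a plain NumPy MLP in which dropout is applied to the activations of one targeted hidden layer rather than to the input, so that two stochastic forward passes yield two augmented "views" of the same sample for a contrastive loss. All names (`forward_with_dropout`, `target_layer`) and the toy architecture are illustrative assumptions.

```python
import numpy as np

def forward_with_dropout(x, weights, target_layer, p=0.5, rng=None):
    """Sketch of dropout-based Deep Augmentation (illustrative, not the
    paper's code): run an MLP forward pass and apply dropout to the
    activations of the single targeted hidden layer."""
    rng = rng if rng is not None else np.random.default_rng()
    h = x
    last = len(weights) - 1
    for i, W in enumerate(weights):
        h = h @ W
        if i < last:
            h = np.maximum(h, 0.0)              # ReLU on hidden layers
        if i == target_layer:                   # Deep Augmentation step:
            mask = rng.random(h.shape) >= p     # drop units with prob. p
            h = h * mask / (1.0 - p)            # inverted-dropout scaling
    return h

# Two stochastic passes give two augmented views of the same batch,
# which a contrastive objective would then pull together.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 16)), rng.standard_normal((16, 4))]
x = rng.standard_normal((2, 8))
view_a = forward_with_dropout(x, weights, target_layer=0, rng=rng)
view_b = forward_with_dropout(x, weights, target_layer=0, rng=rng)
```

Setting `target_layer` to a deeper index moves the augmentation away from the input, which is the choice the abstract reports as outperforming input-space augmentation.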