Data Augmentation (DA) has become an essential tool for improving the robustness and generalization of modern machine learning models. However, the effectiveness of a DA strategy hinges on a careful choice of its parameters, a daunting task traditionally left to trial and error or to expensive optimization against validation performance. In this paper, we address these limitations by proposing a novel framework for optimizing DA. In particular, we take a probabilistic view of DA, under which augmentation parameters are interpreted as model (hyper)parameters and optimizing the marginal likelihood with respect to them becomes a Bayesian model selection problem. Because the marginal likelihood is intractable, we derive a tractable evidence lower bound (ELBO), which allows us to optimize augmentation parameters jointly with model parameters. We provide extensive theoretical results on variational approximation quality, generalization guarantees, invariance properties, and connections to empirical Bayes. Through experiments on computer vision and NLP tasks, we show that our approach improves calibration and yields robust performance compared to fixed or no augmentation. Our work provides a rigorous foundation for optimizing DA through Bayesian principles, with significant potential for robust machine learning.
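To make the Bayesian model selection view concrete, the following is a minimal sketch of the kind of objective involved, assuming a standard variational setup in which augmented inputs $\hat{x}$ are drawn from a distribution $p(\hat{x} \mid x, \eta)$ with augmentation parameters $\eta$, the model has weights $w$ with prior $p(w)$, and $q_\phi(w)$ is a variational posterior; the notation is illustrative and the paper's exact formulation may differ:

\[
\log p(\mathcal{D} \mid \eta) \;\geq\; \mathcal{L}(\phi, \eta)
\;=\; \sum_{n=1}^{N} \mathbb{E}_{q_\phi(w)}\, \mathbb{E}_{p(\hat{x}_n \mid x_n, \eta)}\!\left[ \log p(y_n \mid \hat{x}_n, w) \right]
\;-\; \mathrm{KL}\!\left( q_\phi(w) \,\Vert\, p(w) \right).
\]

Under these assumptions, maximizing $\mathcal{L}(\phi, \eta)$ jointly over the variational parameters $\phi$ and the augmentation parameters $\eta$ trains the model while performing approximate Bayesian model selection over augmentation strategies, in the spirit of empirical Bayes.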