This paper advances the theory and practice of Domain Generalization (DG) in machine learning. We consider the typical DG setting in which the hypothesis is the composition of a representation mapping and a labeling function. Within this setting, most popular DG methods jointly learn the representation mapping and the labeling function by minimizing a well-known upper bound on the classification risk in the unseen domain. In practice, however, methods based on this upper bound ignore a term that cannot be directly optimized because it depends on both the representation mapping and the unknown optimal labeling function in the unseen domain. To bridge this gap between theory and practice, we introduce a new upper bound that contains no terms with this dual dependence, yielding a fully optimizable upper bound on the risk in the unseen domain. Our derivation leverages classical and recent transport inequalities that link optimal transport metrics with information-theoretic measures. Compared to previous bounds, our bound introduces two new terms: (i) a Wasserstein-2 barycenter term that aligns distributions between domains, and (ii) a reconstruction loss term that measures how well the representation reconstructs the original data. Based on this new upper bound, we propose a novel DG algorithm, the Wasserstein Barycenter Auto-Encoder (WBAE), that simultaneously minimizes the classification loss, the barycenter loss, and the reconstruction loss. Numerical results demonstrate that the proposed method outperforms current state-of-the-art DG algorithms on several datasets.
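To make the three-term objective concrete, the following is a minimal illustrative sketch, not the WBAE implementation. It assumes one-dimensional representations and the same number of equally weighted samples in every domain, a special case in which the Wasserstein-2 barycenter is obtained by averaging the sorted samples. The function names, the weights `alpha` and `beta`, and the weighted-sum form of the objective are assumptions introduced here for illustration; the abstract does not specify them.

```python
from typing import List, Sequence


def w2_squared_1d(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Wasserstein-2 distance between two 1-D empirical
    distributions with equally many, equally weighted samples."""
    if len(x) != len(y):
        raise ValueError("this sketch assumes equal sample counts")
    xs, ys = sorted(x), sorted(y)
    # In 1-D the optimal coupling matches the i-th smallest points.
    return sum((a - b) ** 2 for a, b in zip(xs, ys)) / len(xs)


def w2_barycenter_1d(domains: Sequence[Sequence[float]]) -> List[float]:
    """Uniform-weight Wasserstein-2 barycenter in 1-D, computed by
    averaging the sorted samples (i.e. the quantile functions)."""
    sorted_domains = [sorted(d) for d in domains]
    n = len(sorted_domains[0])
    if any(len(d) != n for d in sorted_domains):
        raise ValueError("this sketch assumes equal sample counts")
    return [sum(col) / len(sorted_domains) for col in zip(*sorted_domains)]


def barycenter_loss(domains: Sequence[Sequence[float]]) -> float:
    """Average squared W2 distance from each domain to the barycenter."""
    bary = w2_barycenter_1d(domains)
    return sum(w2_squared_1d(d, bary) for d in domains) / len(domains)


def total_objective(cls_loss: float, bary_loss: float, rec_loss: float,
                    alpha: float = 1.0, beta: float = 1.0) -> float:
    """Hypothetical weighted sum of the three losses named in the text;
    alpha and beta are illustrative trade-off weights."""
    return cls_loss + alpha * bary_loss + beta * rec_loss


if __name__ == "__main__":
    # Two toy source domains of 1-D representations.
    reps = [[0.0, 2.0], [2.0, 4.0]]
    print(w2_barycenter_1d(reps))
    print(barycenter_loss(reps))
    print(total_objective(0.5, barycenter_loss(reps), 0.25, alpha=2.0, beta=4.0))
```

In higher dimensions the barycenter has no such closed form and would typically be approximated numerically; the sketch only shows how a barycenter term penalizes misalignment between domain-wise representation distributions and how it might be combined with the classification and reconstruction terms.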