Divergence measures play a central role in machine learning and are increasingly essential in deep learning. However, valid and computationally efficient divergence measures for multiple (more than two) distributions have scarcely been investigated. This gap matters most in areas where handling several distributions at once is unavoidable, such as clustering, multi-source domain adaptation or generalization, and multi-view learning. A common way to quantify the total divergence among multiple distributions is to average the pairwise distances between every two of them, but this approach is not straightforward and requires significant computational resources. In this study, we introduce a new divergence measure for multiple distributions, the generalized Cauchy-Schwarz divergence (GCSD), which is inspired by the classic Cauchy-Schwarz divergence. We also provide a closed-form sample estimator based on kernel density estimation, making the measure convenient to use in various machine-learning applications. Finally, we apply the proposed GCSD to two challenging machine-learning tasks: deep learning-based clustering and multi-source domain adaptation. The experimental results show that GCSD performs strongly in both tasks, highlighting its potential in machine-learning areas that involve quantifying multiple distributions.
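To make the pairwise baseline mentioned above concrete, the following is a minimal sketch of it, not the GCSD estimator proposed in the paper. It assumes a Gaussian kernel with a fixed, user-chosen width `sigma` and uses the standard plug-in estimate of the classic Cauchy-Schwarz divergence, D_CS(p, q) = -log((∫pq)^2 / (∫p^2 ∫q^2)). The function names (`gaussian_gram_mean`, `cs_divergence`, `mean_pairwise_cs`) are hypothetical and chosen only for illustration.

```python
import itertools

import numpy as np


def gaussian_gram_mean(x, y, sigma):
    """Mean Gaussian kernel value over all pairs (x_i, y_j).

    x has shape (n, d), y has shape (m, d). The kernel is left
    unnormalized because the constant cancels in the divergence ratio.
    """
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.mean(np.exp(-sq_dists / (2.0 * sigma ** 2)))


def cs_divergence(x, y, sigma=1.0):
    """Plug-in estimate of the classic Cauchy-Schwarz divergence
    between the distributions that generated samples x and y."""
    cross = gaussian_gram_mean(x, y, sigma)   # estimates the cross term
    self_x = gaussian_gram_mean(x, x, sigma)  # estimates the p-only term
    self_y = gaussian_gram_mean(y, y, sigma)  # estimates the q-only term
    return -2.0 * np.log(cross) + np.log(self_x) + np.log(self_y)


def mean_pairwise_cs(samples, sigma=1.0):
    """Baseline total divergence: average over all unordered pairs."""
    pairs = list(itertools.combinations(samples, 2))
    return sum(cs_divergence(a, b, sigma) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three synthetic 2-D sample sets with different means (illustrative data).
    sets = [rng.normal(loc, 1.0, size=(100, 2)) for loc in (0.0, 3.0, -3.0)]
    print(mean_pairwise_cs(sets))
```

With m distributions the baseline evaluates m(m-1)/2 pairwise divergences, and each one builds three Gram matrices over the samples, which is the computational burden that motivates a single measure defined directly over all distributions.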