Divergence measures play a central role in machine learning and are becoming increasingly essential in deep learning. However, valid and computationally efficient divergence measures for multiple (more than two) distributions have rarely been investigated. This gap matters most in areas where handling multiple distributions simultaneously is unavoidable, such as clustering, multi-source domain adaptation or generalization, and multi-view learning. Although averaging the pairwise distances between every two distributions is a common way to quantify the total divergence among multiple distributions, this approach is not straightforward and is computationally expensive. In this study, we introduce a new divergence measure for multiple distributions, named the generalized Cauchy-Schwarz divergence (GCSD), which is inspired by the classic Cauchy-Schwarz divergence. Additionally, we provide a closed-form sample estimator based on kernel density estimation, making the measure convenient and straightforward to use in various machine-learning applications. Finally, we apply the proposed GCSD to two challenging machine-learning tasks: deep learning-based clustering and multi-source domain adaptation. The experimental results show strong performance of GCSD on both tasks, highlighting its potential for machine-learning applications that require quantifying the divergence among multiple distributions.
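To make the pairwise baseline mentioned above concrete, the following is a minimal sketch of the classic two-distribution Cauchy-Schwarz divergence, estimated from samples with a Gaussian kernel and then averaged over all pairs. It is not the proposed GCSD, whose formula is not stated in this abstract. The function names, the shared bandwidth `sigma`, and the choice to absorb the kernel-convolution factor into `sigma` are assumptions made for illustration.

```python
import numpy as np


def gaussian_gram(x, y, sigma):
    """Gaussian kernel matrix between the rows of x (n, d) and y (m, d)."""
    sq_dist = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def cs_divergence(x, y, sigma=1.0):
    """Plug-in estimate of the classic Cauchy-Schwarz divergence.

    D_CS(p, q) = log(int p^2) + log(int q^2) - 2 * log(int p * q),
    with each integral replaced by the mean of a Gaussian Gram matrix.
    Normalizing constants cancel because all three terms share one bandwidth
    (an assumption of this sketch).
    """
    k_xx = gaussian_gram(x, x, sigma).mean()
    k_yy = gaussian_gram(y, y, sigma).mean()
    k_xy = gaussian_gram(x, y, sigma).mean()
    return np.log(k_xx) + np.log(k_yy) - 2.0 * np.log(k_xy)


def mean_pairwise_cs(samples, sigma=1.0):
    """Average the two-sample estimate over all unordered pairs of sample sets."""
    m = len(samples)
    total, pairs = 0.0, 0
    for i in range(m):
        for j in range(i + 1, m):
            total += cs_divergence(samples[i], samples[j], sigma)
            pairs += 1
    return total / pairs
```

For m distributions this baseline evaluates m(m-1)/2 pairwise terms, each requiring its own cross Gram matrix, which illustrates the computational cost that motivates a single divergence defined directly over all distributions.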