Quantifying the difference between probability distributions is a central problem in machine learning, yet estimating statistical divergences from empirical samples is challenging because the underlying distributions are unknown. This work proposes the representation Jensen-Shannon divergence (RJSD), a novel measure inspired by the classical Jensen-Shannon divergence. Our approach embeds data into a reproducing kernel Hilbert space (RKHS) and represents each distribution through its uncentered covariance operator. We then compute the Jensen-Shannon divergence between these operators, which yields a proper divergence measure between the probability distributions in the input space. We provide estimators based on kernel matrices and on empirical covariance matrices of Fourier features. Theoretical analysis shows that RJSD is a lower bound on the Jensen-Shannon divergence, enabling variational estimation. Moreover, RJSD is a higher-order extension of the maximum mean discrepancy (MMD) and therefore a more sensitive measure of distributional differences. Experiments on two-sample testing, distribution-shift detection, and unsupervised domain adaptation show that RJSD outperforms state-of-the-art techniques. Its versatility and effectiveness make RJSD a promising tool for machine learning research and applications.
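To make the construction concrete, below is a minimal NumPy sketch of the kernel-matrix estimator the abstract mentions. It relies on the fact that, for a kernel with κ(x, x) = 1, the normalized Gram matrix K/n has unit trace and shares its nonzero spectrum with the empirical uncentered covariance operator, so the operator Jensen-Shannon divergence reduces to von Neumann entropies of Gram matrices. This is an illustrative sketch under stated assumptions (RBF kernel, equal sample sizes, natural-log entropies), not the paper's reference implementation; the helper names rbf_gram, von_neumann_entropy, and rjsd_kernel are ours.

```python
import numpy as np

def rbf_gram(A, B, sigma=1.0):
    # RBF kernel with kappa(x, x) = 1, so K/n has unit trace.
    d2 = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T)
    return np.exp(-d2 / (2.0 * sigma**2))

def von_neumann_entropy(K, n):
    # S(rho) = -sum_i lambda_i log lambda_i, where rho = K/n shares its
    # nonzero spectrum with the empirical uncentered covariance operator.
    lam = np.linalg.eigvalsh(K / n)
    lam = lam[lam > 1e-12]             # drop numerical zeros
    return -np.sum(lam * np.log(lam))  # entropy in nats

def rjsd_kernel(X, Y, sigma=1.0):
    # D_JS(C_X, C_Y) = S((C_X + C_Y)/2) - (S(C_X) + S(C_Y)) / 2.
    # With n == m, the pooled Gram matrix scaled by 1/(2n) has the same
    # spectrum as the equal-weight mixture operator (C_X + C_Y)/2.
    n = len(X)
    assert len(Y) == n, "this sketch assumes equal sample sizes"
    Z = np.vstack([X, Y])
    S_mix = von_neumann_entropy(rbf_gram(Z, Z, sigma), 2 * n)
    S_x = von_neumann_entropy(rbf_gram(X, X, sigma), n)
    S_y = von_neumann_entropy(rbf_gram(Y, Y, sigma), n)
    return S_mix - 0.5 * (S_x + S_y)

# Usage: the divergence grows as the two samples separate.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(200, 2))
Y = rng.normal(2.0, 1.0, size=(200, 2))
print(rjsd_kernel(X, Y, sigma=1.0))
```

For large sample sizes, the Fourier-feature estimator mentioned in the abstract would instead form fixed-size empirical covariance matrices of random Fourier features, replacing the eigendecomposition of the n-by-n Gram matrix (cubic in n) with one of fixed dimension.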