The Cauchy-Schwarz (CS) divergence was developed by Príncipe et al. in 2000. In this paper, we extend the classic CS divergence to quantify the closeness between two conditional distributions and show that the resulting conditional CS divergence can be estimated directly from given samples with a kernel density estimator. We illustrate the advantages of our conditional CS divergence (e.g., a rigorous faithfulness guarantee, lower computational complexity, higher statistical power, and greater flexibility across a wide range of applications) over previous proposals, such as the conditional KL divergence and the conditional maximum mean discrepancy. We also demonstrate the compelling performance of the conditional CS divergence in two machine learning tasks related to time series data and sequential inference, namely time series clustering and uncertainty-guided exploration for sequential decision making.
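To make the kernel-density idea concrete, the following is a minimal sketch of a sample-based estimator for the classic (unconditional) CS divergence, D_CS(p; q) = -log( (∫pq)² / (∫p² ∫q²) ), and not the conditional estimator developed in the paper. It assumes a Gaussian kernel with a single shared width `sigma`; the function names `gaussian_gram` and `cs_divergence` and the default width are illustrative choices, not taken from the paper.

```python
import numpy as np


def gaussian_gram(a, b, sigma):
    """Gaussian kernel matrix between the rows of a (m, d) and b (n, d)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def cs_divergence(x, y, sigma=1.0):
    """Plug-in estimate of the classic CS divergence from two samples.

    Assumption: the same Gaussian kernel width is used for both samples,
    so the kernel normalization constants cancel in the ratio.
    """
    k_xx = gaussian_gram(x, x, sigma).mean()  # estimates the integral of p^2
    k_yy = gaussian_gram(y, y, sigma).mean()  # estimates the integral of q^2
    k_xy = gaussian_gram(x, y, sigma).mean()  # estimates the integral of p*q
    return np.log(k_xx) + np.log(k_yy) - 2.0 * np.log(k_xy)


rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))
y = rng.normal(1.5, 1.0, size=(150, 2))
d_same = cs_divergence(x, x)
d_diff = cs_divergence(x, y)
```

Under these assumptions the estimate is symmetric in its arguments, equals zero when both samples are identical, and is non-negative by the Cauchy-Schwarz inequality. The kernel width strongly affects the value in practice and would normally be tuned or set by a heuristic.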