This paper studies methods for estimating or approximating the total variation distance and the chi-squared divergence between probability measures on topological sample spaces from independent and identically distributed samples. We focus on the practical setting in which the sample space is homeomorphic to a subset of Euclidean space but the homeomorphism itself is unknown. The proposed methods are based on integral probability metrics whose witness functions lie in universal reproducing kernel Hilbert spaces (RKHSs). Each estimator pairs a learnable parametric function, which maps the sample space to Euclidean space, with a universal kernel defined on Euclidean space. This construction avoids the difficulty of building universal kernels directly on non-Euclidean spaces. We show that the estimators are asymptotically consistent, and we provide a detailed statistical analysis that informs their practical implementation.
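As a rough illustration of the ingredients named in the abstract, the sketch below computes a kernel integral probability metric between two samples after passing them through a map into Euclidean space. It is not the paper's estimator of the total variation distance or the chi-squared divergence: it is the standard biased (V-statistic) estimate of the squared maximum mean discrepancy, the integral probability metric whose witness functions lie in the unit ball of an RKHS. The function names, the Gaussian kernel, the fixed bandwidth, and the `feature_map` argument (a stand-in for the learnable parametric function, here left untrained) are all assumptions made for this example.

```python
import numpy as np


def gaussian_gram(a, b, bandwidth=1.0):
    """Gaussian (universal) kernel matrix between the rows of a and b."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, feature_map=lambda z: z, bandwidth=1.0):
    """Biased estimate of squared MMD between samples x and y.

    feature_map is a hypothetical placeholder for a learnable map from the
    sample space into Euclidean space; the identity is used by default.
    """
    fx, fy = feature_map(x), feature_map(y)
    k_xx = gaussian_gram(fx, fx, bandwidth)
    k_yy = gaussian_gram(fy, fy, bandwidth)
    k_xy = gaussian_gram(fx, fy, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


# Synthetic samples (assumed for illustration only).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(200, 2))
y = rng.normal(1.5, 1.0, size=(200, 2))

same = mmd2_biased(x, x)  # a sample compared with itself
diff = mmd2_biased(x, y)  # samples from two shifted distributions
```

In this sketch the map is fixed; in the setting described above it would be a parametric function whose parameters are learned, while the kernel stays a universal kernel on Euclidean space.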