Effective sample size is a standard summary of Markov chain Monte Carlo output, but it is usually attached to scalar or Euclidean summaries chosen by the analyst. For manifold-valued samples this choice is not canonical: coordinate-wise effective sample sizes can change under rotations, chart changes, or alternative embeddings of the same underlying path. We propose an intrinsic effective sample size based on kernel discrepancy. The proposed quantity is the number of independent draws that would yield the same expected squared kernel discrepancy between the empirical distribution and the target distribution. This gives an exact finite-sample risk interpretation, an asymptotic integrated-autocorrelation representation, and a coordinate-free diagnostic whenever the kernel respects the geometry of the state space. We establish invariance under transported kernels, operator and principal-direction interpretations, and consistency of a lag-window estimator under boundedness and absolute-regularity conditions. We also discuss valid kernel constructions on manifolds, emphasizing that geodesic Gaussian kernels are not generally positive definite on curved spaces. Sphere experiments illustrate rotation invariance and calibration of the proposed diagnostic against empirical distributional error.
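As a rough illustration of the idea, the following Python sketch computes a kernel-based effective sample size for a chain on the unit sphere with a Bartlett lag window. It is a minimal sketch under stated assumptions, not the estimator analyzed in the paper. The function names `kernel_ess` and `sphere_random_walk`, the kernel k(x, y) = exp(kappa * <x, y>), the Bartlett weights, and the default truncation lag sqrt(n) are all illustrative choices. The kernel is positive definite on the sphere because it is the restriction of a positive definite kernel on the ambient Euclidean space. It equals a Gaussian kernel in chordal distance up to a constant factor, so it is not the geodesic Gaussian kernel.

```python
import numpy as np


def kernel_ess(x, kappa=1.0, max_lag=None):
    """Lag-window kernel ESS sketch for unit vectors x of shape (n, d).

    Assumed kernel: k(x, y) = exp(kappa * <x, y>), which depends on the
    samples only through inner products and is therefore rotation invariant.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    if max_lag is None:
        max_lag = int(np.sqrt(n))  # illustrative truncation rule

    gram = np.exp(kappa * (x @ x.T))
    # V-statistic estimate of E k(X, X') for independent X, X' from the target.
    mean_cross = gram.mean()

    def gamma(t):
        # Estimated lag-t autocovariance of the kernel feature map:
        # mean of k(x_i, x_{i+t}) minus the estimate of E k(X, X').
        return np.diagonal(gram, offset=t).mean() - mean_cross

    g0 = gamma(0)
    lags = np.arange(1, max_lag + 1)
    weights = 1.0 - lags / (max_lag + 1.0)  # Bartlett lag window
    # Estimated integrated autocorrelation time in the kernel feature space.
    tau = 1.0 + 2.0 * sum(w * gamma(t) / g0 for w, t in zip(weights, lags))
    return n / tau


def sphere_random_walk(n, step=0.3, dim=3, rng=None):
    """Projected Gaussian random walk on the unit sphere (hypothetical test chain).

    The transition density depends only on the inner product of the current
    and next state, so the uniform distribution is stationary.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty((n, dim))
    cur = rng.normal(size=dim)
    cur /= np.linalg.norm(cur)
    for i in range(n):
        cur = cur + step * rng.normal(size=dim)
        cur /= np.linalg.norm(cur)
        x[i] = cur
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    chain = sphere_random_walk(1000, rng=rng)
    # Random orthogonal matrix from a QR factorization.
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    print("kernel ESS:", kernel_ess(chain))
    print("kernel ESS after rotation:", kernel_ess(chain @ q.T))
```

Because the Gram matrix depends only on inner products, the two printed values should agree up to floating-point error, whereas coordinate-wise effective sample sizes computed from the same chain would generally change under the rotation.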