We consider the concentration properties of functions of infinitely exchangeable random variables. By conditioning on the de Finetti directing measure, we show that the deviation of any function with bounded-difference constants $c_1, \dots, c_n$ decomposes into a conditional sampling fluctuation and a latent mixture fluctuation. When this latent mixture is $\sigma_{\mathrm{mix}}^2$-subgaussian, we establish a concentration inequality with an effective variance proxy of $\frac{1}{4}\sum_i c_i^2 + \sigma_{\mathrm{mix}}^2$. Crucially, we show that for zero-sum linear contrasts, such as the difference between a subsample mean and a full population mean, the latent mixture term cancels exactly. This cancellation yields a tight, mixture-free Hoeffding-type bound that provides a direct de Finetti mechanism for the infinite-extendibility limit of recent finite-exchangeable concentration results. We apply this framework to quantify uncertainty in composite AI benchmarks, such as MMLU, where question items naturally exhibit exchangeable dependence across domains. Our results provide both a domain-stratified hierarchical model for bounding the uncertainty of accuracy scores and a distribution-free, cost-saving statistical guarantee for accurately estimating full benchmark scores from random subsets.
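
As a rough numerical illustration of the variance proxy and of the zero-sum cancellation, the following sketch evaluates the standard one-sided subgaussian tail $\exp(-t^2/(2v))$ with $v = \frac{1}{4}\sum_i c_i^2 + \sigma_{\mathrm{mix}}^2$. It rests on assumptions not stated above: item scores are taken to lie in $[0,1]$, so that a linear contrast $\sum_i a_i X_i$ has bounded-difference constants $c_i = |a_i|$, and the function names, the sizes `n` and `m`, and the tolerance `t` are hypothetical. The exact constants of the results summarized here may differ.

```python
import math


def subgaussian_tail(t, c, sigma_mix_sq=0.0):
    """One-sided tail exp(-t^2 / (2 v)), with v = (1/4) * sum(c_i^2) + sigma_mix^2."""
    v = 0.25 * sum(ci * ci for ci in c) + sigma_mix_sq
    if v == 0.0:
        # Degenerate case: the function is constant, so no positive deviation occurs.
        return 0.0 if t > 0 else 1.0
    return math.exp(-t * t / (2.0 * v))


def subsample_contrast_coeffs(m, n):
    """Coefficients a_i of (mean of the first m items) - (mean of all n items)."""
    return [1.0 / m - 1.0 / n] * m + [-1.0 / n] * (n - m)


def subsample_tail(t, m, n):
    """Mixture-free bound for the subsample-vs-full contrast (scores assumed in [0, 1])."""
    a = subsample_contrast_coeffs(m, n)
    # The coefficients sum to zero, so the latent mean cancels and sigma_mix^2 drops out.
    c = [abs(ai) for ai in a]
    return subgaussian_tail(t, c, sigma_mix_sq=0.0)


if __name__ == "__main__":
    n, m, t = 10_000, 500, 0.05  # hypothetical benchmark size, subset size, tolerance
    print("one-sided bound:", subsample_tail(t, m, n))
```

Under these assumptions $\sum_i a_i^2 = \frac{1}{m} - \frac{1}{n}$, so the bound reduces to $\exp\!\big(-2t^2 / (\tfrac{1}{m} - \tfrac{1}{n})\big)$, which does not depend on $\sigma_{\mathrm{mix}}^2$; passing a positive `sigma_mix_sq` to `subgaussian_tail` shows how a non-cancelling mixture term loosens the bound for functions that are not zero-sum contrasts.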