This paper studies distribution-free inference in settings where the data set has a hierarchical structure -- for example, groups of observations, or repeated measurements. In such settings, standard notions of exchangeability may not hold. To address this challenge, a hierarchical form of exchangeability is derived, facilitating extensions of distribution-free methods, including conformal prediction and jackknife+. While the standard theoretical guarantee obtained by the conformal prediction framework is a marginal predictive coverage guarantee, in the special case of independent repeated measurements, it is possible to achieve a stronger form of coverage -- the "second-moment coverage" property -- to provide better control of conditional miscoverage rates, and distribution-free prediction sets that achieve this property are constructed. Simulations illustrate that this guarantee indeed leads to uniformly small conditional miscoverage rates. Empirically, this stronger guarantee comes at the cost of a larger width of the prediction set in scenarios where the fitted model is poorly calibrated, but this cost is very mild in cases where the fitted model is accurate.
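To make the role of hierarchical exchangeability concrete, the following is a minimal sketch of one simple baseline, not the method constructed in this paper. It assumes that groups are exchangeable with one another and that observations within a group are exchangeable, keeps a single randomly chosen calibration score per group, and then applies the usual split-conformal quantile. The function name `subsampled_split_conformal`, the absolute-residual-style scores, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def subsampled_split_conformal(scores_by_group, alpha, rng):
    """Conformal threshold from one randomly chosen calibration score per group.

    Illustrative baseline only (an assumption of this sketch, not the paper's
    procedure): it discards all but one observation in every group.
    """
    # One draw per group: under the stated assumptions, the retained scores are
    # exchangeable with the score of a single test point from a new group.
    picked = np.array([rng.choice(np.asarray(s, dtype=float)) for s in scores_by_group])
    k = len(picked)
    # Standard split-conformal rank: the ceil((1 - alpha) * (k + 1))-th smallest score.
    rank = int(np.ceil((1 - alpha) * (k + 1)))
    if rank > k:
        return np.inf  # too few groups to obtain a finite threshold
    return np.sort(picked)[rank - 1]

# Hypothetical calibration scores: 200 groups of random size with a shared
# group-level effect, so scores within a group are dependent.
rng = np.random.default_rng(0)
groups = []
for _ in range(200):
    n_k = rng.integers(1, 6)       # group size between 1 and 5
    group_effect = rng.normal()    # shared within the group
    groups.append(np.abs(group_effect + rng.normal(size=n_k)))

q_hat = subsampled_split_conformal(groups, alpha=0.1, rng=rng)
print(q_hat)  # a prediction set would contain all y whose score is at most q_hat
```

Subsampling gives up information from the discarded observations, which is one reason to prefer procedures that use every observation in each group. The sketch targets only marginal coverage and says nothing about the second-moment coverage property.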