Cumulative differences between paired samples

The simplest, most common paired samples consist of observations from two populations, with each observed response from one population corresponding to an observed response from the other population at the same value of an ordinal covariate. The pair of observed responses (one from each population) at the same value of the covariate is known as a "matched pair" (with the matching based on the value of the covariate). A graph of cumulative differences between the two populations reveals differences in responses as a function of the covariate. Indeed, the slope of the secant line connecting two points on the graph is the average difference over the interval of values of the covariate between the two points, however wide that interval; i.e., the slope of the graph is the average difference in responses. ("Average" refers to the weighted average if the samples are weighted.) Moreover, a simple statistic known as the Kuiper metric summarizes the overall differences over all values of the covariate in a single scalar. The Kuiper metric is the absolute value of the total difference in responses between the two populations, totaled over the interval of values of the covariate for which that absolute value is greatest. The total is normalized so that it becomes the (weighted) average over all values of the covariate when the interval is the entire range of the covariate (i.e., the sum is divided by the total number of observations if the samples are unweighted, or by the total weight if the samples are weighted). This cumulative approach is fully nonparametric and uniquely defined (with only one right way to construct the graphs and scalar summary statistics), unlike traditional methods such as reliability diagrams or parametric or semi-parametric regressions, whose results depend on parameter settings (for example, the choice of bins or of model form) that typically obscure significant differences.
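The construction described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function names `cumulative_differences` and `kuiper_metric`, the argument names, and the toy data are all hypothetical, and the sketch assumes that the covariate values are distinct and that each index holds one matched pair.

```python
import numpy as np


def cumulative_differences(x, a, b, w=None):
    """Return (cumulative weights, cumulative differences) for matched pairs.

    x: covariate values, one per matched pair (assumed distinct)
    a, b: responses from the two populations at each covariate value
    w: optional nonnegative weights (defaults to uniform)
    """
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    w = np.ones_like(x) if w is None else np.asarray(w, dtype=float)

    # Order the pairs by the covariate and normalize the weights to sum to 1,
    # so that the final cumulative difference is the (weighted) average.
    order = np.argsort(x, kind="stable")
    w = w[order] / w.sum()
    d = (a[order] - b[order]) * w

    # Prepend 0 so that the graph starts at the origin; the slope of a secant
    # between two points is then the weighted average difference between them.
    cum_weight = np.concatenate(([0.0], np.cumsum(w)))
    cum_diff = np.concatenate(([0.0], np.cumsum(d)))
    return cum_weight, cum_diff


def kuiper_metric(cum_diff):
    """Largest absolute total difference over any interval of the covariate."""
    # The total over an interval is a difference of two cumulative values,
    # so the maximum absolute total is the range of the cumulative sums.
    return float(np.max(cum_diff) - np.min(cum_diff))


# Toy data (made up): population a exceeds b at low covariate values and
# falls below b at high values, so the overall average difference is zero.
x = [0.3, 0.1, 0.4, 0.2]
a = [0.0, 1.0, 0.0, 1.0]
b = [1.0, 0.0, 1.0, 0.0]

cum_weight, cum_diff = cumulative_differences(x, a, b)
print(cum_diff)                 # [0.   0.25 0.5  0.25 0.  ]
print(kuiper_metric(cum_diff))  # 0.5
```

In this toy case the average difference over the full range of the covariate is zero, yet the Kuiper metric is 0.5, because the differences in the lower half of the covariate range all have the same sign. Plotting `cum_diff` against `cum_weight` gives the graph whose secant slopes are the average differences.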
