The comparison of frequency distributions is a common statistical task with broad applications. However, existing measures do not explicitly quantify the magnitude and direction by which one distribution is shifted relative to another. In the present study, we define distributional shift (DS) as the concentration of frequencies towards the lowest discrete class, e.g., the left-most bin of a histogram. We measure DS via the sum of cumulative frequencies and define relative distributional shift (RDS) as the difference in DS between distributions. Using simulated random sampling, we show that RDS is highly related to measures that are widely used to compare frequency distributions. Focusing on specific applications, we show that DS and RDS provide insights into healthcare billing distributions, ecological species-abundance distributions, and economic distributions of wealth. RDS has the unique advantage of being a signed (i.e., directional) measure based on a simple difference in an intuitive property that, in turn, serves as a measure of rarity, poverty, and scarcity.
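To make the definitions above concrete, the following is a minimal sketch of how DS and RDS could be computed from binned counts. The abstract specifies only that DS is based on the sum of cumulative frequencies and that RDS is a difference in DS; the rescaling of DS to the interval [0, 1], the sign convention for RDS, and the function names `distributional_shift` and `relative_distributional_shift` are assumptions made for illustration and may differ from the definitions used in the study.

```python
from itertools import accumulate


def distributional_shift(counts):
    """Sketch of DS: sum of cumulative relative frequencies.

    `counts` holds the frequency of each discrete class, ordered from
    the lowest class (left-most bin) to the highest.

    Assumption (not taken from the abstract): the sum is rescaled so that
    0 means all frequency sits in the highest class and 1 means all
    frequency sits in the lowest class.
    """
    k = len(counts)
    total = sum(counts)
    if k < 2:
        raise ValueError("need at least two classes")
    if total <= 0 or any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative with a positive total")

    relative = [c / total for c in counts]
    cumulative = list(accumulate(relative))

    # The raw sum ranges from 1 (everything in the last class)
    # to k (everything in the first class); map that range onto [0, 1].
    return (sum(cumulative) - 1) / (k - 1)


def relative_distributional_shift(counts_a, counts_b):
    """Sketch of RDS: signed difference in DS between two distributions.

    Assumption: both distributions use the same classes, and the result
    is positive when `counts_a` is shifted further towards the lowest
    class than `counts_b`.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("distributions must share the same classes")
    return distributional_shift(counts_a) - distributional_shift(counts_b)


if __name__ == "__main__":
    left_heavy = [6, 3, 1]   # most frequency in the lowest class
    right_heavy = [1, 3, 6]  # most frequency in the highest class
    print(distributional_shift(left_heavy))
    print(distributional_shift(right_heavy))
    print(relative_distributional_shift(left_heavy, right_heavy))
```

Under these assumptions, swapping the two arguments of `relative_distributional_shift` flips the sign of the result, which is the directional property the abstract emphasizes.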