Detecting weak, systematic distribution shifts and quantitatively modeling individual, heterogeneous responses to policies or incentives have found growing empirical application in the social and economic sciences. Given two probability distributions $P$ (null) and $Q$ (alternative), we study the problem of detecting a weak distribution shift that deviates from the null $P$ toward the alternative $Q$, where the level of deviation vanishes as the sample size $n$ grows. We propose a model for weak distribution shifts via displacement interpolation between $P$ and $Q$, drawing on optimal transport theory. We study a hypothesis testing procedure based on the Wasserstein distance, derive sharp conditions under which detection is possible, and provide an exact characterization of the asymptotic Type I and Type II errors at the detection boundary using empirical process theory. We demonstrate how the proposed testing procedure models and detects weak distribution shifts in real data sets using two empirical examples: distribution shifts in consumer spending after COVID-19, and heterogeneity in the published p-values of statistical tests in journals across different disciplines.
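To make the setup concrete, the following is a minimal one-dimensional sketch, not the paper's procedure. In one dimension, displacement interpolation between $P$ and $Q$ amounts to linearly interpolating their quantile functions, $F_t^{-1} = (1-t)F_P^{-1} + tF_Q^{-1}$. Everything specific here is an assumption made for illustration: $P$ is taken to be Uniform(0,1), $Q$ is taken to have quantile function $u^2$, the statistic is the 2-Wasserstein distance between the empirical measure and $P$, and the critical value is calibrated by Monte Carlo simulation rather than by the asymptotic theory described above. The function names are hypothetical.

```python
import numpy as np


def sample_interpolated(n, t, rng):
    """Draw n points from the displacement interpolation P_t (illustrative choice).

    Assumes P = Uniform(0, 1) with quantile function u, and Q with quantile
    function u**2. In 1D, P_t has quantile function (1 - t) * u + t * u**2.
    """
    u = rng.uniform(size=n)
    return (1.0 - t) * u + t * u**2


def w2_to_uniform(x):
    """Exact 2-Wasserstein distance between the empirical measure of x and Uniform(0, 1).

    Uses W_2^2 = sum_i integral over [(i-1)/n, i/n] of (x_(i) - u)^2 du,
    where x_(i) are the order statistics; each integral has a closed form.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    a = np.arange(n) / n          # left endpoints (i-1)/n
    b = np.arange(1, n + 1) / n   # right endpoints i/n
    squared = np.sum(((b - x) ** 3 - (a - x) ** 3) / 3.0)
    return np.sqrt(squared)


def critical_value(n, alpha, n_sim, rng):
    """Monte Carlo (1 - alpha) quantile of the statistic under the null P."""
    stats = [w2_to_uniform(rng.uniform(size=n)) for _ in range(n_sim)]
    return np.quantile(stats, 1.0 - alpha)


def wasserstein_test(x, alpha=0.05, n_sim=2000, seed=0):
    """Return (statistic, critical value, reject?) for H0: x is drawn from P."""
    rng = np.random.default_rng(seed)
    stat = w2_to_uniform(x)
    crit = critical_value(len(x), alpha, n_sim, rng)
    return stat, crit, bool(stat > crit)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # A shift level t that is small relative to 1; the rate at which t may
    # shrink with n while remaining detectable is what the theory addresses.
    x = sample_interpolated(n=500, t=0.3, rng=rng)
    print(wasserstein_test(x))
```

Varying `t` and `n` in this sketch gives a rough feel for how detection power degrades as the shift level shrinks, though the simulated critical value is only a stand-in for the sharp asymptotic characterization.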