Detecting Weak Distribution Shifts via Displacement Interpolation

Detecting weak, systematic distribution shifts and quantitatively modeling individual, heterogeneous responses to policies or incentives have found increasing empirical applications in social and economic sciences. Given two probability distributions $P$ (null) and $Q$ (alternative), we study the problem of detecting weak distribution shift deviating from the null $P$ toward the alternative $Q$, where the level of deviation vanishes as a function of $n$, the sample size. We propose a model for weak distribution shifts via displacement interpolation between $P$ and $Q$, drawing from the optimal transport theory. We study a hypothesis testing procedure based on the Wasserstein distance, derive sharp conditions under which detection is possible, and provide the exact characterization of the asymptotic Type I and Type II errors at the detection boundary using empirical processes. We demonstrate how the proposed testing procedure works in modeling and detecting weak distribution shifts in real data sets using two empirical examples: distribution shifts in consumer spending after COVID-19, and heterogeneity in the published p-values of statistical tests in journals across different disciplines.

翻译：检测微弱的系统性分布偏移，并量化建模个体对政策或激励的异质性响应，在社会与经济科学中已有日益增多的实证应用。针对两个概率分布 $P$（原假设）和 $Q$（备择假设），本文研究偏离原假设 $P$ 并趋向备择假设 $Q$ 的弱分布偏移检测问题，其中偏移程度随样本量 $n$ 的增大而趋于零。我们基于最优输运理论，通过 $P$ 与 $Q$ 之间的位移插值构建弱分布偏移模型。研究基于Wasserstein距离的假设检验流程，推导出可检测的严苛条件，并利用经验过程在检测边界处精确刻画渐近第一类与第二类错误。通过两个实证案例——COVID-19后消费者支出的分布偏移，以及不同学科期刊中已发表统计检验p值的异质性——展示所提检验流程如何在实际数据集中建模与检测弱分布偏移。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【NeurIPS2021】用于文本图表示学习的 GNN 嵌套 Transformer 模型：GraphFormers

专知会员服务

46+阅读 · 2021年11月24日

【ACL2020】多模态信息抽取，365页ppt

专知会员服务

151+阅读 · 2020年7月6日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

35+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日