We propose the density ratio permutation test, a hypothesis test that assesses whether the ratio between two densities is proportional to a known function, based on independent samples from each distribution. The test uses an efficient Markov chain Monte Carlo scheme to draw weighted permutations of the pooled data, yielding exchangeable samples and finite-sample validity. Regarding power, when the test statistic is an integral probability metric, our procedure is consistent under mild assumptions on the defining function class; specializing to a reproducing kernel Hilbert space, we introduce the shifted maximum mean discrepancy and prove minimax optimality of our test when a normalized difference between the densities lies in a Sobolev ball. We extend the test to the case of an unknown density ratio by estimating the ratio on an independent training sample, and we derive type I error bounds in terms of the estimation error, together with power results. This allows our method to be adapted to conditional two-sample testing, making it a versatile tool for assessing covariate shift and related assumptions, which frequently arise in transfer learning and causal inference. Finally, we validate our theoretical findings through experiments on both simulated and real-world datasets.
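To make the weighted-permutation idea concrete, the following is a minimal sketch, not the authors' sampler or statistic. It assumes one-dimensional data, a known and strictly positive ratio function `r`, a generic Metropolis swap chain over group labels, and a hub-and-spokes construction to obtain exchangeable copies. All function names (`swap_chain`, `statistic`, `drpt_pvalue`) and the simple weighted-mean statistic are hypothetical choices made for illustration; the shifted maximum mean discrepancy from the abstract is not implemented here.

```python
import numpy as np


def swap_chain(labels, log_r, n_steps, rng):
    """Metropolis chain over labelings; labels[k] == 1 puts point k in the second sample.

    The target puts mass proportional to the product of r over second-sample points.
    """
    labels = labels.copy()
    for _ in range(n_steps):
        i = rng.choice(np.flatnonzero(labels == 0))  # point currently in sample 1
        j = rng.choice(np.flatnonzero(labels == 1))  # point currently in sample 2
        # Swapping i and j multiplies the target weight by r(z_i) / r(z_j).
        if np.log(rng.uniform()) < log_r[i] - log_r[j]:
            labels[i], labels[j] = 1, 0
    return labels


def statistic(z, labels, r_vals):
    """Illustrative statistic: second-sample mean vs. r-weighted first-sample mean."""
    first, second = labels == 0, labels == 1
    w = r_vals[first]
    return abs(z[second].mean() - np.sum(w * z[first]) / np.sum(w))


def drpt_pvalue(x, y, r, n_copies=99, n_steps=200, seed=0):
    """Permutation p-value for H0: (density of y) / (density of x) is proportional to r."""
    rng = np.random.default_rng(seed)
    z = np.concatenate([x, y])
    labels = np.concatenate([np.zeros(len(x), dtype=int), np.ones(len(y), dtype=int)])
    r_vals = r(z)
    log_r = np.log(r_vals)  # assumes r > 0 everywhere on the data
    t_obs = statistic(z, labels, r_vals)
    # Walk from the observed labeling to a hub, then run independent chains
    # out of the hub; reversibility makes the copies exchangeable under H0.
    hub = swap_chain(labels, log_r, n_steps, rng)
    t_perm = [
        statistic(z, swap_chain(hub, log_r, n_steps, rng), r_vals)
        for _ in range(n_copies)
    ]
    return (1 + sum(t >= t_obs for t in t_perm)) / (n_copies + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = rng.normal(0.0, 1.0, size=100)  # sample from p = N(0, 1)
    y = rng.normal(1.0, 1.0, size=100)  # sample from q = N(1, 1), so q/p is proportional to exp(z)
    print("correct ratio:", drpt_pvalue(x, y, np.exp))
    print("constant ratio:", drpt_pvalue(x, y, np.ones_like))
```

With the constant ratio the chain reduces to an ordinary (unweighted) permutation test, which is a quick sanity check on the sketch. The number of swap steps and copies are arbitrary defaults rather than recommended values.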