This paper reconsiders the problem of testing the equality of two unspecified continuous distributions. The framework we propose allows for readable and insightful data visualisation and helps to understand and quantify how two groups of data differ. We consider a weighted rank empirical process on (0,1) and use a grid-based approach, built on dyadic partitions of (0,1), to discretize the continuous process and construct local simultaneous acceptance regions. These regions help to identify statistically significant deviations from the null model. In addition, the form of the process and its discretization lead to a highly interpretable visualisation of distributional differences. We also introduce a new two-sample test that is explicitly related to the visualisation. Numerical studies show that the new test procedure performs very well. We illustrate the use and diagnostic capabilities of our approach with an application to a known set of neuroscience data.
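To make the idea of evaluating a rank-based process on a dyadic grid concrete, the following is a minimal sketch and not the construction used in the paper. The function names `dyadic_grid` and `weighted_rank_process`, the choice of grid points j / 2^depth, and the hypergeometric standardisation are all assumptions made for illustration. The abstract does not specify the weight function or the acceptance regions, so neither is reproduced here.

```python
import numpy as np


def dyadic_grid(depth):
    """Interior dyadic points j / 2**depth for j = 1, ..., 2**depth - 1."""
    size = 2 ** depth
    return np.arange(1, size) / size


def weighted_rank_process(x, y, depth=3):
    """Illustrative standardised rank process evaluated on a dyadic grid.

    Assumption: at each grid point we compare the fraction of x-values
    at or below a pooled order statistic with the corresponding pooled
    fraction, then standardise by the null (hypergeometric) standard
    deviation. This is a generic choice, not the paper's definition.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = x.size, y.size
    total = m + n

    pooled = np.sort(np.concatenate([x, y]))
    grid = dyadic_grid(depth)

    # Index of the pooled order statistic closest to each grid quantile,
    # kept strictly inside 1..total-1 so the weight below is finite.
    k = np.clip(np.floor(total * grid).astype(int), 1, total - 1)
    thresholds = pooled[k - 1]

    # Fraction of x-values at or below each pooled threshold.
    frac_x = np.searchsorted(np.sort(x), thresholds, side="right") / m
    p = k / total

    # Under the null, m * frac_x is hypergeometric, so frac_x has
    # variance p * (1 - p) * n / (m * (total - 1)).
    scale = np.sqrt(m * (total - 1) / n)
    return grid, scale * (frac_x - p) / np.sqrt(p * (1 - p))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_x = rng.normal(0.0, 1.0, size=200)
    sample_y = rng.normal(0.5, 1.0, size=200)
    points, values = weighted_rank_process(sample_x, sample_y, depth=3)
    for point, value in zip(points, values):
        print(f"p = {point:.3f}  standardised deviation = {value:+.2f}")
```

In this sketch, a positive value at a grid point means that more of the first sample lies below the pooled quantile than would be expected under equal distributions, and a negative value means fewer. Plotting the values against the grid gives a rough bar-chart view of where the two samples differ.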