This paper addresses the multiple two-sample testing problem in a graph-structured setting, a common scenario in fields such as spatial statistics and neuroscience. Each node $v$ of a fixed graph is associated with a two-sample testing problem between two node-specific probability density functions (pdfs), $p_v$ and $q_v$. The goal is to identify the nodes where the null hypothesis $p_v = q_v$ should be rejected, under the assumption that connected nodes yield similar test outcomes. We propose the non-parametric collaborative two-sample testing (CTST) framework, which efficiently leverages the graph structure and makes minimal assumptions about $p_v$ and $q_v$. Our methodology integrates elements from f-divergence estimation, kernel methods, and multitask learning. We use synthetic experiments and a real sensor network detecting seismic activity to demonstrate that CTST outperforms state-of-the-art non-parametric statistical tests that are applied at each node independently and hence disregard the geometry of the problem.
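For orientation, the node-wise baseline that the abstract contrasts against can be sketched as below. This is not CTST itself: it is a minimal illustration, assuming a kernel two-sample statistic (the biased squared MMD with a Gaussian kernel) calibrated by permutations and applied to each node separately, so the graph is never used. The function names, the bandwidth, and the number of permutations are hypothetical choices made for this example only.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two samples."""
    return (
        gaussian_kernel(x, x, bandwidth).mean()
        + gaussian_kernel(y, y, bandwidth).mean()
        - 2.0 * gaussian_kernel(x, y, bandwidth).mean()
    )


def permutation_pvalue(x, y, n_perm=200, bandwidth=1.0, rng=None):
    """Permutation p-value for H0: both samples come from the same pdf."""
    rng = np.random.default_rng() if rng is None else rng
    observed = mmd2(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        stat = mmd2(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        exceed += int(stat >= observed)
    # The "+1" terms keep the p-value strictly positive and valid.
    return (1 + exceed) / (1 + n_perm)


def independent_node_tests(samples_p, samples_q, **kwargs):
    """Run one test per node, ignoring the graph (assumed baseline, not CTST)."""
    return {
        v: permutation_pvalue(samples_p[v], samples_q[v], **kwargs)
        for v in samples_p
    }
```

A call such as `independent_node_tests(samples_p, samples_q, rng=np.random.default_rng(0))`, where both arguments map each node to an `(n, d)` array, returns one p-value per node. Because every node is tested in isolation, evidence from neighbouring nodes is never pooled, which is the limitation the collaborative framework is designed to address.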