We propose two general methods for constructing robust permutation tests under data corruption. The proposed tests control the non-asymptotic type I error in the presence of corrupted data, and we prove that they are consistent in power under minimal conditions. This facilitates the practical deployment of hypothesis tests in real-world applications subject to potential adversarial attacks. One of our methods inherently ensures differential privacy, further broadening its applicability to private data analysis. For the two-sample and independence settings, we show that our robust kernel tests are minimax optimal, in the sense that they are guaranteed to be non-asymptotically powerful against alternatives uniformly separated from the null in the kernel MMD and HSIC metrics at an optimal rate (tight, with a matching lower bound). Finally, we provide publicly available implementations and empirically illustrate the practicality of our proposed tests.
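For orientation, the following is a minimal sketch of the classical, non-robust two-sample permutation test based on the kernel MMD, which is the kind of baseline that robust constructions start from. It is not the robust or differentially private procedure proposed in this work, and it does not reproduce the publicly available implementation. The function names (`gaussian_gram`, `mmd2_biased`, `mmd_permutation_test`), the Gaussian kernel, the fixed bandwidth, and the number of permutations are all illustrative assumptions.

```python
import numpy as np


def gaussian_gram(z, bandwidth):
    """Gaussian-kernel Gram matrix of the pooled sample z (rows are points)."""
    sq_norms = np.sum(z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * z @ z.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_biased(gram, is_x):
    """Biased (V-statistic) estimate of the squared MMD.

    is_x is a boolean mask marking which pooled points belong to the
    first sample; the remaining points form the second sample.
    """
    n = is_x.sum()
    m = (~is_x).sum()
    weights = np.where(is_x, 1.0 / n, -1.0 / m)
    return weights @ gram @ weights


def mmd_permutation_test(x, y, bandwidth=1.0, n_perms=200, alpha=0.05, seed=0):
    """Classical (non-robust) MMD permutation test; hypothetical helper.

    Returns the permutation p-value and whether the null hypothesis of
    equal distributions is rejected at level alpha.
    """
    rng = np.random.default_rng(seed)
    pooled = np.vstack([x, y])
    gram = gaussian_gram(pooled, bandwidth)
    is_x = np.arange(len(pooled)) < len(x)

    observed = mmd2_biased(gram, is_x)
    # Recompute the statistic after randomly reassigning group labels.
    permuted = np.array(
        [mmd2_biased(gram, rng.permutation(is_x)) for _ in range(n_perms)]
    )
    # Counting the observed statistic itself (the "+1" terms) is what gives
    # non-asymptotic type I error control under exchangeability.
    p_value = (1 + np.sum(permuted >= observed)) / (n_perms + 1)
    return p_value, p_value <= alpha
```

A call such as `mmd_permutation_test(x, y)` with two arrays of shape `(n, d)` and `(m, d)` returns the p-value and the rejection decision. The Gram matrix is computed once on the pooled sample and only the group labels are permuted, which keeps the cost of each permutation to a single quadratic form. This baseline guarantee relies on exchangeability of the pooled sample under the null, which is the assumption that data corruption can break.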