Modern large-scale data analysis increasingly demands computational efficiency alongside statistical accuracy, yet classical statistically efficient methods often fall short on the former. In the context of testing monotonicity of a regression function, we propose FOMT (Fast and Optimal Monotonicity Test), a novel methodology tailored to meet these dual demands. FOMT employs a sparse collection of local tests, strategically generated at random, to detect violations of monotonicity scattered throughout the domain of the regression function. This sparsity enables significant computational efficiency, achieving sublinear runtime in most cases, and quasilinear runtime (i.e., linear up to a log factor) in the worst case. In contrast, existing statistically optimal tests typically require at least quadratic runtime. FOMT's statistical accuracy is achieved through the precise calibration of these local tests and their effective combination, ensuring both sensitivity to violations and control over false positives. More precisely, we show that FOMT separates the null and alternative hypotheses at minimax optimal rates over H\"older function classes of smoothness order in $(0,2]$. Further, when the smoothness is unknown, we introduce an adaptive version of FOMT, based on a modified Lepskii principle, which attains statistical optimality while maintaining the same computational complexity as if the intrinsic smoothness were known. Extensive simulations confirm the competitiveness and effectiveness of both FOMT and its adaptive variant.
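As a rough illustration of the general idea of combining a sparse, randomly placed collection of local tests, the sketch below tests whether a regression function is nondecreasing. It is not the FOMT procedure itself. The two-block mean comparison, the Bonferroni calibration, the assumption of Gaussian noise with known standard deviation `sigma`, and all names (`sparse_local_monotonicity_test`, `half_width`, `num_tests`) are assumptions introduced here for illustration only.

```python
import random
from statistics import NormalDist, fmean


def sparse_local_monotonicity_test(y, sigma, half_width, num_tests,
                                   alpha=0.05, rng=None):
    """Illustrative test of H0: the regression function is nondecreasing.

    y          : responses, ordered by increasing design point
    sigma      : noise standard deviation (assumed known, Gaussian noise)
    half_width : number of observations in each half of a local window
    num_tests  : number of randomly placed local tests
    Returns (reject, max_statistic, threshold).
    """
    rng = rng or random.Random()
    n = len(y)
    k = half_width
    if n < 2 * k:
        raise ValueError("need at least 2 * half_width observations")

    # Bonferroni calibration: each local test runs at level alpha / num_tests,
    # so the combined test has level at most alpha under H0.
    threshold = NormalDist().inv_cdf(1.0 - alpha / num_tests)

    # Standard deviation of (left mean - right mean) for two blocks of size k.
    scale = sigma * (2.0 / k) ** 0.5

    max_stat = float("-inf")
    for _ in range(num_tests):
        # Random window location; only 2 * k observations are read per test.
        start = rng.randrange(0, n - 2 * k + 1)
        left = fmean(y[start:start + k])
        right = fmean(y[start + k:start + 2 * k])
        # Under H0 the right block has the larger mean, so a large positive
        # value is evidence of a local decrease.
        max_stat = max(max_stat, (left - right) / scale)

    return max_stat > threshold, max_stat, threshold
```

This sketch reads only `2 * half_width * num_tests` observations, so its cost is below linear in the sample size whenever that product is small relative to `len(y)`. How the local tests are placed, scaled, and calibrated in FOMT, and how the smoothness is adapted to, is specified in the paper rather than here.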