Given a sample of covariate-response pairs, we consider the subgroup selection problem of identifying a subset of the covariate domain on which the regression function exceeds a predetermined threshold. We introduce a computationally feasible approach for subgroup selection in the context of multivariate isotonic regression, based on martingale tests and multiple testing procedures for logically structured hypotheses. Our proposed procedure satisfies a non-asymptotic, uniform Type I error rate guarantee, and its power attains the minimax optimal rate up to polylogarithmic factors. Extensions cover classification, isotonic quantile regression and heterogeneous treatment effect settings. Numerical studies on both simulated and real data confirm the practical effectiveness of our proposal.