Selecting interactions in an ultrahigh-dimensional statistical model with $n$ observations and $p$ variables is difficult when $p\gg n$, because there are $p(p-1)/2$ candidate interactions and the selected model should satisfy the strong hierarchical (SH) restriction. A new method, called SHL0, is proposed to overcome this difficulty. The objective function of SHL0 is composed of a loglikelihood function and an $L_0$ penalty. Local combinatorial optimization, a well-known approach in theoretical computer science, is used to optimize the objective function. We show that any local solution of SHL0 is consistent and enjoys the oracle properties, implying that a global solution is unnecessary in practice. Three additional advantages are: a tuning parameter is used to penalize the main effects and interactions; the tuning parameter can be derived from a closed-form expression; and the idea can be extended to arbitrary ultrahigh-dimensional statistical models. The proposed method is more flexible than previous methods for selecting interactions. A simulation study shows that the proposed SHL0 outperforms its competitors.
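To make the two sources of difficulty concrete, the following is a minimal Python sketch of the candidate count $p(p-1)/2$ and of the SH restriction, taken here in its usual sense that an interaction between variables $j$ and $k$ may enter the model only if both main effects $j$ and $k$ are in the model. The function names and the index-based representation of a model are assumptions made for illustration; they are not part of the SHL0 method itself, and the sketch does not implement the SHL0 objective or its optimization.

```python
from itertools import combinations


def n_candidate_interactions(p: int) -> int:
    """Number of pairwise interaction candidates among p variables."""
    return p * (p - 1) // 2


def satisfies_strong_hierarchy(main_effects, interactions) -> bool:
    """Check the SH restriction for a model given by variable indices.

    main_effects: iterable of selected variable indices.
    interactions: iterable of (j, k) index pairs for selected interactions.
    Returns True only if every selected interaction has both parents selected.
    """
    selected = set(main_effects)
    return all(j in selected and k in selected for j, k in interactions)


def admissible_interactions(main_effects):
    """Interactions that may be added without violating SH (hypothetical helper)."""
    return list(combinations(sorted(set(main_effects)), 2))
```

For instance, `n_candidate_interactions(1000)` gives 499500 candidates, whereas a model with main effects `[0, 2, 5]` admits only the three interactions returned by `admissible_interactions([0, 2, 5])`. This illustrates how the SH restriction shrinks the search space once the main effects are fixed.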