We theoretically analyze the model selection consistency of the least absolute shrinkage and selection operator (Lasso), both with and without post-thresholding, for high-dimensional Ising models. For random regular (RR) graphs of size $p$ with regular node degree $d$ and uniform couplings $\theta_0$, we rigorously prove that Lasso \textit{without post-thresholding} is model selection consistent in the whole paramagnetic phase, with the same order of sample complexity, $n=\Omega{(d^3\log{p})}$, as that of $\ell_1$-regularized logistic regression ($\ell_1$-LogR). This result is consistent with the conjecture of Meng, Obuchi, and Kabashima (2021), which was obtained using the non-rigorous replica method from statistical physics, and thus complements it with a rigorous proof. For general tree-like graphs, we show that the same result as for RR graphs holds under mild assumptions on the dependency and incoherence conditions. Moreover, we rigorously prove the model selection consistency of Lasso with post-thresholding for general tree-like graphs in the paramagnetic phase, without further assumptions on the dependency and incoherence conditions. Experimental results agree well with our theoretical analysis.