In this paper, we study the high-dimensional double sparse structure, in which the parameter of interest is simultaneously group-wise sparse and element-wise sparse within each group. By combining the Gilbert-Varshamov bound and its variants, we develop a novel technique for lower-bounding the metric entropy of the parameter space, tailored to the double sparse structure over $\ell_u(\ell_q)$-balls with $u,q \in [0,1]$. Using this technique together with Fano's inequality, we prove information-theoretic lower bounds on the estimation error. To complement the lower bounds, we establish matching upper bounds through a direct analysis of constrained least-squares estimators, using results from empirical process theory. A significant finding of our study is a phase transition phenomenon in the minimax rates for $u,q \in (0, 1]$. Furthermore, we extend the theoretical results to the double sparse regression model and determine its minimax rate for the estimation error. To tackle double sparse linear regression, we develop the DSIHT (Double Sparse Iterative Hard Thresholding) algorithm and show that it is optimal in the minimax sense. Finally, we demonstrate the superiority of our method through numerical experiments.
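To make the double sparse constraint concrete, the following is a minimal sketch of a hard-thresholding operator of the kind an iterative hard thresholding scheme would apply after each gradient step. It is not the authors' DSIHT algorithm, whose details the abstract does not give. The function name `double_sparse_threshold`, the parameters `s_groups` and `s_within`, and the reading of "double sparse" as "at most `s_groups` nonzero groups, each with at most `s_within` nonzero entries" are all assumptions made for illustration.

```python
def double_sparse_threshold(beta, groups, s_groups, s_within):
    """Illustrative double sparse hard thresholding (hypothetical helper).

    Assumed constraint: at most `s_groups` groups are nonzero, and each
    nonzero group has at most `s_within` nonzero entries.

    beta   : list of floats, the vector to threshold
    groups : list of lists of indices, a partition of range(len(beta))
    """
    out = [0.0] * len(beta)
    energies = []
    for g, idx in enumerate(groups):
        # Within each group, keep the s_within largest entries in magnitude.
        keep = sorted(idx, key=lambda j: abs(beta[j]), reverse=True)[:s_within]
        for j in keep:
            out[j] = beta[j]
        # Record the squared l2 energy that survives in this group.
        energies.append((sum(beta[j] ** 2 for j in keep), g))

    # Keep only the s_groups groups with the largest surviving energy.
    ranked = sorted(energies, key=lambda t: t[0], reverse=True)
    top = {g for _, g in ranked[:s_groups]}
    for g, idx in enumerate(groups):
        if g not in top:
            for j in idx:
                out[j] = 0.0
    return out


# Example: three groups of three entries; keep 2 groups, 1 entry per group.
beta = [3.0, -0.5, 0.1, 0.2, -4.0, 1.0, 0.3, 0.25, 0.2]
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
result = double_sparse_threshold(beta, groups, s_groups=2, s_within=1)
```

In this example the largest entry of each group is retained first, and the third group is then zeroed out because its surviving energy is the smallest. Under the assumed constraint, thresholding within groups before ranking groups gives the closest feasible vector in Euclidean distance, since each group's best contribution is fixed by its largest entries.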