We consider the problem of model selection when the regressors have an inherent grouping structure. Taking a Bayesian approach, we model the mean vector with a one-group global-local shrinkage prior from a broad class that includes the horseshoe prior. In the context of variable selection, this class of priors was studied by Tang et al. (2018). We propose a modified form of the usual class of global-local shrinkage priors with polynomial tails, placed on the group regression coefficients. The resulting threshold rule selects a group as active if the ratio of the $L_2$ norm of the posterior mean of its group coefficient to the $L_2$ norm of the corresponding ordinary least squares group estimate exceeds one half. In the theoretical part of this article, the global shrinkage parameter $\tau$ is treated either as a tuning parameter or through an empirical Bayes estimate, depending on what is known about the underlying sparsity of the model. When the proportion of active groups is known, we prove, using $\tau$ as a tuning parameter, that our method has the oracle property. When this proportion is unknown, we propose an empirical Bayes estimate of $\tau$. Even with this empirical Bayes estimate, our half-thresholding rule captures the truly important groups and simultaneously attains the optimal estimation rate for the group coefficients. Although our theoretical results rely on a special form of the design matrix, our simulations show that for general design matrices the half-thresholding rule yields results similar to those of Yang and Narisetty (2020). Consequently, in a high-dimensional sparse group selection problem, one can use one-group global-local shrinkage priors with polynomial tails instead of the so-called "gold standard" spike-and-slab prior and obtain similar results.
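As an illustration of the half-thresholding rule described above, the following minimal Python sketch applies only the final selection step. It assumes the per-group posterior means and ordinary least squares estimates have already been computed elsewhere (for instance by a posterior sampler, which is not shown). The function names `l2_norm` and `half_threshold_select`, the `cutoff` argument, and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def l2_norm(v):
    """Euclidean (L2) norm of a sequence of floats."""
    return math.sqrt(sum(x * x for x in v))


def half_threshold_select(posterior_means, ols_estimates, cutoff=0.5):
    """Return indices of groups declared active.

    A group is selected when
        ||posterior mean of its coefficient||_2 / ||OLS group estimate||_2 > cutoff.
    Both inputs are lists with one coefficient vector per group.
    """
    selected = []
    for g, (post, ols) in enumerate(zip(posterior_means, ols_estimates)):
        ols_norm = l2_norm(ols)
        if ols_norm == 0.0:
            # Assumption of this sketch: a zero OLS estimate is never selected.
            continue
        if l2_norm(post) / ols_norm > cutoff:
            selected.append(g)
    return selected


# Toy inputs (hypothetical): three groups of sizes 2, 3 and 2.
posterior_means = [[0.05, -0.02], [1.8, 2.1, -1.5], [0.1, 0.0]]
ols_estimates = [[0.6, -0.3], [2.0, 2.3, -1.7], [0.5, 0.4]]

selected_groups = half_threshold_select(posterior_means, ols_estimates)
print(selected_groups)
```

In this toy input only the second group is shrunk by less than half relative to its least squares estimate, so it is the only one selected. The inequality is strict, so a ratio of exactly one half does not select the group.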