Gun-related deaths are a major public health concern in the United States (US). The number of gun injuries varies considerably across space because of county-level heterogeneity in race, sex, age, and income distributions. A major challenge, however, is to identify locally the influential socio-economic factors behind these firearm fatalities. For a diverging number of predictors, a rich literature exists on the smoothly clipped absolute deviation (SCAD) penalty under the independence framework; however, a gap remains for local variable selection with spatially correlated, over-dispersed data. This research presents a two-step localized variable selection and inference framework for spatially indexed gunshot fatality data. In the first step, we select variables locally using the SCAD penalty at locations where the number of gunshot incidents exceeds a threshold. For these locations, once the predictors are selected, the second step examines the directional variation in the latent spatial neighborhood structure. We further discuss the theoretical properties of this county-specific local variable selection under infill asymptotics. The method has three advantages: (i) it selects variables locally; (ii) it provides inference about the directional variation of a selected predictor; and (iii) instead of assuming the spatial neighborhood structure in an ad hoc manner, it identifies the type of spatial neighborhood structure that is most appropriate for modeling the random effects.
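For readers unfamiliar with the SCAD penalty used in the first step, the following is a minimal sketch of its standard piecewise form for a single coefficient. The function name `scad_penalty` and its arguments are illustrative assumptions, and the default `a = 3.7` is the value commonly suggested in the SCAD literature; this is not the localized spatial estimator described above, only the penalty term it builds on.

```python
def scad_penalty(theta: float, lam: float, a: float = 3.7) -> float:
    """Standard SCAD penalty for one coefficient (illustrative sketch).

    theta : coefficient value
    lam   : tuning parameter (lambda >= 0)
    a     : shape parameter (> 2); 3.7 is a commonly used default
    """
    if lam < 0 or a <= 2:
        raise ValueError("require lam >= 0 and a > 2")
    t = abs(theta)
    if t <= lam:
        # Small coefficients: behaves like the lasso (L1) penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so no additional shrinkage.
    return lam * lam * (a + 1) / 2
```

The penalty is linear near zero, which encourages sparsity, and flat for large coefficients, which is what reduces the bias on strong predictors relative to the lasso.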