We develop a statistical inference method applicable to a broad range of generalized linear models (GLMs) in high-dimensional settings, where the number of unknown coefficients grows proportionally with the sample size. Although a pioneering inference method has been developed for logistic regression, a specific instance of GLMs, it cannot be applied directly to other GLMs because of unknown hyper-parameters. In this study, we address this limitation by developing a new inference method designed for a certain class of GLMs. Our method is based on adjusting asymptotic normality in high dimensions, and it is feasible in the sense that it can be carried out even when the hyper-parameters are unknown. Specifically, we first introduce a novel convex loss-based estimator and its associated system, which are essential components of the inference. Next, we devise a methodology for identifying the system parameters required by the method. With these components, we construct confidence intervals for GLMs in the high-dimensional regime. We prove that the proposed method has desirable theoretical properties, such as strong consistency and exact coverage probability. Finally, we confirm its validity experimentally.