We consider statistical inference in high-dimensional regression problems under affine constraints on the parameter space. The problem is motivated by studies of the genetic determinants of diseases such as diabetes that draw on external information from mediating protein expression levels. Specifically, we develop rigorous methods for estimating genetic effects on diabetes-related continuous outcomes when these associations are constrained by external information about the genetic determinants of proteins and about the genetic relationships between proteins and the outcome of interest. To this end, we discuss multiple candidate estimators and study their theoretical properties, sharp large-sample optimality, and numerical performance under a high-dimensional proportional asymptotic framework.