Estimating signal-to-noise ratios (SNRs) and residual variances in high-dimensional linear models has many important applications, such as heritability estimation in bioinformatics. One commonly used estimator, usually referred to as REML, is based on the likelihood of the random effects model, in which the regression coefficients and the noise variables are each assumed to be i.i.d. Gaussian random variables. In this paper, we aim to establish the consistency and asymptotic distribution of the REML estimator of the SNR when the true coefficient vector is fixed and the true noise is heteroscedastic and correlated, at the cost of assuming that the entries of the design matrix are independent and skew-free. The asymptotic variance can also be consistently estimated when the noise is heteroscedastic but uncorrelated. Extensive numerical simulations illustrate our theoretical findings and suggest that some of the assumptions imposed in our theoretical results can likely be relaxed.
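
For orientation, the random effects likelihood behind such an estimator can be written out as follows. This is a minimal sketch under assumptions not stated in the abstract: the symbols $n$, $p$, $X$, $y$, $\gamma$ and $\sigma^2$ are illustrative notation, the model is taken to have no fixed effects (so that the restricted likelihood coincides with the ordinary Gaussian likelihood), and scaling the coefficient variance by $1/p$ is one common convention, not necessarily the one used in the paper.

```latex
% Working (possibly misspecified) random effects model; notation is assumed.
% y: n-vector of responses, X: n x p design matrix, gamma: SNR, sigma^2: noise variance.
\begin{align*}
  y &= X\beta + \varepsilon, \qquad
  \beta \sim N\!\Big(0, \tfrac{\gamma\sigma^2}{p} I_p\Big), \qquad
  \varepsilon \sim N(0, \sigma^2 I_n), \\
% Marginal distribution of y given X under the working model.
  y \mid X &\sim N\big(0, \sigma^2 V_\gamma\big), \qquad
  V_\gamma = I_n + \tfrac{\gamma}{p} X X^\top, \\
% Profiling out sigma^2 for a fixed gamma.
  \hat\sigma^2(\gamma) &= \tfrac{1}{n}\, y^\top V_\gamma^{-1} y, \\
% SNR estimate: maximizer of the profile log-likelihood (additive constants dropped).
  \hat\gamma &= \operatorname*{arg\,max}_{\gamma \ge 0}
  \Big\{ -\tfrac{n}{2} \log \hat\sigma^2(\gamma) - \tfrac{1}{2} \log\det V_\gamma \Big\}.
\end{align*}
```

Under this working model $\mathbb{E}\|\beta\|^2 = \gamma\sigma^2$, so $\gamma$ plays the role of the SNR. The question studied here is how $\hat\gamma$ behaves when the data are in fact generated with a fixed $\beta$ and heteroscedastic, correlated noise.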