When estimating area means, direct estimators based on area-specific data are usually consistent under the sampling design without model assumptions. However, they are inefficient if the area sample size is small. In small area estimation, model assumptions linking the areas are used to "borrow strength" from other areas. The basic area-level model provides design-consistent estimators, but it assumes that the error variances are known. In practice, these variances are estimated from the (scarce) area-specific data. The resulting variance estimators are inefficient, and their error is not accounted for in the associated mean squared error estimators. Unit-level models do not require the error variances to be known, but they do not account for the survey design. Here we describe a unified estimator of an area mean that may be obtained from either an area-level model or a unit-level model, and that is based on estimators of the model error variances that are consistent as the number of areas increases. We propose bootstrap mean squared error estimators that account for the uncertainty due to the estimation of the error variances. We show improved performance of the new small area estimators and of our bootstrap estimators of the mean squared error. We apply the results to education data from Colombia.
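To make the "borrowing strength" idea concrete, the following is a minimal sketch of the textbook best linear unbiased predictor under an intercept-only basic area-level model, with both the sampling error variances and the random-effect variance treated as known. It is not the unified estimator described here. The function name `fay_herriot_blup` and its parameter names are illustrative assumptions, and the numbers in the usage lines are made up.

```python
def fay_herriot_blup(direct, psi, sigma2_u):
    """Predict area means under an intercept-only basic area-level model.

    Assumed (hypothetical) inputs:
      direct   -- direct estimates y_i, one per area
      psi      -- sampling error variances psi_i, treated as known
      sigma2_u -- variance of the area random effects, treated as known
    """
    # Generalised least squares estimate of the common mean:
    # each area is weighted by 1 / (sigma2_u + psi_i).
    weights = [1.0 / (sigma2_u + p) for p in psi]
    beta = sum(w * y for w, y in zip(weights, direct)) / sum(weights)

    # Shrinkage factor gamma_i = sigma2_u / (sigma2_u + psi_i):
    # areas with noisy direct estimates (large psi_i) lean more on beta.
    gammas = [sigma2_u / (sigma2_u + p) for p in psi]

    return [g * y + (1.0 - g) * beta for g, y in zip(gammas, direct)]


# Illustrative, made-up inputs: the second area has a larger sampling
# variance, so its prediction is pulled further toward the common mean.
predictions = fay_herriot_blup([10.0, 20.0], [1.0, 3.0], 1.0)
print(predictions)
```

In practice the `psi` values are themselves estimated from small area samples, which is the source of the inefficiency and the unaccounted uncertainty that the abstract points to.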