When estimating area means, direct estimators based on area-specific data are usually consistent under the sampling design without model assumptions. However, they are inefficient when the area sample size is small. In small area estimation, model assumptions linking the areas are used to "borrow strength" from other areas. The basic area-level model provides design-consistent estimators, but it assumes that the error variances are known. In practice, they are estimated from the (scarce) area-specific data. These variance estimators are inefficient, and their estimation error is not accounted for in the associated mean squared error estimators. Unit-level models do not require the error variances to be known, but they do not account for the survey design. Here we describe a unified estimator of an area mean that may be obtained from either an area-level model or a unit-level model, and that is based on estimators of the model error variances that are consistent as the number of areas increases. We propose bootstrap mean squared error estimators that account for the uncertainty due to the estimation of the error variances. We show improved performance of the new small area estimators and of our bootstrap estimators of the mean squared error. We apply the results to education data from Colombia.
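For readers unfamiliar with the basic area-level model mentioned above, the following is a sketch of its textbook (Fay-Herriot) form. The notation is an assumption made for illustration and is not necessarily that of the paper: $\hat{\theta}_i$ is the direct estimator for area $i$, $x_i$ a vector of area-level covariates, $\psi_i$ the sampling error variance treated as known, and $\sigma_u^2$ the model error variance.

```latex
\begin{align*}
  \hat{\theta}_i &= \theta_i + e_i, \qquad
  \theta_i = x_i^{\top}\beta + u_i, \qquad i = 1, \dots, m, \\
  u_i &\sim (0, \sigma_u^2), \qquad e_i \sim (0, \psi_i), \qquad
  u_i \text{ and } e_i \text{ independent}, \\
  \tilde{\theta}_i &= \gamma_i \hat{\theta}_i + (1 - \gamma_i)\, x_i^{\top}\tilde{\beta},
  \qquad \gamma_i = \frac{\sigma_u^2}{\sigma_u^2 + \psi_i}.
\end{align*}
```

Here $\tilde{\beta}$ denotes the weighted least squares estimator of $\beta$. In this form, "borrowing strength" appears as the weight $\gamma_i$: when the area sample size is small, $\psi_i$ is large, $\gamma_i$ is close to zero, and the estimator leans on the regression component fitted across all areas. The weight also shows why replacing $\psi_i$ by a noisy area-specific estimate affects both the estimator and its mean squared error.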