When estimating area means, direct estimators based on area-specific data are usually consistent under the sampling design without model assumptions. However, they are inefficient if the area sample size is small. In small area estimation, model assumptions linking the areas are used to "borrow strength" from other areas. The basic area-level model provides design-consistent estimators, but the error variances are assumed to be known. In practice, they are estimated from the (scarce) area-specific data. These variance estimators are inefficient, and their error is not accounted for in the associated mean squared error estimators. Unit-level models do not require the error variances to be known, but they do not account for the survey design. Here we describe a unified estimator of an area mean that may be obtained from either an area-level model or a unit-level model, and that is based on estimators of the model error variances that are consistent as the number of areas increases. We propose bootstrap mean squared error estimators that account for the uncertainty due to the estimation of the error variances. We show improved performance for the new small area estimators and for our bootstrap estimators of the mean squared error. We apply the results to education data from Colombia.
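The "borrowing strength" idea behind the basic area-level model can be illustrated with a minimal sketch. It assumes an intercept-only model and treats both the sampling error variances (`psi`) and the between-area model variance (`sigma2_u`) as known, which is the very assumption questioned above. The function name and the input values are hypothetical, and this is not the unified estimator proposed in the paper.

```python
def composite_estimates(direct, psi, sigma2_u):
    """Shrink direct area estimates toward a common mean.

    Illustrative assumptions (not taken from the paper):
      direct   -- direct estimates of the area means
      psi      -- sampling error variances, treated as known
      sigma2_u -- between-area model variance, treated as known
    Requires sigma2_u + psi[i] > 0 for every area.
    """
    # Shrinkage weight per area: close to 1 when the direct estimate is precise.
    gammas = [sigma2_u / (sigma2_u + p) for p in psi]
    # Weighted least squares estimate of the common mean (intercept-only model).
    weights = [1.0 / (sigma2_u + p) for p in psi]
    mu_hat = sum(w * y for w, y in zip(weights, direct)) / sum(weights)
    # Composite estimate: weighted average of the direct and synthetic parts.
    return [g * y + (1.0 - g) * mu_hat for g, y in zip(gammas, direct)]


# Hypothetical inputs: the noisiest area (largest psi) is shrunk the most.
estimates = composite_estimates(
    direct=[10.0, 14.0, 12.0],
    psi=[4.0, 1.0, 0.25],
    sigma2_u=1.0,
)
```

Areas with a large sampling variance get a small weight on their own direct estimate and lean on the pooled mean. In practice `psi` and `sigma2_u` must be estimated, and the uncertainty this adds is what the bootstrap mean squared error estimators are meant to capture.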