Small area estimation using survey data can be carried out with either a design-based or a model-based inferential approach. Design-based direct estimators are generally preferable because of their consistency, asymptotic normality, and reliance on fewer assumptions. However, when data are sparse at the desired area level, as is often the case when measuring rare events, these direct estimators can have extremely large uncertainty, making a model-based approach preferable. A model-based approach with a random spatial effect borrows information from surrounding areas at the cost of inducing shrinkage. As a result, the estimates may be over-smoothed and, when aggregated, inconsistent with design-based estimates at higher area levels. We propose two unit-level Bayesian models for small area estimation of rare-event prevalence that use design-based direct estimates at a higher area level to improve consistency under aggregation. The model framework is designed to accommodate sparse data obtained from two-stage stratified cluster sampling, which is particularly relevant to applications in low- and middle-income countries. After introducing the model framework and its implementation, we conduct a simulation study to evaluate its properties and apply it to the estimation of the neonatal mortality rate in Zambia, using 2014 Demographic and Health Surveys (DHS) data.