In small area estimation, it is sometimes necessary to use model-based methods to produce estimates in areas with little or no data. In official statistics, we often require that some aggregate of small area estimates agree with a national estimate for internal consistency purposes. Enforcing this agreement is referred to as benchmarking, and while methods currently exist to perform benchmarking, few are ideal for applications with non-normal outcomes and benchmarks with uncertainty. Fully Bayesian benchmarking is a theoretically appealing approach insofar as we can obtain posterior distributions conditional on a benchmarking constraint. However, existing implementations may be computationally prohibitive. In this paper, we critically review benchmarking methods in the context of small area estimation in low- and middle-income countries with binary outcomes and uncertain benchmarks, and propose a novel approach in which an unbenchmarked method that produces area-level samples can be combined with a rejection sampler or Metropolis-Hastings algorithm to produce benchmarked posterior distributions in a computationally efficient way. To illustrate the flexibility and efficiency of our approach, we provide comparisons to an existing benchmarking approach in a simulation, and applications to HIV prevalence and under-5 mortality estimation. Code implementing our methodology is available in the R package stbench.
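To make the proposed rejection-sampling step concrete, the following is a minimal sketch, not the stbench implementation. It assumes the national benchmark is summarised by a normal distribution with a known mean and standard deviation, and that the aggregate is a population-weighted sum of area-level prevalences; under those assumptions, each unbenchmarked posterior draw is kept with probability equal to the normal kernel evaluated at its aggregate. The function name `benchmark_by_rejection`, the weights, the Beta-distributed stand-in draws, and the benchmark values are all hypothetical and chosen only for illustration.

```python
import math
import random


def benchmark_by_rejection(samples, weights, bench_mean, bench_sd, rng):
    """Filter unbenchmarked draws using an assumed normal benchmark likelihood.

    samples: list of draws, each a list of area-level prevalences.
    weights: population shares used to aggregate areas to the national level.
    """
    accepted = []
    for draw in samples:
        # Aggregate the area-level values to the national level.
        aggregate = sum(w * p for w, p in zip(weights, draw))
        # Normal kernel scaled so its maximum is 1, so it is a valid
        # acceptance probability.
        accept_prob = math.exp(-0.5 * ((aggregate - bench_mean) / bench_sd) ** 2)
        if rng.random() < accept_prob:
            accepted.append(draw)
    return accepted


def mean_aggregate(draws, weights):
    """Average weighted aggregate across a collection of draws."""
    total = sum(sum(w * p for w, p in zip(weights, d)) for d in draws)
    return total / len(draws)


rng = random.Random(1)
weights = [0.5, 0.3, 0.2]  # hypothetical population shares for three areas

# Stand-in for posterior draws from any unbenchmarked area-level model.
samples = [
    [rng.betavariate(20, 80), rng.betavariate(30, 70), rng.betavariate(10, 90)]
    for _ in range(5000)
]

# Hypothetical national benchmark: estimate 0.25 with standard error 0.01.
benchmarked = benchmark_by_rejection(samples, weights, 0.25, 0.01, rng)

print(len(benchmarked), "of", len(samples), "draws accepted")
print("unbenchmarked aggregate mean:", mean_aggregate(samples, weights))
print("benchmarked aggregate mean:", mean_aggregate(benchmarked, weights))
```

Because the filter only needs draws from the unbenchmarked model, the model itself never has to be refitted. The acceptance rate falls as the unbenchmarked aggregate moves away from the benchmark or as the benchmark becomes more precise, which is the situation where a Metropolis-Hastings step may be the more practical choice.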