This paper addresses the challenge of obtaining precise demographic information at a fine-grained spatial level, which is needed for planning localized public services such as water distribution networks and for understanding local human impacts on the ecosystem. While population sizes are commonly available for large administrative areas, such as wards in India, practical applications often require population density at smaller spatial scales. We explore the integration of alternative data sources, specifically satellite-derived products, including land cover, land use, street density, building heights, vegetation coverage, and drainage density. Using a case study of Bangalore City, India, with a ward-level population dataset for 198 wards and satellite-derived sources covering 786,702 pixels at a resolution of 30 m × 30 m, we propose a semiparametric Bayesian spatial regression model for obtaining pixel-level population estimates. Given the high dimensionality of the problem, exact Bayesian inference is impractical; we discuss an approximate Bayesian inference scheme based on the recently proposed max-and-smooth approach, which combines a Laplace approximation with Markov chain Monte Carlo. A simulation study indicates that our inferential approach performs reasonably well. Aggregating the pixel-level estimates to the ward level shows that our method captures the spatial distribution of population sizes. While our case study focuses on a demographic application, the methodology developed here applies readily to count-type spatial datasets from other scientific disciplines in which high-resolution alternative data sources are available.
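To make the relationship between pixel-level estimates and ward-level totals concrete, the following is a minimal sketch, not the model proposed in the paper. It assumes a log-linear pixel intensity driven by hypothetical covariates and coefficients (`covariates`, `beta`), sums pixel values within wards, and then applies a simple proportional rescaling so that each ward's pixels add up to its known population. All names, the toy dimensions, and the population figures are illustrative assumptions.

```python
import numpy as np


def aggregate_to_wards(pixel_values, ward_index, n_wards):
    """Sum pixel-level values within each ward (ward_index maps pixel -> ward)."""
    return np.bincount(ward_index, weights=pixel_values, minlength=n_wards)


def rescale_to_ward_totals(pixel_intensity, ward_index, ward_population):
    """Rescale pixel intensities so each ward's pixels sum to its known total.

    This is a plain proportional allocation, used here only for illustration;
    it is not the Bayesian inference scheme described in the abstract.
    """
    ward_population = np.asarray(ward_population, dtype=float)
    ward_sum = aggregate_to_wards(pixel_intensity, ward_index, len(ward_population))
    # Guard against wards whose pixels all have zero intensity.
    scale = np.divide(
        ward_population,
        ward_sum,
        out=np.zeros_like(ward_sum),
        where=ward_sum > 0,
    )
    return pixel_intensity * scale[ward_index]


# Toy data (assumed): 12 pixels in 3 wards, 2 satellite-derived covariates.
rng = np.random.default_rng(0)
n_pixels, n_wards = 12, 3
ward_index = np.repeat(np.arange(n_wards), 4)
covariates = rng.normal(size=(n_pixels, 2))
beta = np.array([0.5, -0.3])  # hypothetical regression coefficients

# Log link: intensity is positive by construction.
pixel_intensity = np.exp(covariates @ beta)
ward_population = np.array([1000.0, 2500.0, 400.0])  # made-up ward totals

pixel_estimates = rescale_to_ward_totals(pixel_intensity, ward_index, ward_population)
ward_check = aggregate_to_wards(pixel_estimates, ward_index, n_wards)
```

In this sketch `ward_check` reproduces `ward_population` by construction, which mirrors the consistency check of mapping pixel-level estimates back to the ward level. A full treatment would replace the fixed `beta` with posterior draws and add a spatial random effect.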