Contextual Geospatial Features for Identifying Informal Environmental-Health Hazards Undetectable from Satellites: A ULAB Case Study

Reliable, scalable detection of informal, small-scale environmental-health hazards, such as used lead-acid battery (ULAB) recycling, household-scale e-waste burning, indoor mercury amalgamation, brick kilns, and small tanneries, remains an unsolved problem. These operations are invisible to satellites and absent from formal registries, yet they disproportionately harm low-income populations in low- and middle-income countries. This paper articulates the problem class and explores a possible response: contextual geospatial features, with case-specific feature design informed by domain expertise. We use ULAB recycling as a demonstration case, drawing on 164 verified sites in Bangladesh and India from Pure Earth's Toxic Sites Identification Programme. At this sample size, five-fold cross-validation on the training set cannot statistically distinguish the engineered contextual features from a simple two-feature socio-demographic baseline. The added value becomes visible only when we evaluate outside the training set. On 172 held-out informal-recycling sites in non-NCR India and Bangladesh, the model assigns scores several times higher than those it assigns to matched random urban controls. On an independent set of 131 regulatory-confirmed formal recyclers, informal sites score materially higher than formal ones in non-NCR India, indicating that the model is picking up informal-recycler-specific structure rather than a generic industrial signal. We frame these results as exploratory rather than confirmatory: label sparsity, gaps in point-of-interest coverage, and untested transfer beyond South Asia all remain open issues. We close with seven open problems and invite the environmental-health and geospatial machine-learning communities to engage with informal-hazard detection as a class of problems worth solving.

