In environmental monitoring, data collection is often costly, sparse, and shaped by urgent public-health needs. This is particularly true for contamination by cancer-causing per- and polyfluoroalkyl substances (PFAS), where discussions with domain experts and environmental organizations highlight the need to strategically identify high-risk, under-observed regions under tight sampling budgets. More broadly, similar challenges arise in disaster response and public-health settings, where dynamic environments make it essential to efficiently uncover hidden targets from limited ground truth. Yet sparse and biased geospatial labels limit the applicability of existing learning-based methods such as reinforcement learning. To address this, we propose a unified geospatial discovery framework that integrates active learning, online meta-learning, and concept-guided reasoning. Our approach introduces two key innovations built on a shared notion of *concept relevance*, which captures how domain-specific factors influence target presence: a *concept-weighted uncertainty sampling strategy*, in which uncertainty is modulated by relevance learned from readily available concepts such as land cover and source proximity; and a *relevance-aware meta-batch formation strategy* that promotes semantic diversity during online meta-updates, improving generalization in dynamic environments. We evaluate our framework on PFAS contamination discovery, a real-world-inspired environmental monitoring task, and demonstrate robust target discovery under limited data and changing conditions.
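To make the idea of concept-weighted uncertainty sampling concrete, the following is a minimal illustrative sketch, not the method as implemented in this work. It assumes a binary target, uses predictive entropy as the uncertainty measure, and assumes relevance enters as a multiplicative weight computed from a linear combination of concept indicators. The function names (`binary_entropy`, `concept_weighted_scores`, `select_queries`), the concept encoding, and the relevance values are all hypothetical.

```python
import numpy as np


def binary_entropy(probs):
    """Predictive entropy (in nats) of a binary classifier's probabilities."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))


def concept_weighted_scores(probs, concepts, relevance):
    """Scale each location's uncertainty by its concept relevance.

    probs:     (n,)   predicted probability that the target is present
    concepts:  (n, k) concept indicators per location (assumed in [0, 1])
    relevance: (k,)   non-negative relevance per concept (assumed given here;
                      in practice it would be learned)
    """
    weights = np.asarray(concepts, dtype=float) @ np.asarray(relevance, dtype=float)
    return binary_entropy(probs) * weights


def select_queries(probs, concepts, relevance, budget):
    """Return indices of the `budget` highest-scoring locations."""
    scores = concept_weighted_scores(probs, concepts, relevance)
    return np.argsort(-scores, kind="stable")[:budget]


# Toy example: 4 candidate locations, 2 hypothetical concepts
# (column 0: near a known source, column 1: a particular land-cover type).
probs = np.array([0.5, 0.5, 0.9, 0.1])
concepts = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
relevance = np.array([0.8, 0.2])

picks = select_queries(probs, concepts, relevance, budget=2)
print(picks.tolist())
```

In this toy setup, locations 0 and 1 are equally uncertain, but location 0 ranks higher because its active concept carries more relevance. Location 3 is never selected because it has no relevant concept, which also shows a limitation of a purely multiplicative weighting: an additive floor on the weight would be one way to keep some exploration of such regions.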