Context-Conditioned Generative Models Enable Subnational Refinement of Sparse Humanitarian Surveys

Data scarcity limits inference in many scientific and policy domains. Survey data are essential for decision-making, but sparse samples often cannot support estimates at fine spatial granularity. We evaluate normalizing flows, a class of generative models that learn complex data distributions and can be conditioned on exogenous contextual features, under controlled data-scarcity scenarios. Across eight household survey datasets from the humanitarian domain, spanning six low- and middle-income countries, we show that context-conditioned generative models can refine subnational survey distributions under severe data scarcity, and that performance increases systematically with the richness of the conditioning information. These findings support a general principle for survey data augmentation: generative models can improve subnational estimates when the sparse sample retains sufficient support and the contextual covariates encode relevant local heterogeneity. By learning full conditional distributions rather than point estimates, the approach provides fine-grained evidence for humanitarian decision-making and resource allocation.
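
To make "learning a full conditional distribution" concrete, the following is a minimal sketch of a one-layer conditional affine flow, the simplest member of the normalizing-flow family. It is an illustration under stated assumptions, not the architecture evaluated in the paper: the shift and log-scale are assumed to be linear in the context vector, and the names `conditional_affine_log_prob`, `conditional_affine_sample`, `w_shift`, and `w_log_scale` are hypothetical.

```python
import math
import random


def _shift_and_log_scale(context, w_shift, w_log_scale):
    """Assumed parameterization: both quantities are linear in the context."""
    shift = sum(w * c for w, c in zip(w_shift, context))
    log_scale = sum(w * c for w, c in zip(w_log_scale, context))
    return shift, log_scale


def conditional_affine_log_prob(x, context, w_shift, w_log_scale):
    """Log-density of scalar x given context under the affine flow.

    The flow maps x to z = (x - shift) / scale, with z following a
    standard normal base density.
    """
    shift, log_scale = _shift_and_log_scale(context, w_shift, w_log_scale)
    z = (x - shift) * math.exp(-log_scale)
    base_log_prob = -0.5 * (z * z + math.log(2.0 * math.pi))
    # Change of variables: log|dz/dx| = -log_scale.
    return base_log_prob - log_scale


def conditional_affine_sample(context, w_shift, w_log_scale, rng):
    """Draw one x given context by inverting the flow: x = shift + scale * z."""
    shift, log_scale = _shift_and_log_scale(context, w_shift, w_log_scale)
    return shift + math.exp(log_scale) * rng.gauss(0.0, 1.0)


# Two hypothetical regions with different contextual covariates.
rng = random.Random(0)
w_shift, w_log_scale = [2.0, -1.0], [0.0, 0.5]
region_a = [conditional_affine_sample([1.0, 0.0], w_shift, w_log_scale, rng)
            for _ in range(5)]
region_b = [conditional_affine_sample([0.0, 1.0], w_shift, w_log_scale, rng)
            for _ in range(5)]
```

Because the context changes both the location and the spread of the output, sampling repeatedly for a given region yields a whole distribution rather than a single point estimate. In practice the weights would be fitted by maximizing the log-likelihood over the sparse survey sample, and several such layers with nonlinear conditioners would be stacked.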

