Context-Conditioned Generative Models Enable Subnational Refinement of Sparse Humanitarian Surveys

Data scarcity limits inference in many scientific and policy domains. Survey data are essential for decision-making, but sparse samples often fail to capture fine spatial granularities. We evaluate normalizing flows, a class of generative models that learn complex data distributions and can be conditioned on exogenous contextual features, in controlled data-scarcity scenarios. Across eight household survey datasets from the humanitarian domain, spanning six low- or middle-income countries, we show that context-conditioned generative models can refine subnational survey distributions under severe data scarcity, and that performance increases systematically with the richness of the conditioning information. These findings support a general principle for survey data augmentation: generative models can improve subnational estimates when the sparse sample retains sufficient support and the contextual covariates encode relevant local heterogeneity. By learning full conditional distributions rather than point estimates, the approach provides fine-grained evidence for humanitarian decision-making and resource allocation.
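
For readers unfamiliar with the model class, the standard change-of-variables identity behind a conditional normalizing flow is sketched below. The notation is an assumption introduced here for illustration and is not taken from the paper: x denotes a survey response vector, c the contextual covariates for its subnational unit, f_θ an invertible map in x for each fixed c, and p_Z a simple base density such as a standard Gaussian.

```latex
% Conditional density implied by an invertible map z = f_\theta(x; c)
\log p_\theta(x \mid c)
  = \log p_Z\!\bigl(f_\theta(x; c)\bigr)
  + \log \left| \det \frac{\partial f_\theta(x; c)}{\partial x} \right|

% Maximum-likelihood training over N observed (response, context) pairs
\hat{\theta} = \arg\max_{\theta} \sum_{i=1}^{N} \log p_\theta(x_i \mid c_i)
```

Under these assumptions, sampling for a given region amounts to drawing z from p_Z and applying the inverse map for that region's c, which is what yields a full conditional distribution rather than a point estimate.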

