A Class of Models for Large Zero-inflated Spatial Data

Spatially correlated data with an excess of zeros, usually referred to as zero-inflated spatial data, arise in many disciplines. Examples include count data, such as the abundance (or absence) of animal species and disease counts, and semi-continuous data, such as observed precipitation. Spatial two-part models are a flexible class of models for such data. Fitting two-part models to large data sets can be computationally expensive because of high-dimensional dependent latent variables, costly matrix operations, and slow-mixing Markov chains. We describe a flexible, computationally efficient approach for modeling large zero-inflated spatial data using the projection-based intrinsic conditional autoregression (PICAR) framework. We study our approach, which we call PICAR-Z, through extensive simulation studies and two environmental data sets. Our results suggest that PICAR-Z provides accurate predictions while remaining computationally efficient. An important goal of our work is to allow researchers who are not experts in computation to easily build computationally efficient extensions to zero-inflated spatial models; this also allows for a more thorough exploration of modeling choices in two-part models than was previously possible. We show that PICAR-Z is easy to implement and extend in popular probabilistic programming languages such as nimble and Stan.
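As a rough illustration of what a spatial two-part model generates, the sketch below simulates semi-continuous zero-inflated data: one process decides whether a value is nonzero, and a second process determines the positive amount. This is not the PICAR-Z implementation. The function name `simulate_two_part`, the Gaussian radial basis, the log-normal choice for positive values, and all parameter values are assumptions made only for this example; the basis here merely stands in for a generic low-rank representation of the spatial random effects.

```python
import numpy as np


def simulate_two_part(n=500, n_basis=10, seed=0):
    """Simulate semi-continuous zero-inflated spatial data (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Random observation locations and basis centers on the unit square.
    coords = rng.uniform(0.0, 1.0, size=(n, 2))
    knots = rng.uniform(0.0, 1.0, size=(n_basis, 2))

    # Hypothetical low-rank basis: Gaussian radial functions (n x n_basis).
    d2 = ((coords[:, None, :] - knots[None, :, :]) ** 2).sum(axis=2)
    basis = np.exp(-d2 / (2.0 * 0.2 ** 2))

    # Separate basis coefficients for the two parts of the model.
    delta_occ = rng.normal(0.0, 1.0, size=n_basis)
    delta_pos = rng.normal(0.0, 0.5, size=n_basis)

    # Part 1: occurrence (zero vs. nonzero) through a logistic link.
    eta_occ = -0.5 + basis @ delta_occ
    prob_nonzero = 1.0 / (1.0 + np.exp(-eta_occ))
    occur = rng.random(n) < prob_nonzero

    # Part 2: positive amounts, log-normal given occurrence.
    log_mean = 1.0 + basis @ delta_pos
    y = np.zeros(n)
    noise = rng.normal(0.0, 0.3, size=int(occur.sum()))
    y[occur] = np.exp(log_mean[occur] + noise)

    return coords, y, occur


if __name__ == "__main__":
    coords, y, occur = simulate_two_part()
    print("share of zeros:", float(np.mean(y == 0.0)))
```

Because both parts share the same low-dimensional basis, the number of latent coefficients stays fixed at `n_basis` per part no matter how many locations are observed. That is the general reason basis-projection approaches scale to large data sets.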

