Human migration exhibits complex spatiotemporal dependence driven by environmental and socioeconomic forces. Modeling such patterns at scale requires methods that accommodate many random effects while remaining feasible when raw data or large design matrices cannot be freely shared across distributed nodes. We develop a communication-efficient inference framework for Varying Coefficient Mixed Models (VCMMs) with flexible mean structures and large correlated random-effect components. Using a Bayesian hierarchical representation of penalized splines, we derive sufficient statistics that preserve each node's likelihood contribution and recover the estimator from the full data under unrestricted communication. Under communication constraints, these statistics support a one-step communication-efficient estimator with first-order efficiency. An SVD-enhanced implementation stabilizes large or ill-conditioned random-effect covariance operators. Theory establishes likelihood preservation, convergence, asymptotic efficiency, and finite-sample concentration. Simulations and U.S. migration-flow data demonstrate accuracy, scalability, and recovery of dynamic spatial patterns.
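The idea that node-level sufficient statistics can reproduce the full-data estimator can be illustrated with a minimal sketch. It assumes a Gaussian working model with a fixed, known penalty weight, which is far simpler than the VCMM setting described above and is not the authors' algorithm. The names `node_summary`, `combine_and_solve`, `penalty`, and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np

def node_summary(X, y):
    """Statistics one node would transmit (assumption: Gaussian working model)."""
    return X.T @ X, X.T @ y, float(y @ y), X.shape[0]

def combine_and_solve(summaries, penalty, lam):
    """Sum the node summaries and solve the penalized normal equations."""
    XtX = sum(s[0] for s in summaries)
    Xty = sum(s[1] for s in summaries)
    return np.linalg.solve(XtX + lam * penalty, Xty)

rng = np.random.default_rng(0)
p, lam = 6, 0.5
# Hypothetical penalty: first two coefficients unpenalized (fixed effects),
# remaining four shrunk like spline or random-effect coefficients.
penalty = np.diag([0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
beta_true = rng.normal(size=p)

# Simulate three nodes holding disjoint subsets of the data.
nodes = []
for n_k in (40, 55, 70):
    X = rng.normal(size=(n_k, p))
    y = X @ beta_true + 0.1 * rng.normal(size=n_k)
    nodes.append((X, y))

# Distributed estimate: only p x p and p x 1 summaries leave each node.
beta_dist = combine_and_solve([node_summary(X, y) for X, y in nodes], penalty, lam)

# Pooled estimate computed from the stacked raw data, for comparison.
X_all = np.vstack([X for X, _ in nodes])
y_all = np.concatenate([y for _, y in nodes])
beta_pooled = np.linalg.solve(X_all.T @ X_all + lam * penalty, X_all.T @ y_all)
```

In this simplified setting the two estimates agree up to floating-point error, because the penalized normal equations depend on the data only through the summed cross-products. Each node transmits a number of values that depends on `p` rather than on its sample size.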