Gaussian processes (GPs) provide a flexible and statistically principled foundation for modelling spatiotemporal phenomena, but exact inference scales as $O(N^3)$ in the number of data points $N$, which makes them intractable for large datasets. Approximate methods such as variational inference (VI), inducing-point (sparse) GPs, low-rank kernel approximations (e.g., Nyström methods and random Fourier features), and the integrated nested Laplace approximation (INLA) improve scalability but typically trade off accuracy, calibration, or modelling flexibility. We introduce DeepRV, a neural-network surrogate that replaces GP prior sampling. It closely matches full GP accuracy at inference, including hyperparameter estimates, while reducing computational complexity to $O(N^2)$, which improves scalability and inference speed. DeepRV serves as a drop-in replacement for GP prior realisations in, for example, MCMC-based probabilistic programming pipelines, preserving full model flexibility. Across simulated benchmarks, non-separable spatiotemporal GPs, and a real-world application to education deprivation in London (n = 4,994 locations), DeepRV achieves the highest fidelity to exact GPs while substantially accelerating inference. Code is provided in the dl4bi Python package, and all experiments were run on a single consumer-grade GPU to ensure accessibility for practitioners.
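The cost contrast between exact GP prior sampling and a neural surrogate can be sketched as follows. This is a minimal illustration under stated assumptions, not the DeepRV architecture and not the dl4bi API: the function names (`rbf_kernel`, `sample_gp_prior`, `surrogate_decoder`), the squared-exponential kernel, the two-layer dense network, and the random untrained weights are all hypothetical choices made only to show where the $O(N^3)$ and $O(N^2)$ costs arise.

```python
import numpy as np


def rbf_kernel(x, lengthscale, variance=1.0):
    """Squared-exponential kernel matrix for 1-D locations x of shape [N]."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / lengthscale**2)


def sample_gp_prior(x, lengthscale, z, jitter=1e-5):
    """Exact GP prior draw f = L z with K = L L^T.

    The Cholesky factorisation is the O(N^3) step, and it must be redone
    whenever the kernel hyperparameters change (e.g. at every MCMC step).
    """
    K = rbf_kernel(x, lengthscale) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    return L @ z


def surrogate_decoder(z, lengthscale, params):
    """Hypothetical surrogate: maps (z, lengthscale) to a draw f.

    Only dense matrix-vector products are used, so one draw costs O(N^2)
    when the hidden width is proportional to N. No factorisation is needed.
    """
    W1, b1, W2, b2 = params
    h = np.tanh(W1 @ np.append(z, lengthscale) + b1)
    return W2 @ h + b2


rng = np.random.default_rng(0)
N = 200
x = np.linspace(0.0, 1.0, N)
z = rng.standard_normal(N)  # standard-normal latent shared by both routes

f_exact = sample_gp_prior(x, 0.2, z)

# Assumption: random, untrained weights stand in for a trained surrogate.
# They show the computational shape only; the output is not a GP draw.
params = (
    rng.standard_normal((N, N + 1)) / np.sqrt(N + 1),
    np.zeros(N),
    rng.standard_normal((N, N)) / np.sqrt(N),
    np.zeros(N),
)
f_surrogate = surrogate_decoder(z, 0.2, params)
```

In an MCMC pipeline, the exact route refactorises the kernel matrix each time the lengthscale is updated, whereas a trained surrogate of this general shape would only need a forward pass conditioned on the new hyperparameter value.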