Gaussian processes (GPs) are sophisticated distributions for modeling functional data. Although theoretically appealing, they are computationally expensive for all but small datasets. We implement two methods for scaling GP inference in Stan: first, a general sparse approximation based on a directed acyclic dependency graph; second, a fast, exact method, based on the fast Fourier transform, for regularly spaced data modeled by GPs with stationary kernels. Using benchmark experiments, we offer guidance to help practitioners choose between methods and parameterizations, and we illustrate the package with two real-world examples. The implementation follows Stan's design and exposes performant inference through a familiar interface. Full posterior inference for ten thousand data points is feasible on a laptop in less than 20 seconds.
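To make the idea behind the Fourier approach concrete, the following is a minimal sketch in Python, not the package's Stan implementation. It assumes a periodic kernel on a regular grid, so that the covariance matrix is exactly circulant and its eigenvalues are the discrete Fourier transform of its first row. The function names (`fft`, `periodic_kernel_row`, `circulant_gp_logpdf`), the kernel choice, and the default parameter values are illustrative assumptions, and the toy FFT requires a power-of-two length.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 Cooley-Tukey FFT (illustrative; power-of-two lengths only)."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    if n % 2:
        raise ValueError("length must be a power of two")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def periodic_kernel_row(n, sigma=1.0, length_scale=0.5, noise=0.1):
    """First row of the circulant covariance for n regularly spaced points.

    Assumed kernel: sigma^2 * exp(-2 sin^2(pi d) / length_scale^2) on the
    unit interval, plus independent observation noise on the diagonal.
    """
    row = [
        sigma**2 * math.exp(-2.0 * math.sin(math.pi * j / n) ** 2 / length_scale**2)
        for j in range(n)
    ]
    row[0] += noise**2
    return row


def circulant_gp_logpdf(y, first_row):
    """Log density of y ~ N(0, C), where C is circulant with the given first row.

    The DFT diagonalizes C, so the log determinant and the quadratic form
    reduce to sums over Fourier coefficients: O(n log n) instead of O(n^3).
    """
    n = len(y)
    # Eigenvalues of a symmetric circulant matrix are real.
    eigenvalues = [v.real for v in fft(first_row)]
    y_hat = fft(y)
    log_det = sum(math.log(lam) for lam in eigenvalues)
    # The factor 1/n accounts for the unnormalized DFT convention.
    quad = sum(abs(c) ** 2 / lam for c, lam in zip(y_hat, eigenvalues)) / n
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


if __name__ == "__main__":
    n = 1024
    row = periodic_kernel_row(n)
    y = [math.sin(2.0 * math.pi * i / n) for i in range(n)]
    print(circulant_gp_logpdf(y, row))
```

The sketch only evaluates a log density for fixed kernel parameters. Full posterior inference, as described above, would wrap such an evaluation in a sampler, and non-periodic data would additionally require padding, neither of which is shown here.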