Gaussian processes (GPs) have emerged as a prominent technique for machine learning and signal processing. A key component in GP modeling is the choice of kernel, and linear multiple kernels (LMKs) have become an attractive kernel class owing to their modeling capacity and interpretability. This paper focuses on the grid spectral mixture (GSM) kernel, an LMK that can approximate arbitrary stationary kernels. Specifically, we propose a novel GSM kernel formulation for multi-dimensional data that uses fewer hyper-parameters than existing formulations while retaining a favorable optimization structure and approximation capability. To make large-scale hyper-parameter optimization of the GSM kernel tractable, we first introduce a distributed successive convex approximation (SCA) algorithm, termed DSCA. Building on this, we propose the doubly distributed SCA (D$^2$SCA) algorithm based on the alternating direction method of multipliers (ADMM) framework, which allows the GSM kernel to be learned cooperatively in big-data settings while maintaining data privacy. Furthermore, we address the communication bandwidth restriction inherent in distributed frameworks by quantizing the hyper-parameters in D$^2$SCA, yielding the quantized doubly distributed SCA (QD$^2$SCA) algorithm. Theoretical analysis establishes convergence guarantees for the proposed algorithms, and experiments on diverse datasets demonstrate the superior prediction performance and efficiency of our methods.
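To make the "linear multiple kernel" structure concrete, the following is a minimal sketch of a one-dimensional GSM-style kernel, assuming the standard spectral mixture sub-kernel form $\exp(-2\pi^2\tau^2 v)\cos(2\pi\mu_i\tau)$ with frequencies $\mu_i$ fixed on a grid and a shared variance $v$. It is not the multi-dimensional formulation proposed in the paper; the function name `gsm_kernel_1d` and all numerical values are illustrative assumptions. The point it shows is that only the non-negative weights enter as hyper-parameters, and they enter linearly.

```python
import numpy as np

def gsm_kernel_1d(x1, x2, weights, freqs, variance):
    """Weighted sum of fixed spectral mixture sub-kernels (illustrative sketch).

    weights  : non-negative weights, one per sub-kernel (the learnable part)
    freqs    : fixed grid of frequencies mu_i (assumed, not learned)
    variance : fixed shared spectral variance v (assumed, not learned)
    """
    tau = x1[:, None] - x2[None, :]                     # pairwise lags, shape (n1, n2)
    envelope = np.exp(-2.0 * np.pi**2 * tau**2 * variance)
    # Stack of sub-kernel Gram matrices, shape (m, n1, n2).
    sub = envelope[None, :, :] * np.cos(
        2.0 * np.pi * freqs[:, None, None] * tau[None, :, :]
    )
    # Linear combination over the first axis: K = sum_i weights[i] * sub[i].
    return np.tensordot(weights, sub, axes=1)

x = np.linspace(0.0, 1.0, 5)
freqs = np.linspace(0.0, 2.0, 4)           # hypothetical frequency grid
weights = np.array([0.5, 0.0, 1.5, 0.0])   # sparse, non-negative weights
K = gsm_kernel_1d(x, x, weights, freqs, variance=0.01)
```

Because every sub-kernel equals 1 at zero lag, the diagonal of `K` is simply the sum of the weights, and setting a weight to zero removes the corresponding frequency component from the model.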