Gaussian Process Regression (GPR) is a powerful and elegant method for learning complex functions from noisy data, with a wide range of applications that include safety-critical domains. Such applications have two key features: (i) they require rigorous error quantification, and (ii) the noise is often bounded and non-Gaussian due to, e.g., physical constraints. While error bounds for GPR in the presence of non-Gaussian noise exist, they tend to be overly restrictive and conservative in practice. In this paper, we provide novel error bounds for GPR under bounded-support noise. Specifically, by relying on concentration inequalities and assuming that the latent function has low complexity in the reproducing kernel Hilbert space (RKHS) corresponding to the GP kernel, we derive both probabilistic and deterministic bounds on the GPR error. We show that these bounds are substantially tighter than existing state-of-the-art bounds and are particularly well-suited for GPR with neural network kernels, i.e., Deep Kernel Learning (DKL). Furthermore, motivated by applications in safety-critical domains, we illustrate how these bounds can be combined with stochastic barrier functions to quantify the safety probability of an unknown dynamical system from finite data. We validate our approach through several benchmarks and comparisons against existing bounds. The results show that our bounds are consistently smaller, and that DKL can produce error bounds tighter than the sample noise, significantly improving the safety probability of control systems.