Self-Distillation for Gaussian Process Regression and Classification

We propose two approaches to extend the notion of knowledge distillation to Gaussian Process Regression (GPR) and Gaussian Process Classification (GPC): data-centric and distribution-centric. The data-centric approach resembles most current distillation techniques for machine learning and refits a model on deterministic predictions from the teacher, while the distribution-centric approach reuses the full probabilistic posterior for the next iteration. By analyzing the properties of these approaches, we show that the data-centric approach for GPR closely relates to known results for self-distillation of kernel ridge regression, and that the distribution-centric approach for GPR corresponds to ordinary GPR with a very particular choice of hyperparameters. Furthermore, we demonstrate that the distribution-centric approach for GPC approximately corresponds to data duplication and a particular scaling of the covariance, and that the data-centric approach for GPC requires redefining the model from a Binomial likelihood to a continuous Bernoulli likelihood to be well-specified. To the best of our knowledge, our proposed approaches are the first to formulate knowledge distillation specifically for Gaussian Process models.
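The data-centric idea for GPR can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a fixed squared-exponential kernel, a fixed noise variance, and that each round simply refits on the previous round's posterior mean at the training inputs. The names `rbf_kernel`, `gpr_mean_at_train`, and `data_centric_self_distill` are hypothetical and chosen for illustration only.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=1.0):
    """Squared-exponential kernel between two 1-D input arrays (assumed kernel)."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * sq_dist / lengthscale**2)

def gpr_mean_at_train(K, y, noise_var):
    """GPR posterior mean at the training inputs: K (K + noise_var * I)^{-1} y."""
    n = K.shape[0]
    return K @ np.linalg.solve(K + noise_var * np.eye(n), y)

def data_centric_self_distill(x, y, steps=3, noise_var=0.1):
    """Refit on the previous fit's deterministic mean predictions, `steps` times."""
    K = rbf_kernel(x, x)
    targets = [y]
    for _ in range(steps):
        # The student's training targets are the teacher's mean predictions.
        targets.append(gpr_mean_at_train(K, targets[-1], noise_var))
    return targets

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(20)

targets = data_centric_self_distill(x, y)
norms = [float(np.linalg.norm(t)) for t in targets]
print(norms)
```

Under these assumptions each round applies the same smoothing operator, whose eigenvalues lie in [0, 1), so the norm of the targets shrinks from round to round. This matches the kind of progressive regularization known from self-distillation of kernel ridge regression, which the abstract refers to.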

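Related content

Gaussian process

A Gaussian process (GP) is a type of stochastic process in probability theory and mathematical statistics: a collection of normally distributed random variables indexed by an index set. Any finite linear combination of the random variables in a Gaussian process is normally distributed, and every finite-dimensional distribution is a joint normal distribution. On a continuous index set, the process as a whole is described by a Gaussian measure over all of its random variables, so it is regarded as the infinite-dimensional generalization of the joint normal distribution. A Gaussian process is fully determined by its mean function and covariance function, and it inherits many properties of the normal distribution.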
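As a small illustration of the statement that a GP is fully determined by its mean and covariance functions, the sketch below evaluates both on a finite set of index points and draws from the resulting joint normal distribution. The zero mean, the squared-exponential covariance, the lengthscale, and the jitter term are all assumptions made for this example.

```python
import numpy as np

def mean_fn(x):
    """Assumed zero mean function."""
    return np.zeros_like(x)

def cov_fn(a, b, lengthscale=0.5):
    """Assumed squared-exponential covariance function."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / lengthscale**2)

# A finite subset of the index set: five points in [0, 1].
x = np.linspace(0.0, 1.0, 5)
mu = mean_fn(x)
# Small jitter on the diagonal keeps the matrix numerically positive definite.
cov = cov_fn(x, x) + 1e-9 * np.eye(len(x))

# The GP restricted to these points is a 5-dimensional joint normal.
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mu, cov, size=1000)
print(samples.shape)
```

Choosing a different finite set of points only changes `mu` and `cov`; the mean and covariance functions themselves stay the same.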