Neural processes are a family of models that use neural networks to directly parametrise a map from data sets to predictions. Directly parametrising this map enables the use of expressive neural networks in small-data problems where neural networks would traditionally overfit. Neural processes can produce well-calibrated uncertainties, effectively deal with missing data, and are simple to train. These properties make this family of models appealing for a broad range of application areas, such as healthcare or environmental sciences. This thesis advances neural processes in three ways. First, we propose convolutional neural processes (ConvNPs). ConvNPs improve the data efficiency of neural processes by building in a symmetry called translation equivariance. To do so, ConvNPs rely on convolutional neural networks rather than multi-layer perceptrons. Second, we propose Gaussian neural processes (GNPs). GNPs directly parametrise dependencies in the predictions of a neural process. Current approaches to modelling dependencies in the predictions depend on a latent variable, which consequently requires approximate inference, undermining the simplicity of the approach. Third, we propose autoregressive conditional neural processes (AR CNPs). AR CNPs train a neural process without any modifications to the model or training procedure and, at test time, roll out the model in an autoregressive fashion. AR CNPs equip the neural process framework with a new knob with which modelling complexity and computational expense at training time can be traded for computational expense at test time. In addition to methodological advancements, this thesis also proposes a software abstraction that enables a compositional approach to implementing neural processes. This approach allows the user to rapidly explore the space of neural process models by putting together elementary building blocks in different ways.