We introduce semi-parametric inducing point networks (SPIN), a general-purpose architecture that can query the training set at inference time in a compute-efficient manner. Semi-parametric architectures are typically more compact than parametric models, but their computational complexity is often quadratic. In contrast, SPIN attains linear complexity via a cross-attention mechanism between datapoints inspired by inducing point methods. Querying large training sets can be particularly useful in meta-learning, as it unlocks additional training signal, but it often exceeds the scaling limits of existing models. We use SPIN as the basis of the Inducing Point Neural Process, a probabilistic model that supports large contexts in meta-learning and achieves high accuracy where existing models fail. In our experiments, SPIN reduces memory requirements, improves accuracy across a range of meta-learning tasks, and improves on state-of-the-art performance on an important practical problem, genotype imputation.
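To make the complexity claim concrete, the following is a minimal sketch of cross-attention between a small set of inducing points and the training datapoints. It is an illustration under stated assumptions, not the SPIN implementation: the attention here is single-head with no learned projections, the inducing points are random stand-ins for learned ones, and all names and sizes (`cross_attention`, `n_inducing`, `dim`) are hypothetical. The sketch counts query-key scores to show that the cost grows as n × m for n datapoints and m inducing points, rather than as n² for full attention between datapoints.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Single-head dot-product cross-attention without learned projections.

    Each query attends over all rows of keys_values (used as both keys and
    values). Returns the attended outputs and the number of query-key scores
    computed, which is len(queries) * len(keys_values).
    """
    dim = len(keys_values[0])
    scale = 1.0 / math.sqrt(dim)
    outputs = []
    n_scores = 0
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        n_scores += len(scores)
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, keys_values))
            for j in range(dim)
        ])
    return outputs, n_scores


def random_points(count, dim, rng):
    """Draw `count` Gaussian vectors of length `dim` (illustrative data only)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


rng = random.Random(0)
dim, n_inducing = 4, 8  # hypothetical sizes, chosen only for illustration

# Assumption: random vectors stand in for learned inducing points.
inducing = random_points(n_inducing, dim, rng)

cost_by_size = {}
for n_train in (100, 200, 400):
    train = random_points(n_train, dim, rng)
    # Summarise the training set: m inducing points attend over n datapoints.
    summary, n_scores = cross_attention(inducing, train)
    cost_by_size[n_train] = n_scores

# A query datapoint then attends over the m summaries, not the n datapoints.
query = random_points(1, dim, rng)
query_features, query_scores = cross_attention(query, summary)

print(cost_by_size)  # {100: 800, 200: 1600, 400: 3200}
print(query_scores)  # 8
```

Doubling the training set doubles the number of scores, whereas full attention between 400 datapoints would need 400² = 160,000 scores. At query time the cost depends only on the number of inducing points, which is what lets the training set be consulted at inference without a quadratic blow-up.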