Deep learning requires large amounts of data to learn new tasks well, limiting its applicability to domains where such data is available. Meta-learning overcomes this limitation by learning how to learn. In 2001, Hochreiter et al. showed that an LSTM trained with backpropagation across different tasks is capable of meta-learning. Despite promising results on small problems and, more recently, on reinforcement learning problems, this approach has received little attention in the supervised few-shot learning setting. We revisit this approach and test it on modern few-shot learning benchmarks. We find that the LSTM, surprisingly, outperforms the popular meta-learning technique MAML on a simple few-shot sine wave regression benchmark but, as expected, falls short on more complex few-shot image classification benchmarks. We identify two potential causes and propose a new method called Outer Product LSTM (OP-LSTM) that resolves these issues and displays substantial performance gains over the plain LSTM. Compared to popular meta-learning baselines, OP-LSTM yields competitive performance on within-domain few-shot image classification and performs better in cross-domain settings by 0.5% to 1.9% in accuracy score. While these results alone do not set a new state of the art, the advances of OP-LSTM are orthogonal to other advances in the field of meta-learning and yield new insights into how LSTMs work in image classification, opening up a range of new research directions. For reproducibility, we make all our research code publicly available.