Meta-Learning with a Geometry-Adaptive Preconditioner

Source: arXiv. Accepted at CVPR 2023. This is an extended version of our previous CVPR 2023 work. Code is available at: https://github.com/Suhyun777/CVPR23-GAP

Model-agnostic meta-learning (MAML) is one of the most successful meta-learning algorithms. It has a bi-level optimization structure in which the outer loop learns a shared initialization and the inner loop optimizes task-specific weights. Although MAML relies on standard gradient descent in the inner loop, recent studies have shown that controlling the inner loop's gradient descent with a meta-learned preconditioner can be beneficial. Existing preconditioners, however, cannot simultaneously adapt in a task-specific and path-dependent way. Additionally, they do not satisfy the Riemannian metric condition, which can enable steepest-descent learning with a preconditioned gradient. In this study, we propose Geometry-Adaptive Preconditioned gradient descent (GAP), which can overcome these limitations in MAML: GAP can efficiently meta-learn a preconditioner that depends on task-specific parameters, and its preconditioner can be shown to be a Riemannian metric. Thanks to these two properties, the geometry-adaptive preconditioner is effective for improving the inner-loop optimization. Experimental results show that GAP outperforms the state-of-the-art MAML family and the preconditioned gradient descent-MAML (PGD-MAML) family in a variety of few-shot learning tasks. Code is available at: https://github.com/Suhyun777/CVPR23-GAP.
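To make the idea of a preconditioned inner loop concrete, here is a minimal NumPy sketch of the generic update theta <- theta - alpha * P @ grad on a toy least-squares task. It is not the GAP method from the paper: the preconditioner P below is a fixed, randomly generated symmetric positive-definite matrix (positive definiteness being the property a Riemannian metric requires), whereas GAP meta-learns a preconditioner that depends on the task-specific parameters. The function names, the toy task, and the step-size rule are all assumptions made for illustration.

```python
import numpy as np

def loss(theta, A, b):
    """Toy task loss: 0.5 * ||A @ theta - b||^2 (a stand-in for a few-shot task loss)."""
    r = A @ theta - b
    return 0.5 * float(r @ r)

def grad(theta, A, b):
    """Gradient of the toy loss with respect to theta."""
    return A.T @ (A @ theta - b)

def inner_loop(theta0, A, b, P, alpha, steps=5):
    """Preconditioned gradient descent: theta <- theta - alpha * P @ grad."""
    theta = theta0.copy()
    for _ in range(steps):
        theta = theta - alpha * (P @ grad(theta, A, b))
    return theta

rng = np.random.default_rng(0)
A = rng.normal(size=(8, 3))   # hypothetical task data
b = rng.normal(size=8)
theta0 = np.zeros(3)          # stands in for the meta-learned shared initialization

# Hypothetical preconditioner: M @ M.T + I is symmetric positive definite.
M = rng.normal(size=(3, 3))
P = M @ M.T + np.eye(3)

# Conservative step size (an assumption of this sketch) so each step lowers the loss.
alpha = 1.0 / (np.linalg.norm(P, 2) * np.linalg.norm(A, 2) ** 2)

theta_adapted = inner_loop(theta0, A, b, P, alpha)
print(loss(theta0, A, b), loss(theta_adapted, A, b))
```

Setting P to the identity matrix recovers the plain gradient descent that standard MAML uses in its inner loop, which is why the preconditioner can be viewed as the only piece that changes.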

MAML (Model-Agnostic Meta-Learning) is one of the classic meta-learning algorithms. It was introduced in the paper "Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks": https://arxiv.org/abs/1703.03400
