Lifelong few-shot customization for text-to-image diffusion aims to continually adapt existing models to new tasks with minimal data while preserving old knowledge. Current customization diffusion models excel at few-shot tasks but suffer from catastrophic forgetting in lifelong generation. In this study, we identify and categorize catastrophic forgetting into two types: relevant concepts forgetting and previous concepts forgetting. To address these challenges, we first devise a data-free knowledge distillation strategy to tackle relevant concepts forgetting. Unlike existing methods that rely on additional real data or offline replay of original concept data, our approach enables on-the-fly knowledge distillation that retains previous concepts while learning new ones, without accessing any previous data. Second, we develop an In-Context Generation (ICGen) paradigm that conditions the diffusion model on the input visual context, which facilitates few-shot generation and mitigates previous concepts forgetting. Extensive experiments show that the proposed Lifelong Few-Shot Diffusion (LFS-Diffusion) method produces high-quality and accurate images while preserving previously learned knowledge.