Class-Incremental Learning (CIL) is a practical and challenging problem for achieving general artificial intelligence. Recently, Pre-Trained Models (PTMs) have led to breakthroughs in both visual and natural language processing tasks. Although recent studies show that PTMs have the potential to learn sequentially, a plethora of work indicates that alleviating the catastrophic forgetting of PTMs remains necessary. Through a pilot study and a causal analysis of CIL, we reveal that the crux lies in the imbalanced causal effects between new and old data. Specifically, the new data encourage models to adapt to new classes while hindering the adaptation to old classes. Similarly, the old data encourage models to adapt to old classes while hindering the adaptation to new classes. In other words, the adaptation to new classes and the adaptation to old classes conflict from the causal perspective. To alleviate this problem, we propose Balancing the Causal Effects (BaCE) in CIL. Concretely, BaCE proposes two objectives for building causal paths from both new and old data to the prediction of new and old classes, respectively. In this way, the model is encouraged to adapt to all classes with causal effects from both new and old data, which alleviates the causal imbalance problem. We conduct extensive experiments on continual image classification, continual text classification, and continual named entity recognition. Empirical results show that BaCE outperforms a series of CIL methods on different tasks and settings.