Continual relation extraction (CRE) models aim to handle newly emerging relations in streaming data while avoiding catastrophic forgetting of old ones. Although previous CRE studies have shown improvements, most of them adopt only a vanilla strategy when the model first learns representations of new relations. In this work, we point out that training with this vanilla strategy leaves two typical biases, classifier bias and representation bias, which cause the knowledge the model previously learned to be overshadowed. To alleviate these biases, we propose a simple yet effective classifier decomposition framework that splits the last FFN layer into separate previous and current classifiers, so as to maintain previous knowledge and encourage the model to learn more robust representations at this training stage. Experimental results on two standard benchmarks show that our proposed framework consistently outperforms state-of-the-art CRE models, which indicates that the importance of the first training stage to CRE models may be underestimated. Our code is available at https://github.com/hemingkx/CDec.
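The idea of splitting the last layer into previous and current classifiers can be illustrated with a small, dependency-free sketch. It is not the released CDec implementation. The class name `DecomposedClassifier`, the bias-free linear head, the choice to freeze the previous classifier while only the current one receives gradient updates, and the toy numbers are all assumptions made for illustration.

```python
import math
import random


def linear(weights, x):
    """Logits of a bias-free linear layer with one weight row per relation."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


class DecomposedClassifier:
    """Hypothetical head split into a previous and a current classifier."""

    def __init__(self, prev_weights, n_new, dim, seed=0):
        rng = random.Random(seed)
        # Previous classifier: rows for old relations (assumed frozen here).
        self.prev = [list(row) for row in prev_weights]
        # Current classifier: freshly initialised rows for new relations.
        self.cur = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                    for _ in range(n_new)]

    def logits(self, x):
        # Old-relation logits first, then new-relation logits.
        return linear(self.prev, x) + linear(self.cur, x)

    def sgd_step(self, x, label, lr=0.1):
        """One cross-entropy SGD step that updates only the current rows.

        Returns the loss measured before the update.
        """
        probs = softmax(self.logits(x))
        n_prev = len(self.prev)
        for j, row in enumerate(self.cur):
            # d(loss)/d(logit) = p - y for softmax cross-entropy.
            grad = probs[n_prev + j] - (1.0 if n_prev + j == label else 0.0)
            for k in range(len(row)):
                row[k] -= lr * grad * x[k]
        return -math.log(probs[label])


# Toy usage: two old relations, two new relations, 3-dimensional features.
old_head = [[0.5, -0.2, 0.1], [0.0, 0.3, -0.4]]
clf = DecomposedClassifier(old_head, n_new=2, dim=3)
x = [1.0, 2.0, -1.0]
loss_before = clf.sgd_step(x, label=2)  # label 2 is the first new relation
loss_after = clf.sgd_step(x, label=2)
```

In this sketch the rows for old relations are left untouched while the new rows are fit, which is one simple way to keep the previous decision boundaries intact during the first training stage on a new task.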