Dealing with an unbounded data stream requires overcoming the assumption that data are independent and identically distributed (i.i.d.). A data stream can, in fact, exhibit temporal dependencies (i.e., be a time series), and its distribution can change over time (concept drift). Both problems have been studied in depth, but existing solutions address them separately: a joint solution is missing. In addition, learning multiple concepts implies remembering the past (a.k.a. avoiding catastrophic forgetting, in Neural Networks' terminology). This work proposes Continuous Progressive Neural Networks (cPNN), a solution that tames concept drift, handles temporal dependencies, and avoids catastrophic forgetting. cPNN is a continuous version of Progressive Neural Networks, a methodology for remembering old concepts and transferring past knowledge to quickly fit new ones. We base our method on Recurrent Neural Networks and exploit Stochastic Gradient Descent applied to data streams with temporal dependencies. The results of an ablation study show that cPNN adapts quickly to new concepts and is robust to drift.
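The core idea inherited from Progressive Neural Networks can be illustrated with a minimal sketch: one column per concept, where previously trained columns are frozen (so past concepts are never forgotten) and feed lateral connections into the newest column (so past knowledge transfers to the new concept). The sketch below is a simplified, hypothetical illustration only, using small feedforward columns in NumPy rather than the recurrent architecture the paper describes; class and method names are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

class ProgressiveColumns:
    """Simplified sketch of the Progressive Neural Networks idea behind cPNN:
    one column per concept; old columns are frozen (no catastrophic forgetting)
    and feed lateral features into the newest column (knowledge transfer)."""

    def __init__(self, n_features, hidden=8, lr=0.05):
        self.n_features, self.hidden, self.lr = n_features, hidden, lr
        self.columns = []   # one dict of weights per concept
        self.laterals = []  # per column: lateral weights from older columns

    def add_column(self):
        """Start a new trainable column on concept drift; old ones stay frozen."""
        col = {
            "W": rng.normal(0, 0.1, (self.hidden, self.n_features)),
            "v": rng.normal(0, 0.1, self.hidden),
        }
        # One lateral matrix per frozen column's hidden activations.
        lat = [rng.normal(0, 0.1, (self.hidden, self.hidden))
               for _ in self.columns]
        self.columns.append(col)
        self.laterals.append(lat)

    def _hidden(self, x):
        """Hidden activations of every column; older columns feed laterals."""
        hs = []
        for col, lat in zip(self.columns, self.laterals):
            pre = col["W"] @ x
            for U, h_old in zip(lat, hs):
                pre = pre + U @ h_old
            hs.append(np.tanh(pre))
        return hs

    def predict_proba(self, x):
        """Binary-class probability from the newest column's head."""
        hs = self._hidden(x)
        return 1.0 / (1.0 + np.exp(-self.columns[-1]["v"] @ hs[-1]))

    def partial_fit(self, x, y):
        """One SGD step updating the newest column only (frozen past)."""
        hs = self._hidden(x)
        h = hs[-1]
        p = 1.0 / (1.0 + np.exp(-self.columns[-1]["v"] @ h))
        err = p - y                               # d(log-loss)/d(logit)
        col = self.columns[-1]
        grad_h = err * col["v"] * (1 - h ** 2)    # backprop through tanh
        col["v"] -= self.lr * err * h
        col["W"] -= self.lr * np.outer(grad_h, x)
        for U, h_old in zip(self.laterals[-1], hs[:-1]):
            U -= self.lr * np.outer(grad_h, h_old)
```

After a drift, calling `add_column()` and continuing with `partial_fit` trains only the new column, so the weights of earlier columns remain bit-for-bit unchanged: old concepts are preserved while their frozen activations still inform the new one.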