We consider estimation in scenarios where the characteristics of the signals of interest change over time. In particular, we consider the continual learning problem, in which different tasks, e.g., data with different distributions, arrive sequentially and the aim is to perform well on the newly arrived task without degrading performance on previously seen tasks. In contrast to the continual learning literature, which focuses on the centralized setting, we investigate the problem from a distributed estimation perspective. We consider the well-established distributed learning algorithm COCOA, which distributes the model parameters and the corresponding features over the network. We provide an exact analytical characterization of the generalization error of COCOA under continual learning for linear regression in a range of scenarios, with the overparameterized case being of particular interest. These analytical results characterize how the generalization error depends on the network structure, the task similarity, and the number of tasks, and show how these dependencies are intertwined. In particular, our results show that the generalization error can be significantly reduced by adjusting the network size, where the most favorable network size depends on the task similarity and the number of tasks. We present numerical results that verify the theoretical analysis and illustrate the continual learning performance of COCOA on a digit classification task.