Online multi-task learning (OMTL) enhances streaming data processing by leveraging the inherent relations among multiple tasks. It can be described as an optimization problem in which a single loss function is defined over multiple tasks. Existing gradient-descent-based methods for this problem may suffer from vanishing gradients and poor conditioning. Furthermore, the centralized setting hinders their application to online parallel optimization, which is vital to big data analytics. Therefore, this study proposes a novel OMTL framework based on the alternating direction method of multipliers (ADMM), a recent breakthrough in optimization that is well suited to distributed computing environments because of its decomposable and easy-to-implement nature. The relations among multiple tasks are modeled dynamically to fit the constant changes of an online scenario. In a classical distributed computing architecture with a central server, the proposed OMTL algorithm with the ADMM optimizer outperforms SGD-based approaches in terms of accuracy and efficiency. Because the central server may become a bottleneck as the data scale grows, we further tailor the algorithm to a decentralized setting, so that each node can work by exchanging information only with its local neighbors. Experimental results on a synthetic dataset and several real-world datasets demonstrate the efficiency of our methods.
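To make the decomposable nature of ADMM concrete, the following is a minimal sketch of *consensus ADMM* on a centralized (server-plus-nodes) architecture, using a distributed least-squares problem as a stand-in for the per-task losses. All names, the choice of loss, and the penalty parameter `rho` are illustrative assumptions, not the paper's actual algorithm: each node updates its local variable from its own data shard, the server averages to form the consensus variable, and the dual variables push the local solutions toward agreement.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 3 nodes (tasks), each holding a local data shard.
n_nodes, d, m = 3, 4, 30
x_true = rng.normal(size=d)
A = [rng.normal(size=(m, d)) for _ in range(n_nodes)]
b = [Ai @ x_true + 0.01 * rng.normal(size=m) for Ai in A]

rho = 1.0                                    # ADMM penalty parameter (assumed)
x = [np.zeros(d) for _ in range(n_nodes)]    # local primal variables
u = [np.zeros(d) for _ in range(n_nodes)]    # scaled dual variables
z = np.zeros(d)                              # global consensus variable

for _ in range(100):
    # x-update: each node solves its own regularized least-squares
    # subproblem in parallel, using only local data.
    for i in range(n_nodes):
        x[i] = np.linalg.solve(A[i].T @ A[i] + rho * np.eye(d),
                               A[i].T @ b[i] + rho * (z - u[i]))
    # z-update: the central server averages the nodes' estimates.
    z = np.mean([x[i] + u[i] for i in range(n_nodes)], axis=0)
    # u-update: each node adjusts its dual variable toward consensus.
    for i in range(n_nodes):
        u[i] = u[i] + x[i] - z

# The consensus variable should match the centralized solution.
A_all, b_all = np.vstack(A), np.concatenate(b)
x_star, *_ = np.linalg.lstsq(A_all, b_all, rcond=None)
```

In the decentralized variant the abstract mentions, the averaging step would be replaced by each node averaging only with its graph neighbors (gossip-style), removing the server bottleneck while keeping the same local x- and u-updates.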