In the era of rapid development of artificial intelligence, applications across diverse fields rely heavily on effective data processing and model optimization. Combined regularized support vector machines (CR-SVMs) can effectively exploit the structural information among data features, but efficient algorithms are lacking when big data is stored in a distributed manner. To address this issue, we propose a unified optimization framework based on a consensus structure. The framework is not only applicable to various loss functions and combined regularization terms but also extends naturally to non-convex regularization terms, demonstrating strong scalability. Building on this framework, we develop a distributed parallel alternating direction method of multipliers (ADMM) algorithm to efficiently compute CR-SVMs over distributed data. To guarantee convergence of the algorithm, we also incorporate the Gaussian back-substitution method. Meanwhile, for completeness, we introduce a new model, the sparse group lasso support vector machine (SGL-SVM), and apply it to music information retrieval. Theoretical analysis confirms that the computational complexity of the proposed algorithm is unaffected by the choice of regularization term or loss function, highlighting the universality of the parallel algorithm. Experiments on synthetic datasets and the Free Music Archive dataset demonstrate the reliability, stability, and efficiency of the algorithm.
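To illustrate the consensus structure underlying the framework, the sketch below shows a deliberately simplified consensus ADMM iteration in pure Python. It is not the paper's algorithm (which handles SVM losses and includes Gaussian back-substitution): each node here holds a toy quadratic loss ||x_i - a_i||^2 rather than a hinge loss, and the shared variable z carries a sparse-group-lasso penalty via its closed-form proximal operator (elementwise soft-thresholding followed by group-wise shrinkage). All names, step sizes, and the test data are illustrative assumptions.

```python
import math

def soft(v, t):
    """Elementwise soft-thresholding: prox of t * ||.||_1."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def sgl_prox(v, t1, t2, groups):
    """Prox of the sparse group lasso penalty t1*||.||_1 + t2*sum_g ||.||_2:
    soft-threshold elementwise, then shrink each group toward zero."""
    s = soft(v, t1)
    z = list(s)
    for g in groups:
        norm = math.sqrt(sum(s[j] ** 2 for j in g))
        scale = max(0.0, 1.0 - t2 / norm) if norm > 0 else 0.0
        for j in g:
            z[j] = scale * s[j]
    return z

def consensus_admm(A, lam1, lam2, groups, rho=1.0, iters=500):
    """Toy consensus ADMM (assumed simplification, not the paper's method):
        min  sum_i ||x_i - a_i||^2 + lam1*||z||_1 + lam2*sum_g ||z_g||_2
        s.t. x_i = z for every node i.
    Local x-updates are independent (parallelizable across nodes);
    the z-update gathers the averaged consensus point."""
    N, d = len(A), len(A[0])
    z = [0.0] * d
    U = [[0.0] * d for _ in range(N)]   # scaled dual variables
    X = [[0.0] * d for _ in range(N)]   # local primal variables
    for _ in range(iters):
        # local x-updates: closed form for the quadratic loss
        for i in range(N):
            for j in range(d):
                X[i][j] = (2 * A[i][j] + rho * (z[j] - U[i][j])) / (2 + rho)
        # global z-update: prox of the combined penalty at the averaged point
        v = [sum(X[i][j] + U[i][j] for i in range(N)) / N for j in range(d)]
        z = sgl_prox(v, lam1 / (N * rho), lam2 / (N * rho), groups)
        # dual updates: accumulate consensus violations
        for i in range(N):
            for j in range(d):
                U[i][j] += X[i][j] - z[j]
    return z
```

Because the toy loss is quadratic, the consensus solution has a closed form, z* = prox of the (rescaled) penalty at the mean of the a_i, which makes the sketch easy to check; the group-wise shrinkage also shows how whole feature groups can be zeroed out, the behavior the SGL-SVM exploits.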