The accuracy and complexity of machine learning algorithms based on kernel optimization are determined by the set of kernels over which they are able to optimize. An ideal set of kernels should admit a linear parameterization (for tractability), be dense in the set of all kernels (for robustness), and be universal (for accuracy). Recently, a framework was proposed for using positive matrices to parameterize a class of positive semiseparable kernels. Although this class can be shown to meet all three criteria, previous algorithms for optimizing such kernels were limited to classification and relied on computationally complex Semidefinite Programming (SDP) algorithms. In this paper, we pose the problem of learning semiseparable kernels as a minimax optimization problem and propose an SVD-QCQP primal-dual algorithm that dramatically reduces the computational complexity compared with previous SDP-based approaches. Furthermore, we provide an efficient implementation of this algorithm for both classification and regression, which enables us to solve problems with 100 features and up to 30,000 data points. Finally, when applied to benchmark data, the algorithm demonstrates the potential for significant improvement in accuracy over typical (but non-convex) approaches such as neural networks and random forests, with similar or better computation time.
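To make the idea of a kernel that is linear in a positive matrix concrete, the following minimal sketch builds a kernel of the form k_P(x, y) = Z(x)^T P Z(y) with P positive semidefinite. The monomial feature map `monomial_features`, the helper `kernel_matrix`, and the dimensions are hypothetical choices made for illustration only; they are not the kernel class or the algorithm described in this paper. The sketch only shows why such a parameterization is linear in P and why P ⪰ 0 yields a positive semidefinite Gram matrix.

```python
import numpy as np


def monomial_features(X, degree=2):
    """Hypothetical feature map Z(x): stacks 1, x, x**2, ..., x**degree per sample."""
    X = np.atleast_2d(np.asarray(X, dtype=float))
    cols = [np.ones((X.shape[0], 1))]
    for d in range(1, degree + 1):
        cols.append(X ** d)
    return np.hstack(cols)


def kernel_matrix(X, Y, P, degree=2):
    """Gram matrix with entries k_P(x_i, y_j) = Z(x_i)^T P Z(y_j); linear in P."""
    ZX = monomial_features(X, degree)
    ZY = monomial_features(Y, degree)
    return ZX @ P @ ZY.T


rng = np.random.default_rng(0)
n_samples, n_features, degree = 5, 3, 2
q = 1 + degree * n_features  # length of Z(x) for this illustrative feature map

X = rng.normal(size=(n_samples, n_features))

# Any matrix of the form A A^T is positive semidefinite.
A = rng.normal(size=(q, q))
P = A @ A.T

K = kernel_matrix(X, X, P, degree)

# Eigenvalues of the symmetrized Gram matrix; nonnegative up to round-off.
eigenvalues = np.linalg.eigvalsh((K + K.T) / 2)
print("smallest eigenvalue of K:", eigenvalues.min())
```

Because the Gram matrix depends linearly on P, optimizing over kernels of this form reduces to optimizing over the matrix P subject to a positive semidefiniteness constraint, which is what makes a convex formulation possible in principle.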