The accuracy and complexity of machine learning algorithms based on kernel optimization are determined by the set of kernels over which they are able to optimize. An ideal set of kernels should: admit a linear parameterization (for tractability); be dense in the set of all kernels (for robustness); and be universal (for accuracy). Recently, a framework was proposed for using positive matrices to parameterize a class of positive semiseparable kernels. Although this class can be shown to meet all three criteria, previous algorithms for optimizing such kernels were limited to classification and relied on computationally expensive semidefinite programming (SDP) algorithms. In this paper, we pose the problem of learning semiseparable kernels as a minimax optimization problem and propose an SVD-QCQP primal-dual algorithm that dramatically reduces computational complexity compared with previous SDP-based approaches. Furthermore, we provide an efficient implementation of this algorithm for both classification and regression, which enables us to solve problems with 100 features and up to 30,000 data points. Finally, when applied to benchmark data, the algorithm demonstrates the potential for significant improvement in accuracy over typical (but non-convex) approaches such as neural networks and random forests, with similar or better computation time.
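To make the "linear parameterization by positive matrices" criterion concrete, the following is a minimal sketch under simplifying assumptions. It does not implement the semiseparable kernel class or the SVD-QCQP algorithm from the abstract; instead it uses a finite-dimensional stand-in, k(x, y) = Z(x)^T P Z(y), where the feature map `monomial_features` and the helper `gram` are hypothetical names introduced only for illustration. The sketch checks two properties: the Gram matrix depends linearly on P, and a positive semidefinite P yields a positive semidefinite Gram matrix.

```python
import numpy as np


def monomial_features(X, degree=2):
    """Hypothetical feature map Z(x): a constant term plus per-coordinate
    powers x_i^d for d = 1..degree (an assumption, not the paper's basis)."""
    X = np.atleast_2d(X)
    cols = [np.ones((X.shape[0], 1))]
    for d in range(1, degree + 1):
        cols.append(X ** d)
    return np.hstack(cols)


def gram(P, X, Y=None, degree=2):
    """Gram matrix of k(x, y) = Z(x)^T P Z(y); linear in the matrix P."""
    ZX = monomial_features(X, degree)
    ZY = ZX if Y is None else monomial_features(Y, degree)
    return ZX @ P @ ZY.T


rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # 20 samples, 3 features
q = monomial_features(X).shape[1]     # feature dimension: 1 + 2 * 3 = 7

# Two positive semidefinite parameter matrices, built as A A^T.
A = rng.normal(size=(q, q))
B = rng.normal(size=(q, q))
P1 = A @ A.T
P2 = B @ B.T

K1 = gram(P1, X)
K2 = gram(P2, X)
K_mix = gram(0.3 * P1 + 0.7 * P2, X)

# Linearity in P: a convex combination of parameters gives the same
# convex combination of Gram matrices.
linear_in_P = np.allclose(K_mix, 0.3 * K1 + 0.7 * K2)

# Positivity: the smallest eigenvalue of the symmetrized Gram matrix
# should be nonnegative up to floating-point round-off.
min_eig = np.linalg.eigvalsh((K1 + K1.T) / 2).min()

print("linear in P:", linear_in_P)
print("min eigenvalue is numerically nonnegative:",
      min_eig > -1e-8 * np.abs(K1).max())
```

Because the kernel is linear in P and the constraint that P be positive semidefinite is convex, searching over kernels of this form becomes a search over a convex set of matrices, which is the tractability property the abstract refers to.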