Sparsity of a learning solution is a desirable feature in machine learning. Certain reproducing kernel Banach spaces (RKBSs) are appropriate hypothesis spaces for sparse learning methods. The goal of this paper is to understand what kinds of RKBSs can promote sparsity of learning solutions. We consider two typical learning models in an RKBS: the minimum norm interpolation (MNI) problem and the regularization problem. We first establish an explicit representer theorem for solutions of these problems, which represents the extreme points of the solution set as a linear combination of the extreme points of the subdifferential set of the norm function, which is data-dependent. We then propose sufficient conditions on the RKBS under which this explicit representation becomes a sparse kernel representation with fewer terms than the number of observed data points. Under these conditions, we investigate how the regularization parameter affects the sparsity of the regularized solutions. We further show that two specific RKBSs, the sequence space $\ell_1(\mathbb{N})$ and the measure space, admit sparse representer theorems for both the MNI and regularization models.
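As a rough illustration of the sparsity phenomenon described in the abstract, the sketch below solves a small MNI problem in a finite truncation of $\ell_1(\mathbb{N})$: minimize $\|x\|_1$ subject to $Ax = y$ with two observations. This is not the paper's method. It assumes the sequence space is truncated to $\mathbb{R}^m$ and that $A$ has rank 2, and it relies on the standard linear-programming fact that such a problem attains its minimum at a point with at most as many nonzero entries as there are constraints. The function name `l1_min_norm_interpolant` and the example data are hypothetical.

```python
from itertools import combinations


def l1_min_norm_interpolant(A, y, tol=1e-12):
    """Minimise ||x||_1 subject to A x = y, for a 2 x m matrix A of rank 2.

    Assumption (not from the paper): l1(N) is truncated to R^m, and the
    minimum is found by enumerating candidate supports of size 2, which is
    valid because an l1 minimiser exists with at most 2 nonzero entries.
    """
    m = len(A[0])
    best_x, best_norm = None, float("inf")
    for i, j in combinations(range(m), 2):
        # Solve the 2 x 2 system restricted to columns i and j (Cramer's rule).
        det = A[0][i] * A[1][j] - A[0][j] * A[1][i]
        if abs(det) < tol:
            continue  # columns are linearly dependent; skip this support
        xi = (y[0] * A[1][j] - A[0][j] * y[1]) / det
        xj = (A[0][i] * y[1] - y[0] * A[1][i]) / det
        norm = abs(xi) + abs(xj)
        if norm < best_norm - tol:
            x = [0.0] * m
            x[i], x[j] = xi, xj
            best_x, best_norm = x, norm
    return best_x, best_norm


# Hypothetical data: two observations, three coordinates.
A = [[1.0, 0.0, 2.0],
     [0.0, 1.0, 0.0]]
y = [2.0, 1.0]
x, norm = l1_min_norm_interpolant(A, y)
print(x, norm)
```

With two observations, the returned interpolant has at most two nonzero coordinates regardless of the truncation length `m`, which is the finite-dimensional analogue of a sparse representation whose number of terms is controlled by the number of data points.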