Seeking sparsity is now crucial for speeding up the training of large-scale neural networks. Projections onto the $\ell_{1,2}$ and $\ell_{1,\infty}$ balls are among the most efficient techniques to sparsify neural networks and reduce their overall cost. In this paper, we introduce a new projection algorithm for the $\ell_{1,\infty}$ norm ball. The worst-case time complexity of this algorithm is $\mathcal{O}\big(nm+J\log(nm)\big)$ for a matrix in $\mathbb{R}^{n\times m}$, where $J$ is a term that tends to 0 when the sparsity is high and to $nm$ when the sparsity is low. The algorithm is easy to implement and is guaranteed to converge to the exact solution in finite time. Moreover, we propose incorporating the $\ell_{1,\infty}$ ball projection into the training of an autoencoder to enforce feature selection and sparsity of the weights. We apply sparsification in the encoder, primarily for feature selection, motivated by our application in biology, where only a very small part ($<2\%$) of the data is relevant. We show that, both in the biological case and in the general sparse case, our method is the fastest.
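To make the projection concrete, the following is a minimal sketch of a simple bisection baseline, not the algorithm introduced in the paper. It assumes the convention $\|Y\|_{1,\infty}=\sum_{j}\max_{i}|Y_{ij}|$ (sum over columns of the column-wise maximum); the abstract does not state which axis carries the maximum, and the function names `l1inf_norm` and `project_l1inf_bisection` are hypothetical. Under that convention, the projection caps each column at a level $\mu_j$, where every active column loses the same $\ell_1$ mass $\theta$ and $\sum_j \mu_j = C$; the sketch finds $\theta$ by bisection, so it is approximate rather than exact in finite time.

```python
import numpy as np

def l1inf_norm(Y):
    """Assumed convention: sum over columns of the largest |entry| per column."""
    return np.abs(Y).max(axis=0).sum()

def project_l1inf_bisection(Y, C, iters=100):
    """Approximate Euclidean projection of Y onto {X : l1inf_norm(X) <= C}.

    Bisection baseline for illustration only (not the paper's method).
    """
    Y = np.asarray(Y, dtype=float)
    if C < 0:
        raise ValueError("radius C must be non-negative")
    if l1inf_norm(Y) <= C:
        return Y.copy()                      # already inside the ball

    A = np.abs(Y)
    n, m = A.shape
    S = -np.sort(-A, axis=0)                 # each column in decreasing order
    cs = np.cumsum(S, axis=0)                # column-wise prefix sums
    k = np.arange(1, n + 1)[:, None]

    def caps(theta):
        # Per-column cap mu_j solving sum_i max(|y_ij| - mu_j, 0) = theta,
        # clipped to 0 when the column's l1 norm is at most theta.
        t = (cs - theta) / k
        rho = (S > t).sum(axis=0)            # >= 1 because theta > 0
        mu = t[rho - 1, np.arange(m)]
        return np.maximum(mu, 0.0)

    # sum(caps(theta)) decreases from l1inf_norm(Y) to 0 as theta grows.
    lo, hi = 0.0, A.sum(axis=0).max()
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        if caps(theta).sum() > C:
            lo = theta                       # caps still too large
        else:
            hi = theta
    mu = caps(hi)                            # hi side keeps sum(mu) <= C
    return np.sign(Y) * np.minimum(A, mu)    # clip magnitudes, keep signs
```

Calling `project_l1inf_bisection(W, C)` on a weight matrix `W` returns a matrix of the same shape whose columns with small $\ell_1$ norm are zeroed entirely, which is the structured sparsity that makes this ball useful for feature selection. Each bisection step costs $\mathcal{O}(nm)$ after an initial column-wise sort, so this baseline does not achieve the complexity stated in the abstract.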