Penalizing the nuclear norm of a function's Jacobian encourages the function to behave locally like a low-rank linear map. Such functions vary locally along only a handful of directions, making the Jacobian nuclear norm a natural regularizer for machine learning problems. However, this regularizer is intractable for high-dimensional problems, as it requires computing a large Jacobian matrix and taking its singular value decomposition. We show how to efficiently penalize the Jacobian nuclear norm using techniques tailor-made for deep learning. We prove that for functions parametrized as compositions $f = g \circ h$, one may equivalently penalize the average of the squared Frobenius norms of $Jg$ and $Jh$. We then propose a denoising-style approximation that avoids Jacobian computations altogether. Our method is simple, efficient, and accurate, enabling Jacobian nuclear norm regularization to scale to high-dimensional deep learning problems. We complement our theory with an empirical study of our regularizer's performance and investigate applications to denoising and representation learning.
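For context, the compositional penalty rests on the standard variational characterization of the nuclear norm, a known linear-algebra identity stated here for reference rather than as the paper's proof:
$$
\|A\|_* \;=\; \min_{A = BC} \tfrac{1}{2}\bigl(\|B\|_F^2 + \|C\|_F^2\bigr),
$$
so that, by the chain rule $Jf(x) = Jg(h(x))\,Jh(x)$,
$$
\|Jf(x)\|_* \;\le\; \tfrac{1}{2}\bigl(\|Jg(h(x))\|_F^2 + \|Jh(x)\|_F^2\bigr).
$$
Minimizing the right-hand side over reparametrizations of the factorization attains the nuclear norm, which is why penalizing the averaged squared Frobenius norms is equivalent to penalizing $\|Jf\|_*$ at the optimum.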
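As a concrete illustration (not the paper's implementation), the following minimal JAX sketch computes the compositional penalty, estimating each squared Frobenius norm with a standard Hutchinson-style Jacobian-vector-product estimator; a Jacobian-free noise-based variant is included in the spirit of the denoising-style approximation mentioned above. All names (`sq_frobenius_jvp`, `sq_frobenius_noise`, `composition_penalty`, `sigma`) are hypothetical, and the exact estimators are assumptions, not the paper's method.

```python
import jax
import jax.numpy as jnp

def sq_frobenius_jvp(fn, x, key, n_samples=4):
    """Hutchinson estimator of ||J fn(x)||_F^2 via E_v ||J fn(x) v||^2, v ~ N(0, I)."""
    def single(k):
        v = jax.random.normal(k, x.shape)
        _, jv = jax.jvp(fn, (x,), (v,))  # forward-mode Jacobian-vector product
        return jnp.sum(jv ** 2)
    return jnp.mean(jax.vmap(single)(jax.random.split(key, n_samples)))

def sq_frobenius_noise(fn, x, key, sigma=1e-3, n_samples=4):
    """Hypothetical Jacobian-free variant using only forward evaluations:
    ||fn(x + sigma*v) - fn(x)||^2 / sigma^2 -> ||J fn(x) v||^2 as sigma -> 0."""
    fx = fn(x)
    def single(k):
        v = jax.random.normal(k, x.shape)
        return jnp.sum((fn(x + sigma * v) - fx) ** 2) / sigma ** 2
    return jnp.mean(jax.vmap(single)(jax.random.split(key, n_samples)))

def composition_penalty(g, h, x, key):
    """(1/2)(||Jg(h(x))||_F^2 + ||Jh(x)||_F^2), an upper bound on ||J(g o h)(x)||_*."""
    k1, k2 = jax.random.split(key)
    z = h(x)
    return 0.5 * (sq_frobenius_jvp(g, z, k1) + sq_frobenius_jvp(h, x, k2))
```

In a training loop one would add a term such as `loss + lam * composition_penalty(g, h, x, key)`. The JVP estimator costs a few forward-mode passes per sample, while the noise-based variant requires only plain forward evaluations.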