Historically, the machine learning community has derived spectral decompositions from graph-based approaches. We break with this tradition and prove the statistical and computational superiority of the Galerkin method, which restricts the study to a small set of test functions. In particular, we introduce implementation tricks for handling differential operators in high dimensions with structured kernels. Finally, we extend the core principles beyond our approach and apply them to non-linear spaces of functions, such as those parameterized by deep neural networks, through loss-based optimization procedures.
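As a rough illustration of what restricting the study to a small set of test functions can look like, here is a minimal sketch on a toy problem. It is not the paper's implementation. Everything specific in it is an assumption made for the example: the operator $Lf = -f'' + x f'$ (whose exact eigenvalues under the standard Gaussian measure are $0, 1, 2, \dots$), the monomial test functions, and the hypothetical helper name `galerkin_spectrum`. The operator is only accessed through sample averages of products of test functions and of their derivatives, and the spectral decomposition reduces to a small generalized eigenproblem.

```python
import numpy as np


def galerkin_spectrum(x, max_degree=3):
    """Galerkin estimate of the spectrum of L f = -f'' + x f' from samples x ~ N(0, 1).

    Toy assumption: the test functions are the monomials 1, x, ..., x^max_degree.
    Returns the estimated eigenvalues (ascending) and the coefficients of the
    estimated eigenfunctions in the monomial basis (one eigenfunction per column).
    """
    degrees = np.arange(max_degree + 1)
    # Test functions and their derivatives evaluated at the samples, shape (n, p).
    phi = x[:, None] ** degrees
    dphi = degrees * x[:, None] ** np.maximum(degrees - 1, 0)

    n = x.shape[0]
    A = dphi.T @ dphi / n  # empirical Dirichlet form  E[phi_i' phi_j']
    B = phi.T @ phi / n    # empirical Gram matrix    E[phi_i phi_j]

    # Solve A c = lambda B c by whitening with the Cholesky factor of B.
    chol = np.linalg.cholesky(B)
    M = np.linalg.solve(chol, np.linalg.solve(chol, A).T).T
    M = (M + M.T) / 2  # remove round-off asymmetry before calling eigh
    eigenvalues, V = np.linalg.eigh(M)
    coefficients = np.linalg.solve(chol.T, V)
    return eigenvalues, coefficients


rng = np.random.default_rng(0)
samples = rng.standard_normal(200_000)
eigenvalues, coefficients = galerkin_spectrum(samples, max_degree=3)
print(eigenvalues)
```

With four test functions the eigenproblem is only 4 x 4 regardless of the number of samples, in contrast to a graph-based estimate, whose matrix grows with the sample size. The printed values are sample-based approximations of the exact eigenvalues 0, 1, 2, 3.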