The Kolmogorov-Arnold Network (KAN) is a new network architecture known for its high accuracy on tasks such as function fitting and PDE solving. KAN's strong expressive capability arises from the Kolmogorov-Arnold representation theorem and learnable spline functions. However, computing spline functions requires multiple iterations, which makes KAN significantly slower than an MLP and thereby increases the cost of model training and deployment. The authors of KAN have also noted that ``the biggest bottleneck of KANs lies in its slow training. KANs are usually 10x slower than MLPs, given the same number of parameters.'' To address this issue, we propose PowerMLP, a novel MLP-style network that employs a simpler, non-iterative spline function representation, offering approximately the same training time as an MLP while theoretically possessing expressive power at least as strong as KAN's. Furthermore, we compare the FLOPs of KAN and PowerMLP, quantifying PowerMLP's faster computation. Comprehensive experiments demonstrate that PowerMLP generally achieves higher accuracy and trains about 40 times faster than KAN across various tasks.
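To illustrate the contrast the abstract draws, the sketch below compares an iterative B-spline basis evaluation (the Cox-de Boor recursion, which must iterate over the spline order) with a non-iterative truncated-power basis built from ReLU powers, which evaluates each basis function in closed form. This is a minimal, generic illustration of the two representations; the exact PowerMLP formulation and the knot/basis conventions (`knots`, `k`) here are assumptions for demonstration, not the paper's implementation.

```python
import numpy as np

def bspline_basis(x, knots, k):
    """Order-k B-spline basis values at scalar x via the Cox-de Boor
    recursion. The loop runs k times over all basis functions -- the
    iterative cost that slows down spline evaluation."""
    # Order 0: indicator functions on the knot intervals.
    B = np.array([float(knots[i] <= x < knots[i + 1])
                  for i in range(len(knots) - 1)])
    for d in range(1, k + 1):          # k recursion levels
        nb = len(knots) - 1 - d
        newB = np.zeros(nb)
        for i in range(nb):
            left = (x - knots[i]) / (knots[i + d] - knots[i]) * B[i]
            right = (knots[i + d + 1] - x) / (knots[i + d + 1] - knots[i + 1]) * B[i + 1]
            newB[i] = left + right
        B = newB
    return B

def relu_power_basis(x, knots, k):
    """Non-iterative alternative: the truncated power basis
    max(x - t, 0)^k, one closed-form ReLU-power evaluation per knot."""
    return np.array([max(x - t, 0.0) ** k for t in knots])
```

Both span the same spline space of degree k on the given knots, but the second needs no recursion, which is the kind of simplification that lets an MLP-style network match spline expressiveness at MLP-like speed.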