Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all -- every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in terms of accuracy and interpretability. For accuracy, much smaller KANs can match or exceed the accuracy of much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. For interpretability, KANs can be intuitively visualized and can easily interact with human users. Through two examples in mathematics and physics, KANs are shown to be useful collaborators, helping scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives to MLPs, opening opportunities for further improving today's deep learning models, which rely heavily on MLPs.
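To make the architectural idea concrete, below is a minimal sketch of a single KAN layer in PyTorch: every edge (i, j) carries its own learnable univariate function phi_ij, and each output is a sum of these edge functions over the inputs, so there are no linear weight matrices. This is not the paper's reference implementation; as a simplification, each edge function is parametrized with a Gaussian radial-basis expansion rather than the spline parametrization described above, and the `KANLayer` class name, grid range, and basis count are illustrative choices.

```python
# A minimal KAN-layer sketch (assumption: RBF basis stands in for the
# spline parametrization described in the abstract).
import torch
import torch.nn as nn


class KANLayer(nn.Module):
    """Maps in_dim inputs to out_dim outputs. Every edge (i, j) has its
    own learnable 1-D function phi_ij, and outputs sum over input edges:
        y_j = sum_i phi_ij(x_i)
    """

    def __init__(self, in_dim: int, out_dim: int, num_basis: int = 8,
                 x_min: float = -2.0, x_max: float = 2.0):
        super().__init__()
        # Fixed RBF centers on a uniform grid (stand-in for a spline grid).
        self.register_buffer("centers", torch.linspace(x_min, x_max, num_basis))
        self.gamma = (num_basis / (x_max - x_min)) ** 2
        # Learnable coefficients: one set of basis weights per edge (i, j).
        self.coef = nn.Parameter(0.1 * torch.randn(in_dim, out_dim, num_basis))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, in_dim) -> basis activations: (batch, in_dim, num_basis)
        basis = torch.exp(-self.gamma * (x.unsqueeze(-1) - self.centers) ** 2)
        # Evaluate phi_ij(x_i) on every edge, then sum over input index i.
        # einsum indices: b = batch, i = input, j = output, k = basis.
        return torch.einsum("bik,ijk->bj", basis, self.coef)


if __name__ == "__main__":
    # Usage: stack layers to form a small KAN for a 2-D toy regression.
    model = nn.Sequential(KANLayer(2, 5), KANLayer(5, 1))
    x = torch.randn(32, 2)
    print(model(x).shape)  # torch.Size([32, 1])
```

Because the learned coefficients directly describe each edge's 1-D function, such a layer can be inspected by plotting phi_ij over the grid, which is what makes the visualization and interpretability claims above possible.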