Inspired by the Kolmogorov-Arnold representation theorem, we propose Kolmogorov-Arnold Networks (KANs) as promising alternatives to Multi-Layer Perceptrons (MLPs). While MLPs have fixed activation functions on nodes ("neurons"), KANs have learnable activation functions on edges ("weights"). KANs have no linear weights at all: every weight parameter is replaced by a univariate function parametrized as a spline. We show that this seemingly simple change makes KANs outperform MLPs in both accuracy and interpretability. On accuracy, much smaller KANs can match or exceed much larger MLPs in data fitting and PDE solving. Theoretically and empirically, KANs possess faster neural scaling laws than MLPs. On interpretability, KANs can be visualized intuitively and interact easily with human users. Through two examples in mathematics and physics, KANs are shown to be useful collaborators that help scientists (re)discover mathematical and physical laws. In summary, KANs are promising alternatives to MLPs, opening opportunities for further improving today's deep learning models, which rely heavily on MLPs.
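For context, the Kolmogorov-Arnold representation theorem (an established result, quoted here as added background) states that any continuous function $f: [0,1]^n \to \mathbb{R}$ can be written as a finite composition of continuous univariate functions and addition:

$$ f(x_1, \ldots, x_n) = \sum_{q=1}^{2n+1} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right), $$

where the $\phi_{q,p}: [0,1] \to \mathbb{R}$ and $\Phi_q: \mathbb{R} \to \mathbb{R}$ are continuous univariate functions. KANs generalize this fixed two-layer structure to arbitrary widths and depths.

Below is a minimal, hypothetical sketch (not the authors' implementation) of the core idea behind a single KAN "edge": the scalar weight of an MLP is replaced by a learnable univariate spline $\phi(x) = \sum_i c_i B_i(x)$, with fixed B-spline basis functions $B_i$ and trainable coefficients $c_i$. The grid size, spline degree, and input range used here are illustrative assumptions.

```python
import numpy as np
from scipy.interpolate import BSpline

# One KAN "edge": a learnable univariate function phi(x) = sum_i c_i * B_i(x).
# The B_i are fixed cubic B-spline basis functions on a grid; only the
# coefficients c_i would be trained. All sizes below are illustrative.
degree = 3                                         # cubic splines
grid = np.linspace(-1.0, 1.0, 8)                   # spline grid over the input range
knots = np.concatenate([np.full(degree, grid[0]),  # clamped knot vector
                        grid,
                        np.full(degree, grid[-1])])
coeffs = np.random.randn(len(knots) - degree - 1)  # the learnable parameters

phi = BSpline(knots, coeffs, degree)               # phi: R -> R, one "weight" of a KAN

x = np.linspace(-1.0, 1.0, 5)
print(phi(x))                                      # the edge activation, applied pointwise
```

A full KAN layer applies one such function per (input, output) pair and sums over inputs, so a layer mapping $n$ inputs to $m$ outputs carries $n \times m$ learnable splines in place of an $n \times m$ matrix of scalar weights.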