Recent work has established Kolmogorov-Arnold Networks (KANs) as an alternative to traditional multi-layer perceptron neural networks. The general KAN framework applies learnable activation functions on the edges of the computational graph and sums their outputs at the nodes. In the original implementation, the learnable edge activation functions are basis splines (B-splines). Here, we present a model in which the learnable grids of B-spline activation functions are replaced by grids of re-weighted sine functions. We show that this yields numerical performance better than or comparable to that of B-spline KAN models on the MNIST benchmark, while also providing a substantial speed increase on the order of 4-8 times.
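To make the edge-activation idea concrete, the following is a minimal NumPy sketch of a single layer in which each edge activation is a weighted sum of sine functions, followed by summation at the output nodes. It is an illustration only, not the implementation described in this work: the class name `SineKANLayer`, the integer frequency grid, the evenly spaced phases, the weight initialization, and the bias term are all assumptions made for this example.

```python
import numpy as np


class SineKANLayer:
    """Hypothetical sketch of a KAN layer with sine edge activations."""

    def __init__(self, in_dim, out_dim, grid_size=8, seed=0):
        rng = np.random.default_rng(seed)
        # Assumed frequency grid: integers 1..grid_size.
        self.freqs = np.arange(1, grid_size + 1, dtype=np.float64)
        # Assumed phase grid: evenly spaced in [0, pi).
        self.phases = np.linspace(0.0, np.pi, grid_size, endpoint=False)
        # One amplitude per (output node, input node, grid point):
        # these are the "re-weighting" coefficients a trainer would learn.
        scale = 1.0 / (in_dim * grid_size)
        self.weights = rng.normal(0.0, scale, size=(out_dim, in_dim, grid_size))
        self.bias = np.zeros(out_dim)

    def __call__(self, x):
        # x has shape (batch, in_dim).
        # basis[b, i, g] = sin(freqs[g] * x[b, i] + phases[g])
        basis = np.sin(x[:, :, None] * self.freqs + self.phases)
        # Edge activation phi[o, i](x_i) = sum_g weights[o, i, g] * basis[., i, g];
        # each output node then sums its incoming edges over i.
        return np.einsum("big,oig->bo", basis, self.weights) + self.bias


if __name__ == "__main__":
    layer = SineKANLayer(in_dim=784, out_dim=10)
    batch = np.random.default_rng(1).normal(size=(4, 784))
    print(layer(batch).shape)
```

Because the sine basis is evaluated for all inputs at once and contracted with the weights in a single `einsum`, the forward pass avoids the recursive basis construction that B-splines require. This sketch does not measure speed and makes no claim about the reported 4-8 times figure.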