The present paper introduces concurrency-driven enhancements to the training algorithm for Kolmogorov-Arnold networks (KANs) based on the Newton-Kaczmarz (NK) method. As prior research indicates, NK-based training for KANs offers state-of-the-art performance in terms of accuracy and training time on relatively large datasets, significantly outperforming classical neural networks based on multilayer perceptrons (MLPs). Although some elements of the algorithm can be parallelised (in particular, the evaluation of the basis functions' values), a major limitation is the sequential application of the parameter updates, which has remained unresolved until now. Nevertheless, substantial acceleration is achievable. Three complementary strategies are proposed in the present paper: (i) a pre-training procedure tailored to the structure of the NK updates, (ii) training on disjoint subsets of the data followed by model merging, not in the context of federated learning, but as a mechanism for accelerating convergence, and (iii) a parallelisation technique suitable for execution on field-programmable gate arrays (FPGAs), implemented and tested directly on the device. All experimental results presented in this work are fully reproducible, with the complete source code available online.
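To make the sequential-update bottleneck mentioned above concrete, the sketch below shows one sweep of the classical Kaczmarz iteration for a linear system. This is a generic illustration of the Kaczmarz projection step, not the paper's NK training code: each row's update depends on the iterate produced by the previous row, which is why such updates cannot be trivially parallelised.

```python
import numpy as np

def kaczmarz_sweep(A, b, x):
    """One full sweep of the classical Kaczmarz method for A x = b.

    Each step projects x onto the hyperplane of a single equation
    a_i . x = b_i, and it reads the x just written by the previous
    step, so the row updates are inherently sequential.
    """
    for a_i, b_i in zip(A, b):
        # Projection of x onto {y : a_i . y = b_i}.
        x = x + (b_i - a_i @ x) / (a_i @ a_i) * a_i
    return x

# Toy consistent system: repeated sweeps converge to the solution.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 4.0 * np.eye(4)  # well-conditioned
b = A @ np.ones(4)                                  # exact solution: all ones
x = np.zeros(4)
for _ in range(500):
    x = kaczmarz_sweep(A, b, x)
print(np.allclose(x, np.ones(4), atol=1e-6))
```

The three strategies in the paper attack this dependency chain from different angles (better initialisation, data partitioning with model merging, and FPGA-level parallelism) rather than breaking the per-update sequentiality itself.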