Monotonic Kolmogorov-Arnold Networks: A Theoretical and Empirical Study of Monotonicity as an Inductive Bias

Monotonicity is a long-standing architectural inductive bias for neural networks, motivated by tabular, scientific, and economic settings where outputs are known to respond monotonically to certain inputs. Existing approaches are MLP- or flow-based and lack per-edge functional transparency; the only Kolmogorov--Arnold Network (KAN) variant with monotonicity, MonoKAN, enforces the constraint only on a restricted parameter subset and requires a projection-style training procedure. We close this gap with \textbf{MKAN}, a KAN whose hard monotonicity is guaranteed for \emph{all} parameter values via exponential reparameterization of B-spline coefficients, positive edge weights, and a monotone base activation. Training reduces to standard unconstrained gradient descent. Our headline theoretical contribution is a \emph{representation-cost} theorem: any $C^K$ ($K > 0$) feature extractor inducing a ball-shaped semantic-neighborhood partition admits a monotone realization of the equivalent neighborhood structure at $N' = N^* + k \le 2N^*$, where $k$ is the number of non-monotone coordinates of the original. The bound is architecture-agnostic and gives a principled sizing rule for monotone encoders. Empirically, MKAN is competitive with state-of-the-art monotone neural networks on the SMM/ICML-2024 benchmark while being the only method that combines hard monotonicity under unconstrained training with KAN's per-edge functional transparency. The $2N^*$ prediction is validated in a self-supervised feature-size sweep on four real datasets, and on a controlled monotone-generative dataset MKAN recovers ground-truth factors with substantially higher Spearman alignment than KAN, MLP, and linear baselines.
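To make the three ingredients named above concrete (exponentially reparameterized spline coefficients, positive edge weights, and a monotone base activation), the following is a minimal sketch of a single monotone edge function. It is an illustration under stated assumptions, not the MKAN implementation: the function name `monotone_edge`, the parameter names, the use of a degree-1 B-spline (which coincides with linear interpolation at the knots), and the choice of `tanh` as the monotone base activation are all hypothetical.

```python
import numpy as np

def monotone_edge(x, theta, log_w_base, log_w_spline, grid):
    """Hypothetical nondecreasing edge function, monotone for any real parameters."""
    # Exponential reparameterization: each increment exp(theta_i) is strictly
    # positive, so the cumulative sums form an increasing coefficient sequence.
    coeffs = np.cumsum(np.exp(theta))
    # A degree-1 B-spline with coefficients at the knots equals linear
    # interpolation; increasing coefficients give a nondecreasing spline.
    spline = np.interp(x, grid, coeffs)
    # Assumed monotone base activation (the source does not say which one is used).
    base = np.tanh(x)
    # Positive edge weights via exp(.) preserve monotonicity of the sum.
    return np.exp(log_w_base) * base + np.exp(log_w_spline) * spline

rng = np.random.default_rng(0)
grid = np.linspace(-2.0, 2.0, 8)      # knot locations (assumed uniform)
theta = rng.normal(size=grid.size)    # unconstrained spline parameters
xs = np.linspace(-3.0, 3.0, 200)
ys = monotone_edge(xs, theta, rng.normal(), rng.normal(), grid)
is_monotone = bool(np.all(np.diff(ys) >= 0))
```

Because every parameter is passed through `exp` before it can affect the sign of a slope, no projection or clipping step is needed during training in this sketch: any gradient update on `theta`, `log_w_base`, or `log_w_spline` leaves the edge function nondecreasing.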
