Linear Mode Connectivity (LMC) refers to the phenomenon that performance remains consistent along the linear interpolation between two models in parameter space. For pairs of models optimized independently from different random initializations, achieving LMC is considered crucial both for validating the stable success of non-convex optimization in modern machine learning models and for facilitating practical parameter-based operations such as model merging. While LMC has been achieved for neural networks by accounting for the permutation invariance of neurons in each hidden layer, its attainment for other models remains an open question. In this paper, we achieve LMC for the first time for soft tree ensembles, which are tree-based differentiable models used extensively in practice. We show that, in addition to the permutation invariance of trees, it is necessary to incorporate two invariances: subtree flip invariance and splitting order invariance, which do not exist in neural networks but are inherent to tree architectures. Moreover, we demonstrate that it is even possible to exclude such additional invariances while keeping LMC by designing decision list-based tree architectures, where such invariances do not exist by definition. Our findings indicate the significance of accounting for architecture-specific invariances in achieving LMC.
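As a rough illustration of why subtree flip invariance matters for linear interpolation, the sketch below builds a depth-1 soft tree and its flipped copy. It assumes sigmoid routing at the split, which the abstract does not specify, and all names (`soft_stump`, `flip`, `interpolate`) are hypothetical helpers written for this example, not the paper's implementation. The two parameter sets compute the same function, yet their linear midpoint collapses to a constant predictor.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_stump(x, w, b, leaf_left, leaf_right):
    """Depth-1 soft tree (assumed form): route left with probability sigmoid(x.w + b)."""
    p_left = sigmoid(x @ w + b)
    return p_left * leaf_left + (1.0 - p_left) * leaf_right

def flip(w, b, leaf_left, leaf_right):
    """Subtree flip: negate the split parameters and swap the two children."""
    return -w, -b, leaf_right, leaf_left

def interpolate(params_a, params_b, alpha):
    """Linear interpolation in parameter space, applied parameter by parameter."""
    return tuple((1.0 - alpha) * pa + alpha * pb
                 for pa, pb in zip(params_a, params_b))

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                    # five inputs with three features
params = (rng.normal(size=3), 0.3, 1.0, -2.0)  # (w, b, leaf_left, leaf_right)
flipped = flip(*params)

# The flipped stump is functionally identical because sigmoid(-z) = 1 - sigmoid(z).
same_function = np.allclose(soft_stump(x, *params), soft_stump(x, *flipped))

# Naive interpolation between the two copies: at alpha = 0.5 the split
# parameters cancel to zero and both leaves average to the same value.
midpoint = interpolate(params, flipped, 0.5)
midpoint_output = soft_stump(x, *midpoint)

print(same_function)
print(midpoint_output)
```

In this toy setting, undoing the flip before interpolating would make the two parameter sets coincide, which is the sense in which an architecture-specific invariance has to be accounted for before linear interpolation can preserve performance.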