This paper proposes a novel paradigm for machine learning that moves beyond traditional parameter optimization. Unlike conventional approaches that search for optimal parameters within a fixed geometric space, our core idea is to treat the model itself as a malleable geometric entity. Specifically, we optimize the metric tensor field on a manifold with a predefined topology, thereby dynamically shaping the geometric structure of the model space. To achieve this, we construct a variational framework whose loss function carefully balances data fidelity against the intrinsic geometric complexity of the manifold. The former ensures the model effectively explains observed data, while the latter acts as a regularizer, penalizing overly curved or irregular geometries to encourage simpler models and prevent overfitting. To address the computational challenges of this infinite-dimensional optimization problem, we introduce a practical method based on discrete differential geometry: the continuous manifold is discretized into a triangular mesh, and the metric tensor is parameterized by edge lengths, enabling efficient optimization with standard automatic differentiation tools. Theoretical analysis reveals a profound analogy between our framework and the Einstein-Hilbert action in general relativity, providing an elegant physical interpretation for the concept of "data-driven geometry". We further argue that even with fixed topology, metric optimization offers significantly greater expressive power than fixed-geometry models. This work lays a foundation for constructing fully dynamic "meta-learners" capable of autonomously evolving their geometry and topology, and suggests broad applications in scientific model discovery and robust representation learning.
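The variational recipe above — discretize the manifold into a triangular mesh, treat edge lengths as the metric parameters, and minimize data fidelity plus a geometric-complexity penalty — can be sketched in miniature. The following toy is an illustration under our own assumptions, not the paper's implementation: it uses a single closed fan of four triangles around one "hub" vertex, the squared angle defect at that vertex as the discrete curvature penalty, made-up target distances as the data term, and central finite differences in place of an automatic-differentiation library.

```python
import math

def hub_angle(r1, r2, s):
    # Interior angle at the hub vertex of a triangle with spoke
    # lengths r1, r2 and opposite rim length s (law of cosines).
    x = (r1*r1 + r2*r2 - s*s) / (2.0*r1*r2)
    return math.acos(max(-1.0, min(1.0, x)))  # clamp for numerical safety

def loss(lengths, targets, lam=1.0):
    # lengths/targets: 4 spoke lengths then 4 rim lengths of a closed
    # triangle fan around one interior ("hub") vertex.
    spokes, rims = lengths[:4], lengths[4:]
    # Data fidelity: squared mismatch against observed distances.
    fidelity = sum((l - t)**2 for l, t in zip(lengths, targets))
    # Geometric complexity: squared angle defect at the hub, a discrete
    # notion of curvature; zero defect means the metric is locally flat.
    total = sum(hub_angle(spokes[i], spokes[(i + 1) % 4], rims[i])
                for i in range(4))
    defect = 2.0*math.pi - total
    return fidelity + lam*defect*defect

def grad(lengths, targets, h=1e-6):
    # Central finite differences stand in for autodiff in this sketch.
    g = []
    for i in range(len(lengths)):
        up, dn = list(lengths), list(lengths)
        up[i] += h
        dn[i] -= h
        g.append((loss(up, targets) - loss(dn, targets)) / (2.0*h))
    return g

# Shape the metric (the edge lengths) by gradient descent on the loss.
targets = [1.0]*4 + [1.3]*4   # hypothetical observed pairwise distances
lengths = [1.2]*8             # feasible initial metric
for _ in range(1000):
    g = grad(lengths, targets)
    lengths = [l - 0.02*gi for l, gi in zip(lengths, g)]
```

Here the regularization weight (the `lam` parameter, our naming) controls the trade-off the abstract describes: a large value drives the angle defect toward zero, flattening the discrete metric even when the observed distances would prefer a curved one.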