Implicit models, an emerging model class, compute outputs by iterating a single parameter block to a fixed point. This architecture realizes an infinite-depth, weight-tied network that trains with constant memory, significantly reducing memory requirements relative to explicit models at the same level of performance. While it is empirically known that these compact models can often match or even exceed the accuracy of larger explicit networks by allocating more test-time compute, the underlying mechanism remains poorly understood. We study this gap through a nonparametric analysis of expressive power. We provide a rigorous mathematical characterization, showing that a simple and regular implicit operator can, through iteration, progressively express more complex mappings. We prove that, for a broad class of implicit models, this process lets the model's expressive power scale with test-time compute, ultimately matching a much richer function class. We validate the theory across four domains (image reconstruction, scientific computing, operations research, and LLM reasoning) and show that, as test-time iterations increase, the complexity of the learned mapping rises while solution quality improves and stabilizes.
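To make the fixed-point computation concrete, here is a minimal sketch of how an implicit model's forward pass can be run for a chosen number of test-time iterations. It is an illustration under stated assumptions, not the model studied in the paper: the toy operator z -> tanh(W z + U x + b), the names `implicit_forward` and `make_toy_model`, and the choice to rescale W to spectral norm 0.5 (so the iteration is a contraction and is guaranteed to converge) are all hypothetical.

```python
import numpy as np


def implicit_forward(x, W, U, b, num_iters):
    """Iterate z <- tanh(W z + U x + b) from z = 0 for num_iters steps.

    The same weights (W, U, b) are reused at every step, so the unrolled
    computation behaves like a weight-tied network of depth num_iters.
    Returns the final state and the per-step residuals ||z_next - z||.
    """
    z = np.zeros(W.shape[0])
    residuals = []
    for _ in range(num_iters):
        z_next = np.tanh(W @ z + U @ x + b)
        residuals.append(float(np.linalg.norm(z_next - z)))
        z = z_next
    return z, residuals


def make_toy_model(dim_z=8, dim_x=4, contraction=0.5, seed=0):
    """Build a random toy operator (hypothetical, for illustration only)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((dim_z, dim_z))
    # Rescale W so its spectral norm equals `contraction` < 1. Because tanh
    # is 1-Lipschitz, the update is then a contraction in z.
    W *= contraction / np.linalg.norm(W, 2)
    U = rng.standard_normal((dim_z, dim_x))
    b = rng.standard_normal(dim_z)
    return W, U, b


if __name__ == "__main__":
    W, U, b = make_toy_model()
    x = np.ones(4)
    # More test-time iterations bring the state closer to the fixed point.
    for k in (1, 5, 20, 50):
        _, residuals = implicit_forward(x, W, U, b, k)
        print(f"iterations={k:3d}  last residual={residuals[-1]:.3e}")
```

In this toy setting the residual shrinks by at least a factor of two per step, so the iteration count acts as a direct knob on test-time compute. The sketch shows only convergence of the iteration; it does not demonstrate the expressive-power result stated in the abstract.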