Deep learning models have become increasingly computationally intensive, requiring substantial resources and time for both training and inference. A significant contributing factor is the uniform computational effort expended on each input example, regardless of its complexity. We introduce \textbf{DynaLay}, an alternative architecture that features a decision-making agent to adaptively select the most suitable layers for processing each input, thereby endowing the model with a remarkable level of introspection. DynaLay reevaluates more complex inputs during inference, adjusting the computational effort to optimize both performance and efficiency. The core of the system is a main model equipped with Fixed-Point Iterative (FPI) layers, capable of accurately approximating complex functions, paired with an agent that chooses these layers or a direct action based on introspection of the model's inner state. The model invests more time in processing harder examples, while easier ones require minimal computation. This introspective approach is a step toward developing deep learning models that ``think'' and ``ponder'', rather than ``ballistically'' produce answers. Our experiments demonstrate that DynaLay achieves accuracy comparable to conventional deep models while significantly reducing computational demands.