Deep neural networks are typically trained by uniformly sampling large datasets across epochs, despite evidence that not all samples contribute equally throughout learning. Recent work shows that progressively reducing the amount of training data can improve efficiency and generalization, but existing methods rely on fixed schedules that do not adapt during training. In this work, we propose Adaptive Data Dropout, a simple framework that dynamically adjusts the subset of training data based on performance feedback. Inspired by self-regulated learning, our approach treats data selection as an adaptive process, increasing or decreasing data exposure in response to changes in training accuracy. We introduce a lightweight stochastic update mechanism that modulates the dropout schedule online, allowing the model to balance exploration and consolidation over time. Experiments on standard image classification benchmarks show that our method reduces effective training steps while maintaining competitive accuracy compared to static data dropout strategies. These results highlight adaptive data selection as a promising direction for efficient and robust training. Code will be released.
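The abstract does not spell out the update rule, so the following is only a minimal sketch of what an accuracy-driven, stochastic adjustment of the retained data fraction could look like. Every name and parameter here (`update_keep_fraction`, `sample_subset`, `step`, `p_update`, `tol`, the bounds) is a hypothetical choice made for illustration, and the direction of the update (shrink the subset while training accuracy improves, grow it when accuracy stalls or falls) is an assumption rather than the authors' stated method.

```python
import random


def update_keep_fraction(keep_frac, prev_acc, curr_acc, rng,
                         step=0.05, p_update=0.5, tol=1e-3,
                         min_frac=0.1, max_frac=1.0):
    """Return the next fraction of the dataset to keep (hypothetical rule).

    Assumption: improving training accuracy lets us expose less data,
    while a plateau or drop calls for more data.
    """
    # Stochastic gate: the schedule is only changed with probability p_update.
    if rng.random() >= p_update:
        return keep_frac

    delta = curr_acc - prev_acc
    if delta > tol:
        keep_frac -= step  # accuracy improving: drop more data
    else:
        keep_frac += step  # accuracy stalled or fell: restore data

    # Clamp to the allowed range so the subset is never empty or oversized.
    return min(max_frac, max(min_frac, keep_frac))


def sample_subset(num_samples, keep_frac, rng):
    """Uniformly sample the indices to train on for the next epoch."""
    k = max(1, round(num_samples * keep_frac))
    return sorted(rng.sample(range(num_samples), k))


if __name__ == "__main__":
    rng = random.Random(0)
    keep_frac = 1.0
    # Made-up accuracy trace, used only to exercise the update rule.
    accuracies = [0.40, 0.55, 0.63, 0.63, 0.62, 0.70]
    for prev_acc, curr_acc in zip(accuracies, accuracies[1:]):
        keep_frac = update_keep_fraction(keep_frac, prev_acc, curr_acc, rng)
        indices = sample_subset(1000, keep_frac, rng)
        print(f"keep_frac={keep_frac:.2f}, subset size={len(indices)}")
```

In a real training loop the returned indices would feed a subset sampler for the next epoch. Uniform subset sampling is the simplest option, and a per-sample importance score could replace it without changing the feedback rule.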