When feedback is absorbed faster than task structure can be evaluated, the learner will favor feedback over truth. A two-timescale model shows that this feedback-truth gap is inevitable whenever the two rates differ and vanishes only when they match. We test this prediction in three systems: neural networks trained with noisy labels (30 datasets, 2,700 runs), human probabilistic reversal learning (N = 292), and human reward/punishment learning with concurrent EEG (N = 25). In each system, truth is defined operationally: held-out labels, the objectively correct option, or the participant's pre-feedback expectation, which is the only non-circular reference decodable from post-feedback EEG. The gap appeared in all three systems but was regulated differently: dense networks accumulated it as memorization; sparse-residual scaffolding suppressed it; humans generated transient over-commitment that was actively recovered. Neural over-commitment (~0.04-0.10) was amplified tenfold into behavioral commitment (d = 3.3-3.9). The gap is a fundamental constraint on learning under noisy supervision; its consequences depend on the regulation each system employs.
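The two-timescale claim can be illustrated with a minimal sketch. It assumes, purely for illustration, that feedback absorption and structure evaluation are both simple exponential relaxations with their own rates; the function name `feedback_truth_gap`, its parameters, and the exponential form are hypothetical and are not taken from the model used in this work. Under that assumption, the gap at any finite time is positive when feedback is absorbed faster and exactly zero when the rates match.

```python
import math


def feedback_truth_gap(rate_feedback, rate_truth, t):
    """Toy gap between absorbed feedback and evaluated structure at time t.

    ASSUMPTION: both processes relax exponentially toward 1. This is an
    illustrative stand-in, not the two-timescale model from the paper.
    """
    absorbed = 1.0 - math.exp(-rate_feedback * t)   # fast process: feedback
    evaluated = 1.0 - math.exp(-rate_truth * t)     # slow process: task structure
    return absorbed - evaluated


# Feedback absorbed ten times faster than structure is evaluated: positive gap.
gap_fast_feedback = feedback_truth_gap(rate_feedback=1.0, rate_truth=0.1, t=2.0)

# Matched rates: the gap is zero.
gap_matched = feedback_truth_gap(rate_feedback=0.5, rate_truth=0.5, t=2.0)

print(gap_fast_feedback, gap_matched)
```

In this toy form the gap is exp(-rate_truth * t) - exp(-rate_feedback * t), so its sign follows the ordering of the two rates. The sketch does not capture how each system regulates the gap (memorization, suppression, or recovery).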