How much have students' everyday learning processes shifted in response to generative AI, and how does that shift affect their durable learning outcomes? Self-report surveys show little change, while small-scale behavioral studies report widespread AI use but lack the scale or duration to measure learning consequences. We address both questions using a ten-year panel of $3.2$ million ALEKS learning interactions to study time-on-task, complemented by ALEKS PPL placement-assessment data to examine proctoring and learning outcomes. Our quasi-experimental design exploits variation between tasks that are more susceptible to AI (text-based word problems) and tasks that are less susceptible to AI (interactive graph-based problems). Among college students, learning time on AI-susceptible problems declines $2.8\%$ per quarter after ChatGPT's release, cumulating to $26.9\%$ over eleven quarters; high-schoolers show a $31.3\%$ decline, middle-schoolers a $9.0\%$ decline, and Grade 5 students no detectable change. Among college students, the post-ChatGPT divergence vanishes entirely under proctoring, which rules out broad efficiency gains as the likely explanation. Logistic fixed-effects models on randomly assigned proctored retention items yield a $25\%$ cumulative decline in the odds of a correct response; the same estimator on non-proctored assessments produces a large effect of the opposite sign (an increase), a pattern inconsistent with any platform, cohort, or curriculum explanation. These results provide some of the first large-scale behavioral and outcome evidence that generative AI has altered how students study and the knowledge they build. This pattern is the population-level indicator of \emph{cognitive surrender} and has direct implications for educational research, assessment governance, and AI policy.
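As a rough consistency check on the reported figures, the sketch below compares two ways a per-quarter decline could accumulate over eleven quarters. It assumes the cumulative figure comes from geometric compounding of the quarterly rate, which the abstract does not state; the function names are hypothetical and the inputs are the rounded values quoted above.

```python
def compounded_decline(rate_per_quarter: float, quarters: int) -> float:
    """Cumulative proportional decline if the quarterly decline compounds.

    Assumption (not stated in the abstract): each quarter shrinks the
    remaining learning time by the same proportion.
    """
    return 1.0 - (1.0 - rate_per_quarter) ** quarters


def linear_decline(rate_per_quarter: float, quarters: int) -> float:
    """Cumulative decline if quarterly declines simply add up."""
    return rate_per_quarter * quarters


# Rounded values quoted in the abstract for college students.
RATE = 0.028
QUARTERS = 11

compounded = compounded_decline(RATE, QUARTERS)
linear = linear_decline(RATE, QUARTERS)

print(f"compounded over {QUARTERS} quarters: {compounded:.1%}")
print(f"linear over {QUARTERS} quarters:     {linear:.1%}")
```

Under this assumption the compounded figure comes out at roughly $26.8\%$, close to the reported $26.9\%$ (the gap is plausibly rounding of the quarterly rate), whereas simple addition would give about $30.8\%$.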