The Digital Divide in Generative AI: Evidence from Large Language Model Use in College Admissions Essays

Large language models (LLMs) have become popular writing tools among students and may expand access to high-quality feedback for those with limited traditional writing support. At the same time, LLMs may standardize student voice or invite overreliance. This study examines how adoption of LLM-assisted writing varies across socioeconomic groups and how it relates to outcomes in a high-stakes context: U.S. college admissions. We analyze a de-identified longitudinal dataset of applications to a selective university from 2020 to 2024 (N = 81,663). Estimating LLM use with a distribution-based detector trained on synthetic and historical essays, we track how student writing changed as LLM use proliferated, how adoption differed by socioeconomic status (SES), and whether potential benefits translated equitably into admissions outcomes. Using fee-waiver status as a proxy for SES, we observe post-2023 convergence in surface-level linguistic features, with the largest changes among fee-waived and rejected applicants. Estimated LLM use rose sharply in 2024 across all groups, with disproportionately larger increases among lower-SES applicants, consistent with an access hypothesis in which LLMs substitute for scarce writing support. However, increased estimated LLM use was more strongly associated with declines in predicted admission probability for lower-SES applicants than for higher-SES applicants, even after controlling for academic credentials and stylometric features. These findings raise concerns about equity and the validity of essay-based evaluation in an era of AI-assisted writing, and they provide the first large-scale longitudinal evidence linking LLM adoption, linguistic change, and evaluative outcomes in college admissions.
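
The SES-dependent association described above can be read as a regression with an interaction term. The following is a sketch only: the logistic form, every symbol, and the grouping of controls are assumptions introduced here for illustration, not the specification reported by the study.

```latex
% Hypothetical specification (all symbols are illustrative assumptions):
%   A_i : admission outcome for applicant i (1 = admitted)
%   L_i : estimated LLM use in applicant i's essay
%   F_i : fee-waiver indicator (1 = fee-waived, the lower-SES proxy)
%   X_i : controls (academic credentials, stylometric features)
\operatorname{logit} \Pr(A_i = 1)
  = \beta_0 + \beta_1 L_i + \beta_2 F_i
  + \beta_3 \,(L_i \times F_i)
  + \boldsymbol{\gamma}^{\top} \mathbf{X}_i
```

Under this assumed form, the association between estimated LLM use and the log-odds of admission is \beta_1 for applicants without a fee waiver and \beta_1 + \beta_3 for fee-waived applicants. A stronger negative association for lower-SES applicants would therefore appear as \beta_3 < 0.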
