AI Code in the Wild: Measuring Security Risks and Ecosystem Shifts of AI-Generated Code in Modern Software

Large language models (LLMs) for code generation are becoming integral to modern software development, but their real-world prevalence and security impact remain poorly understood. We present the first large-scale empirical study of AI-generated code (AIGCode) in the wild. We build a high-precision detection pipeline and a representative benchmark to distinguish AIGCode from human-written code, and apply them to (i) development commits from the top 1,000 GitHub repositories (2022-2025) and (ii) 7,000+ recent CVE-linked code changes. This lets us label commits, files, and functions along a human/AI axis and trace how AIGCode moves through projects and vulnerability life cycles. Our measurements show three ecological patterns. First, AIGCode is already a substantial fraction of new code, but adoption is structured: AI concentrates in glue code, tests, refactoring, documentation, and other boilerplate, while core logic and security-critical configurations remain mostly human-written. Second, adoption has security consequences: some CWE families are overrepresented in AI-tagged code, and near-identical insecure templates recur across unrelated projects, suggesting "AI-induced vulnerabilities" propagated by shared models rather than shared maintainers. Third, in human-AI edit chains, AI introduces high-throughput changes while humans act as security gatekeepers; when review is shallow, AI-introduced defects persist longer, remain exposed on network-accessible surfaces, and spread to more files and repositories. We will open-source the complete dataset and release our analysis artifacts along with fine-grained documentation of our methodology and findings.
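
To make the notion of a CWE family being "overrepresented" in AI-tagged code concrete, the following is a minimal sketch, not the study's actual analysis. It assumes each CVE-linked change carries a hypothetical `(origin, cwe)` label pair. For each CWE, it computes the ratio of that CWE's share among AI-tagged changes to its share among human-tagged changes. The function name, the label values, and the toy records are illustrative assumptions, not data from the study.

```python
from collections import Counter


def overrepresentation(records):
    """Return {cwe: share_in_ai / share_in_human} for CWEs seen in both groups.

    `records` is an iterable of (origin, cwe) pairs, where origin is the
    hypothetical label "ai" or "human". A ratio above 1 means the CWE makes
    up a larger share of AI-tagged changes than of human-tagged ones.
    """
    counts = {"ai": Counter(), "human": Counter()}
    for origin, cwe in records:
        counts[origin][cwe] += 1

    # Total number of labeled changes per origin, used to turn counts into shares.
    totals = {origin: sum(c.values()) for origin, c in counts.items()}

    ratios = {}
    # Only CWEs present in both groups have a well-defined ratio.
    for cwe in counts["ai"].keys() & counts["human"].keys():
        ai_share = counts["ai"][cwe] / totals["ai"]
        human_share = counts["human"][cwe] / totals["human"]
        ratios[cwe] = ai_share / human_share
    return ratios


# Toy, made-up labels purely for illustration (not results from the study).
toy = [
    ("ai", "CWE-79"), ("ai", "CWE-79"), ("ai", "CWE-89"), ("ai", "CWE-22"),
    ("human", "CWE-79"), ("human", "CWE-89"), ("human", "CWE-89"), ("human", "CWE-22"),
]
ratios = overrepresentation(toy)
for cwe, ratio in sorted(ratios.items()):
    print(cwe, ratio)
```

A share ratio like this is only a descriptive starting point. A real analysis would also need a significance test and some handling of detector misclassification, both of which this sketch omits.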
