Recent advances in AI, machine learning, and NLP have led to the development of a new generation of Large Language Models (LLMs) that are trained on massive amounts of data and often have trillions of parameters. Commercial applications (e.g., ChatGPT) have made this technology available to the general public, making it possible to use LLMs to produce high-quality texts for academic and professional purposes. Schools and universities are aware of the increasing use of AI-generated content by students and have been researching the impact of this new technology and its potential misuse. Educational programs in Computer Science (CS) and related fields are particularly affected because LLMs are also capable of generating code in various programming languages. To help understand the potential impact of publicly available LLMs in CS education, we introduce CSEPrompts, a framework with hundreds of programming exercise prompts and multiple-choice questions retrieved from introductory CS and programming courses. We also provide experimental results on CSEPrompts to evaluate the performance of several LLMs with respect to generating Python code and answering basic computer science and programming questions.