In computer science education, test cases are an integral part of programming assignments: they serve as assessment items that test students' programming knowledge and as a basis for personalized feedback on student-written code. Our goal is a fully automated approach to test case generation that accurately measures student knowledge. Automation matters for two reasons. First, manually constructing test cases requires expert knowledge and is labor-intensive. Second, developing test cases for students, especially novice programmers, differs significantly from developing them for professional software developers. We therefore need an automated process for generating test cases that assess student knowledge and support feedback. In this work, we propose a large language model-based approach that automatically generates test cases, and we show that these test cases are good measures of student knowledge, using a publicly available dataset of student-written Java code. We also discuss future research directions centered on using test cases to help students.