Automated code evaluation systems (AESs) are designed primarily to assess user-submitted code reliably. Their wide range of applications and the valuable resources they accumulate have made them increasingly popular. However, research on how AESs are applied and how their real-world resources can be used for diverse coding tasks is still lacking. In this study, we present a comprehensive survey of AESs and their resources, covering their application areas, the resources they make available, and the use of those resources for coding tasks. We categorize AESs by application into programming contests, programming learning and education, recruitment, online compilers, and additional modules. We explore the datasets and other resources that these systems offer for research, analysis, and coding tasks. Moreover, we provide an overview of machine learning-driven coding tasks performed on real-life datasets, such as bug detection, code review, comprehension, refactoring, search, representation, and repair. In addition, we briefly discuss the Aizu Online Judge (AOJ) platform as a real example of an AES from the perspectives of system design (hardware and software), operation (competition and education), and research. AOJ was chosen because of its scalability (programming education, competitions, and practice), open internal features (hardware and software), attention from the research community, open source data (e.g., solution codes and submission documents), and transparency. We also analyze the overall performance of this system and the perceived challenges over the years.