Referring to solution programs written by other users helps learners in programming education. However, current online judge systems simply list every submitted solution program for reference, sorted by submission date and time, execution time, or user rating, without regard to how useful each program is as a reference. In addition, users struggle to explore a variety of solution approaches because the lists contain too many duplicated and near-duplicated programs. To motivate learners to consult diverse solutions and learn better solution approaches, this paper proposes an approach that deduplicates and ranks the common solution programs for each programming problem. Based on the hypothesis that a program with more duplicates adopts a more common approach and is therefore a more suitable reference, we remove near-duplicated solution programs and rank the remaining unique programs by their duplicate count. Experiments on solution programs submitted to a real-world online judge system show that deduplication reduces the number of programs by 60.20%, whereas the baseline reduces it by only 29.59%, meaning that users need to refer to only 39.80% of the programs on average. Furthermore, our analysis shows that the top-10 ranked programs cover 29.95% of programs on average, indicating that users can grasp 29.95% of the solution approaches by referring to only 10 programs. The proposed approach shows potential for reducing learners' burden of wading through too many solutions and for motivating them to learn a variety of better approaches.
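The deduplicate-then-rank idea can be illustrated with a minimal sketch. The abstract does not say how near-duplicates are detected, so the sketch makes an assumption: two programs count as near-duplicates when they have the same token sequence after comments and layout are dropped and every identifier is replaced by a placeholder. The function names (`normalize`, `dedup_and_rank`, `reduction_rate`, `top_k_coverage`) and the sample programs are hypothetical and are not taken from the paper.

```python
import io
import keyword
import tokenize
from collections import Counter

# Token types ignored during normalization (comments and layout only).
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
         tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER}


def normalize(source: str) -> str:
    """Assumed near-duplicate key: token stream with identifiers renamed to ID."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")          # any non-keyword name becomes a placeholder
        else:
            out.append(tok.string)    # keywords, operators, literals kept as-is
    return " ".join(out)


def dedup_and_rank(programs):
    """Group programs by normalized key; return (representative, count) pairs,
    sorted by duplicate count in descending order."""
    counts = Counter()
    representative = {}
    for src in programs:
        key = normalize(src)
        counts[key] += 1
        representative.setdefault(key, src)   # first program seen represents the group
    return [(representative[key], n) for key, n in counts.most_common()]


def reduction_rate(ranked, total):
    """Fraction of programs removed by deduplication."""
    return 1 - len(ranked) / total


def top_k_coverage(ranked, total, k=10):
    """Fraction of all submitted programs represented by the top-k unique programs."""
    return sum(n for _, n in ranked[:k]) / total


# Hypothetical submissions to one problem ("read an integer, print its double").
programs = [
    "a = int(input())\nprint(a * 2)\n",
    "n = int(input())  # read n\nprint(n * 2)\n",
    "x=int(input())\nprint(x*2)\n",
    "print(int(input()) + int(input()))\n",
    "n = int(input())\nprint(n + n)\n",
]
ranked = dedup_and_rank(programs)
for src, n in ranked:
    print(n, repr(src))
print("reduction rate:", reduction_rate(ranked, len(programs)))
print("top-1 coverage:", top_k_coverage(ranked, len(programs), k=1))
```

In this toy input the first three programs differ only in variable names, spacing, and a comment, so they fall into one group that is ranked first. A real system would likely need a language-aware and more tolerant similarity measure; exact matching on normalized tokens is only the simplest stand-in.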