In software engineering (SE) research and practice, UML is well known as an essential modeling methodology for requirements analysis and software modeling in both academia and industry. In particular, fundamental knowledge of UML modeling and practice in creating high-quality UML models are included in SE-related courses in the undergraduate programs of many universities. As a result, reviewing and grading the large number of UML models created by students is a time-consuming and labor-intensive task for educators. Recent advances in generative AI techniques, such as ChatGPT, have paved new ways to automate many SE tasks. However, current research and tools have seldom explored the capability of ChatGPT to evaluate the quality of UML models. This paper investigates the feasibility and effectiveness of ChatGPT in assessing the quality of UML use case diagrams, class diagrams, and sequence diagrams. First, 11 evaluation criteria with grading details were proposed for these UML models. Next, a series of experiments was designed and conducted on 40 students' UML modeling reports to explore ChatGPT's performance in evaluating and grading these UML diagrams. The findings reveal that ChatGPT performed well on this assessment task: the scores it assigned to the UML models are close to those given by human experts. Three types of evaluation discrepancies between ChatGPT and the human experts were also identified, and these vary across the evaluation criteria used for the different types of UML models.