As Large Language Models (LLMs) become widely used for text generation, concerns are growing over their potential to compromise academic integrity. Educators currently struggle to distinguish student-authored homework assignments from AI-generated ones. This paper addresses the challenge by introducing HowkGPT, a system designed to identify homework assignments generated by AI. HowkGPT is built upon a dataset of academic assignments and accompanying metadata [17] and employs a pretrained LLM to compute perplexity scores for student-authored and ChatGPT-generated responses. These scores are then used to establish a threshold for determining the origin of a submitted assignment. Because academic work is specific to its subject and context, HowkGPT further refines its analysis by defining category-specific thresholds derived from the metadata, which improves detection precision. This study emphasizes the critical need for effective strategies to uphold academic integrity amid the growing influence of LLMs and offers an approach to supporting fair and accurate grading in educational institutions.
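The perplexity-threshold idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy perplexity values, and the accuracy-maximizing rule for choosing a cut-off are all assumptions made for illustration, as is the common working assumption that AI-generated text tends to score lower perplexity than student-written text. In practice, the per-token log-probabilities would come from a pretrained LLM; here they are passed in as plain numbers so the example stays self-contained.

```python
import math
from collections import defaultdict


def perplexity(token_logprobs):
    """Perplexity = exp of the mean negative log-likelihood per token (natural log)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def best_threshold(human_ppl, ai_ppl):
    """Pick the cut-off that maximizes accuracy when texts scoring below it
    are labeled AI-generated. The selection rule is an assumption."""
    scores = sorted(set(human_ppl) | set(ai_ppl))
    # Candidate cut-offs: midpoints between consecutive distinct scores.
    candidates = [(a + b) / 2 for a, b in zip(scores, scores[1:])] or scores

    def accuracy(t):
        correct = sum(p < t for p in ai_ppl) + sum(p >= t for p in human_ppl)
        return correct / (len(human_ppl) + len(ai_ppl))

    return max(candidates, key=accuracy)


def category_thresholds(samples):
    """samples: iterable of (category, label, perplexity), label in {"human", "ai"}."""
    grouped = defaultdict(lambda: {"human": [], "ai": []})
    for category, label, ppl in samples:
        grouped[category][label].append(ppl)
    return {c: best_threshold(g["human"], g["ai"]) for c, g in grouped.items()}


def is_ai_generated(ppl, category, thresholds, default):
    """Flag a submission whose perplexity falls below its category's cut-off."""
    return ppl < thresholds.get(category, default)


# Toy, made-up perplexity scores (hypothetical categories and values).
samples = [
    ("history", "human", 42.0), ("history", "human", 35.0),
    ("history", "ai", 12.0), ("history", "ai", 15.0),
    ("math", "human", 20.0), ("math", "human", 18.0),
    ("math", "ai", 8.0), ("math", "ai", 9.0),
]
thresholds = category_thresholds(samples)
```

With these toy numbers, a submission with perplexity 16 falls above the cut-off learned for the hypothetical "math" category but below the one learned for "history", which is the kind of difference that motivates category-specific thresholds over a single global one.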