We introduce a novel writing method called Probing Chain of Thought (ProCoT), which prevents students from cheating with a Large Language Model (LLM), such as ChatGPT, while enhancing their active learning through such models. LLMs have disrupted education and many other fields. For fear of students cheating, many educators have resorted to banning their use, as their outputs can be human-like and, in some cases, hard to detect. These LLMs are also known for hallucinations (i.e., fabricated facts). We conduct studies with ProCoT in two different courses with a combined total of about 66 students. The students in each course were asked to prompt an LLM of their choice with one question from a set of four and were required to affirm or refute statements in the LLM output using peer-reviewed references. The results show two things: (1) ProCoT stimulates students' creative/critical thinking and writing through engagement with LLMs, as seen when we compare LLM-only output with ProCoT output, and (2) ProCoT can prevent cheating because of clear limitations in existing LLMs, as seen when we compare students' ProCoT output with LLM ProCoT output. We also find that most students prefer to give answers in fewer words than LLMs, which are typically verbose. The average word counts for students, ChatGPT (v3.5) and Phind (v8) are 208, 391 and 383, respectively.