Complementary Advantages of ChatGPTs and Human Readers in Reasoning: Evidence from English Text Reading Comprehension

ChatGPT has shown great power in text processing, including the ability to reason from text it reads. However, no study has directly compared human readers and ChatGPT in reasoning ability related to text reading. This study investigated how ChatGPTs (i.e., ChatGPT and ChatGPT Plus) and Chinese senior school students, as ESL learners, exhibited their reasoning ability when reading English narrative texts. Additionally, we compared the reasoning performance of the two ChatGPTs when the commands were elaborately updated. The study comprised three reasoning tests: Test 1 for commonsense inference, Test 2 for emotional inference, and Test 3 for causal inference. The results showed that in Test 1, the students outperformed the two ChatGPT versions in local-culture-related inferences but performed worse than the chatbots in daily-life inferences. In Test 2, ChatGPT Plus excelled whereas ChatGPT lagged behind in accuracy. On both accuracy and frequency of correct responses, the students were inferior to the two chatbots. Whereas the ChatGPTs performed better on positive emotions, the students were superior at inferring negative emotions. In Test 3, the students demonstrated better logical analysis, outperforming both chatbots. Under the updated-command condition, ChatGPT Plus displayed good causal reasoning ability, while ChatGPT's performance remained unchanged. Our study shows that human readers and ChatGPTs have their respective advantages and disadvantages in drawing inferences during text reading comprehension, pointing to a complementary relationship in text-based reasoning.

