Argument Reconstruction as Supervision for Critical Thinking in LLMs

To think critically, human learners are trained to identify, reconstruct, and evaluate arguments. Argument reconstruction is especially important because it makes an argument's underlying inferences explicit. However, it remains unclear whether LLMs can similarly improve their critical thinking by learning to reconstruct arguments. To address this question, we introduce a holistic framework with three contributions: we (1) propose GAAR, an engine that automatically reconstructs arbitrary arguments, (2) use GAAR to synthesize Arguinas, a new high-quality argument reconstruction dataset, and (3) investigate whether learning argument reconstruction benefits downstream critical thinking tasks. Across seven critical thinking tasks, models trained on argument reconstruction outperform models that are not, with the largest gains observed when training on Arguinas. The source code and dataset will be made publicly available.

