Can AI Review Improve Paper Drafting? An Empirical Study on 20 Computer Architecture Submissions

Artificial intelligence (AI) is advancing research faster than ever, and the number of research papers is growing with it. The exploding volume of AI-generated papers has put a strain on peer review, leading to the use of AI-generated reviews, a practice that is potentially widespread yet covert. This raises ethical concerns about confidentiality, quality, and fairness, and the broader research community has not reached a consensus. We expect the debate to continue for a while, but in the meantime we ask an alternative, practical question: \textit{can AI review improve paper drafting?} We study 20 computer architecture papers, with varying levels of submission lineage, to expose how well AI review aligns with human review, quantified by a set of metrics we define. To conduct the case study, we build a tool with an integrated web UI, \emph{AI-Paper-Review}, that generates a structured AI review of a draft paper, available at https://github.com/unarylab/ai-paper-review. The tool selects several AI reviewers from a diverse pool, then clusters and ranks their comments by commonality and importance. It also aligns AI comments with human comments to facilitate metric-based validation. The case study shows that AI review can cover a significant fraction of human-raised issues, and that it also raises issues missing from human review. This paper is not intended to encourage using AI for peer review at the current stage, but to study (1) how AI review can improve paper drafting and (2) the potential and limitations of AI-based peer review. The release of the tool and the case study data is intended to stimulate future research on this topic. Misuse for peer review would violate the ethics policies of major academic venues.
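The abstract does not spell out the metrics, so the following is only a minimal sketch of what aligning AI comments with human comments and measuring coverage could look like. Everything in it is an assumption for illustration: the function names (`align`, `coverage_and_novelty`), the token-overlap (Jaccard) similarity, the 0.3 threshold, and the sample comments are hypothetical and are not taken from the paper or from the AI-Paper-Review tool.

```python
import re


def tokens(text):
    """Lowercase word tokens; a crude stand-in for a real similarity model."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Token-set overlap between two comments, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def align(ai_comments, human_comments, threshold=0.3):
    """Return (ai_index, human_index) pairs whose similarity meets the threshold."""
    return [
        (i, j)
        for i, a in enumerate(ai_comments)
        for j, h in enumerate(human_comments)
        if jaccard(a, h) >= threshold
    ]


def coverage_and_novelty(ai_comments, human_comments, threshold=0.3):
    """Coverage: share of human comments matched by some AI comment.
    Novelty: share of AI comments that match no human comment."""
    pairs = align(ai_comments, human_comments, threshold)
    matched_ai = {i for i, _ in pairs}
    matched_human = {j for _, j in pairs}
    coverage = len(matched_human) / len(human_comments) if human_comments else 0.0
    novelty = (
        (len(ai_comments) - len(matched_ai)) / len(ai_comments)
        if ai_comments
        else 0.0
    )
    return coverage, novelty


# Hypothetical review comments, invented for this example only.
human = [
    "The evaluation lacks a comparison with prior accelerators",
    "Energy results are missing",
]
ai = [
    "Evaluation lacks comparison with prior accelerators",
    "The writing in section 3 is unclear",
]

coverage, novelty = coverage_and_novelty(ai, human)
```

On this toy input, one of the two human comments is matched, and one of the two AI comments has no human counterpart, which mirrors the two findings in the abstract: AI review covers part of the human-raised issues and also raises issues of its own. A real pipeline would likely need a stronger similarity measure than token overlap.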

