Scientific writing is a challenging task, particularly for novice researchers who often rely on feedback from experienced peers. Recent work has primarily focused on improving surface form and style rather than manuscript content. In this paper, we propose a novel task: automated focused feedback generation for scientific writing assistance. We present SWIF$^{2}$T: a Scientific WrIting Focused Feedback Tool. It is designed to generate specific, actionable and coherent comments, which identify weaknesses in a scientific paper and/or propose revisions to it. Our approach consists of four components (planner, investigator, reviewer and controller), implemented using multiple Large Language Models (LLMs). We compile a dataset of 300 peer reviews citing weaknesses in scientific papers and conduct a human evaluation. The results show that SWIF$^{2}$T's feedback is superior to that of other approaches in specificity, reading comprehension, and overall helpfulness. In our analysis, we also identify cases where automatically generated reviews were judged better than human ones, suggesting opportunities for integrating AI-generated feedback into scientific writing.