While large language models (LLMs) offer promising capabilities for automating academic workflows, existing systems for academic peer review remain constrained by text-only inputs, limited contextual grounding, and a lack of actionable feedback. In this work, we present an interactive web-based system for multimodal, community-aware peer review simulation that enables effective manuscript revision before submission. Our framework integrates textual and visual information through multimodal LLMs, enhances review quality via retrieval-augmented generation (RAG) grounded in web-scale OpenReview data, and converts generated reviews into actionable to-do lists using the proposed Action:Objective[\#] format, yielding structured and traceable guidance. The system integrates seamlessly into existing academic writing platforms, offering interactive interfaces for real-time feedback and revision tracking. Experimental results highlight the effectiveness of the proposed system in generating reviews that are more comprehensive, more useful, and better aligned with expert standards, surpassing ablated baselines and advancing transparent, human-centered scholarly assistance.