Transparent and standardized reporting is essential for reproducible scientific research, yet adherence to reporting guidelines remains inconsistent because of the manual effort required to select and complete checklists. We present CheckSupport, an open-source, locally deployable system that uses large language models to automate both the recommendation of reporting checklists and their evidence-grounded completion for scientific manuscripts. CheckSupport employs a staged prompting strategy that decomposes reporting workflows into constrained inference tasks, prioritizing faithful extraction over generative text synthesis. All inference is performed locally using instruction-tuned models, preserving data privacy and enabling reproducible, auditable workflows. Evaluated on a corpus of peer-reviewed manuscripts, CheckSupport achieved 90% overall accuracy for checklist recommendation and 88% overall accuracy for item-level completion while operating on CPU-only hardware. The average wall-clock time per manuscript was 12.5 seconds, covering both checklist recommendation and full checklist completion. These results demonstrate that large language models, when applied as structured inference components, can reduce reporting burden and support more transparent and reproducible scientific reporting across disciplines.
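To make "evidence-grounded completion" and "faithful extraction over generative text synthesis" concrete, the following is a minimal sketch of one way such a constraint could be enforced; it is not CheckSupport's actual implementation. It assumes the model is asked to return a verbatim quote for a checklist item, and the item is marked as reported only if that quote can be found in the manuscript. The names `ItemResult`, `complete_item`, and the `extract` callable (a stand-in for a local model call) are hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ItemResult:
    item: str                # checklist item text
    reported: bool           # True only if grounded evidence was found
    evidence: Optional[str]  # verbatim quote from the manuscript, if any


def _normalize(text: str) -> str:
    # Collapse whitespace and lowercase so line breaks do not defeat matching.
    return re.sub(r"\s+", " ", text).strip().lower()


def complete_item(
    item: str,
    manuscript: str,
    extract: Callable[[str, str], Optional[str]],
) -> ItemResult:
    """Complete one checklist item, accepting only quotes found in the text.

    `extract` is a hypothetical stand-in for a local model call that returns
    a candidate quote (or None) for the given item.
    """
    quote = extract(item, manuscript)
    if quote:
        needle = _normalize(quote)
        # Reject anything the model produced that is not in the manuscript.
        if needle and needle in _normalize(manuscript):
            return ItemResult(item, True, quote)
    return ItemResult(item, False, None)
```

Under this assumption, a fabricated or paraphrased quote is discarded rather than written into the checklist, which keeps each completed item traceable to a passage in the manuscript.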