Reference lists in scholarly manuscripts frequently contain errors, including incorrect identifiers, incomplete metadata, misattributed authors, and mismatches between preprint and published versions. These problems are tedious to repair manually and have become more visible in workflows that rely on large language models, which can fabricate or corrupt citations. We present citecheck, a TypeScript system and MCP server for automated bibliographic verification and repair in paper-like project folders. Given a manuscript file or workspace, citecheck selects the most likely paper artifact, extracts references from .bib, .tex, .md, .txt, or .docx, validates entries against PubMed, Crossref, arXiv, and Semantic Scholar, and returns structured correction proposals together with replacement-safety diagnostics. The current repository provides a working research prototype with multi-pass retrieval, manifestation-aware matching, policy-gated rewrite planning, and 47 passing tests covering repair behavior, malformed payload handling, transport failures, and MCP exposure. We position citecheck as infrastructure for agentic scholarly editing and as a practical guardrail against both traditional reference errors and LLM-induced citation hallucinations.
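The matching and policy-gating step described above can be illustrated with a minimal sketch. It is not taken from the citecheck codebase: every type and function name here (`Reference`, `Candidate`, `Proposal`, `propose`), the Jaccard title similarity, and the 0.8 threshold are assumptions chosen for illustration. Candidates are passed in as plain data, so no calls to PubMed, Crossref, arXiv, or Semantic Scholar are made.

```typescript
// Hypothetical sketch: none of these names come from the citecheck codebase.
type Source = "pubmed" | "crossref" | "arxiv" | "semanticscholar";

interface Reference {
  key: string;
  title: string;
  year?: number;
  doi?: string;
}

interface Candidate {
  source: Source;
  title: string;
  year?: number;
  doi?: string;
  isPreprint: boolean;
}

interface Proposal {
  key: string;
  action: "keep" | "replace" | "review";
  candidate?: Candidate;
  score: number;
  reasons: string[];
}

// Lowercase, strip punctuation, and split into a set of word tokens.
function tokens(s: string): Set<string> {
  const words = s.toLowerCase().replace(/[^a-z0-9\s]/g, " ").split(/\s+/);
  return new Set(words.filter((w) => w.length > 0));
}

// Jaccard similarity over title tokens (an assumed, deliberately simple metric).
function titleSimilarity(a: string, b: string): number {
  const ta = tokens(a);
  const tb = tokens(b);
  if (ta.size === 0 || tb.size === 0) return 0;
  let shared = 0;
  ta.forEach((t) => {
    if (tb.has(t)) shared++;
  });
  return shared / (ta.size + tb.size - shared);
}

// Compare DOIs case-insensitively and without a resolver URL prefix.
function normalizeDoi(doi?: string): string {
  return (doi ?? "").toLowerCase().replace(/^https?:\/\/(dx\.)?doi\.org\//, "");
}

// Pick the best-matching candidate, then gate the rewrite behind a simple policy.
function propose(ref: Reference, candidates: Candidate[], minScore = 0.8): Proposal {
  let best: Candidate | undefined;
  let bestScore = 0;
  for (const c of candidates) {
    const s = titleSimilarity(ref.title, c.title);
    if (s > bestScore) {
      best = c;
      bestScore = s;
    }
  }
  if (best === undefined || bestScore < minScore) {
    return { key: ref.key, action: "review", score: bestScore, reasons: ["no candidate above threshold"] };
  }

  const reasons: string[] = [];
  if (normalizeDoi(ref.doi) !== normalizeDoi(best.doi)) reasons.push("doi differs");
  if (ref.year !== undefined && best.year !== undefined && ref.year !== best.year) {
    reasons.push("year differs");
  }
  if (reasons.length === 0) {
    return { key: ref.key, action: "keep", candidate: best, score: bestScore, reasons };
  }

  // Assumed policy gate: a preprint record never silently replaces an entry,
  // because the author may have intended to cite the published version.
  if (best.isPreprint) {
    reasons.push("best match is a preprint");
    return { key: ref.key, action: "review", candidate: best, score: bestScore, reasons };
  }
  return { key: ref.key, action: "replace", candidate: best, score: bestScore, reasons };
}
```

Under these assumptions, `propose` returns `"keep"` when the best match agrees with the entry, `"replace"` when a non-preprint match differs in DOI or year, and `"review"` when no candidate clears the threshold or the only strong match is a preprint. A real system would likely also weigh authors and venue, and compare identifiers beyond the DOI.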