Traditional bibliography databases require users to navigate search forms and manually copy citation data. Language models offer an alternative: a natural-language interface in which researchers write text containing informal citation fragments that are automatically resolved to proper references. However, language models are unreliable for scholarly work because they generate fabricated (hallucinated) citations at substantial rates. We present an architectural approach, implemented through the Model Context Protocol (MCP), that combines the natural-language interface of LLM chatbots with the accuracy of direct database access. Our system enables language models to search bibliographic databases, perform fuzzy matching, and export verified entries, all through conversational interaction. A key architectural principle is that the final data export bypasses the language model: entries are fetched directly from authoritative sources, with timeout protection, to guarantee accuracy. We demonstrate this approach with MCP-DBLP, a server providing access to the DBLP computer science bibliography. The system transforms form-based bibliographic services into conversational assistants that maintain scholarly integrity. The architecture is adaptable to other bibliographic databases and academic data sources.
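The export principle described above can be illustrated with a minimal Python sketch. It is not the MCP-DBLP implementation: the function names `best_match`, `fetch_bibtex`, and `export_entries`, the similarity cutoff, the default timeout, and the `https://dblp.org/rec/{key}.bib` URL pattern are all assumptions made for illustration. The idea it shows is that the language model only selects record keys, while the BibTeX text itself is fetched from the source and never passes through the model.

```python
import difflib
import urllib.request
from typing import Callable, Dict, List, Optional, Tuple

# Assumed URL pattern for per-record BibTeX export; adjust for other sources.
DBLP_BIB_URL = "https://dblp.org/rec/{key}.bib"


def best_match(fragment: str, candidates: Dict[str, str],
               cutoff: float = 0.6) -> Optional[str]:
    """Fuzzy-match an informal citation fragment against candidate titles.

    `candidates` maps record keys to titles. Returns the key of the most
    similar title, or None if no title reaches the (hypothetical) cutoff.
    """
    scored = [
        (difflib.SequenceMatcher(None, fragment.lower(), title.lower()).ratio(), key)
        for key, title in candidates.items()
    ]
    if not scored:
        return None
    score, key = max(scored)
    return key if score >= cutoff else None


def fetch_bibtex(key: str, timeout: float = 10.0) -> str:
    """Fetch one BibTeX entry directly from the source, with a timeout."""
    url = DBLP_BIB_URL.format(key=key)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def export_entries(keys: List[str],
                   fetch: Callable[[str], str] = fetch_bibtex
                   ) -> Tuple[List[str], List[str]]:
    """Export entries by key without routing their text through the model.

    Returns (entries, failed_keys). Network errors and timeouts are
    subclasses of OSError, so a slow or unreachable source marks the key
    as failed instead of blocking the whole export.
    """
    entries: List[str] = []
    failed: List[str] = []
    for key in keys:
        try:
            entries.append(fetch(key))
        except OSError:
            failed.append(key)
    return entries, failed
```

In this sketch the `fetch` parameter is injectable, so the same export loop could be pointed at a different bibliographic source or exercised offline with a stub.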