A Taxonomy of Runtime Faults in Model Context Protocol Servers

The Model Context Protocol (MCP) enables Large Language Models (LLMs) to interact with external tools and data sources through a standardized protocol. Its rapid adoption in tool-augmented Artificial Intelligence (AI) workflows has introduced new reliability challenges, such as configuration parameters that are accepted but not enforced at runtime, which leads to unintended default behavior. The runtime fault characteristics of MCP servers nevertheless remain empirically unexamined. We present the first empirical taxonomy of runtime faults in MCP servers. We manually analyzed 837 MCP-specific runtime fault threads from 473 actively maintained MCP server GitHub repositories and derived a taxonomy using a bottom-up open coding procedure. The taxonomy comprises 11 top-level categories and 27 subcategories (73 leaf fault types), covering recurrent failures across protocol interactions, tool invocations, schema enforcement, state management, model-provider integration, security validation, and timeouts or explicit cancellations of in-progress operations. To assess the taxonomy's external validity, we surveyed 55 MCP server developers. Respondents reported experiencing, on average, 20 of the 27 fault subcategories, and no category remained unobserved. These results indicate that the taxonomy reflects widely observed runtime failures in MCP-based systems and can support future AI software maintenance and evolution.
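As an illustration of the "accepted but not enforced" configuration fault mentioned above, the following is a minimal sketch rather than code from the paper or from any MCP SDK. The names `ServerConfig`, `timeout_s`, `DEFAULT_TIMEOUT_S`, and both helper functions are hypothetical and exist only for this example.

```python
from dataclasses import dataclass

# Hypothetical built-in default; not taken from any real MCP server.
DEFAULT_TIMEOUT_S = 30.0


@dataclass
class ServerConfig:
    # Hypothetical option that a server accepts from its config file.
    timeout_s: float = DEFAULT_TIMEOUT_S


def effective_timeout_faulty(config: ServerConfig) -> float:
    """Faulty variant: the option is parsed and stored but never consulted,
    so the server silently falls back to its default."""
    return DEFAULT_TIMEOUT_S


def effective_timeout_fixed(config: ServerConfig) -> float:
    """Fixed variant: validate the configured value and actually apply it."""
    if config.timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    return config.timeout_s


config = ServerConfig(timeout_s=5.0)
print(effective_timeout_faulty(config))  # 30.0: the user's setting is ignored
print(effective_timeout_fixed(config))   # 5.0: the user's setting is enforced
```

The faulty variant raises no error, which is what makes this kind of fault hard to notice: the configuration loads cleanly and the server simply behaves as if the option had never been set.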

