The Model Context Protocol (MCP) has emerged as a de facto standard for integrating Large Language Models with external tools, yet no formal security analysis of the protocol specification exists. We present the first rigorous security analysis of MCP's architectural design, identifying three fundamental protocol-level vulnerabilities: (1) the absence of capability attestation, which allows servers to claim arbitrary permissions; (2) bidirectional sampling without origin authentication, which enables server-side prompt injection; and (3) implicit trust propagation in multi-server configurations. We implement \textsc{MCPBench}, a novel framework that bridges existing agent security benchmarks to MCP-compliant infrastructure, enabling direct measurement of protocol-specific attack surfaces. Through controlled experiments on 847 attack scenarios across five MCP server implementations, we demonstrate that MCP's architectural choices amplify attack success rates by 23--41\% compared to equivalent non-MCP integrations. We propose \textsc{MCPSec}, a backward-compatible protocol extension that adds capability attestation and message authentication, reducing attack success rates from 52.8\% to 12.4\% with a median latency overhead of 8.3\,ms per message. Our findings establish that MCP's security weaknesses are architectural rather than implementation-specific and therefore require protocol-level remediation.