The Model Context Protocol (MCP) (MCP Community, 2025) has emerged as a widely used framework for enabling LLM-based agents to communicate with external tools and services. The original MCP implementation (Anthropic, 2024) relies on a Large Language Model (LLM) to decompose tasks and issue instructions to servers. In that design, the agents, models, and servers are stateless and have no access to a global context. However, in tasks involving LLM-driven coordination, a Shared Context Store (SCS) could naturally improve the efficiency and coherence of multi-agent workflows by reducing redundancy and enabling knowledge transfer between servers. In this work, we therefore design and evaluate a Context-Aware MCP (CA-MCP) that offloads execution logic to specialized MCP servers that read from and write to a shared context memory, allowing them to coordinate more autonomously in real time. In this design, context management serves as the central mechanism that maintains continuity across task executions by tracking intermediate states and shared variables, thereby enabling persistent collaboration among agents without repeated prompting. We present experiments showing that CA-MCP can outperform the traditional MCP by reducing the number of LLM calls required for complex tasks and decreasing the frequency of response failures when task conditions are not satisfied. Specifically, we conducted experiments on the TravelPlanner (Yang et al., 2024) and REALM-Bench (Geng & Chang, 2025) benchmark datasets and observed statistically significant improvements, indicating the potential advantages of incorporating a shared context store via CA-MCP in LLM-driven multi-agent systems.