How Code Mode Builds on MCP for Agent Tooling

In the last few months, multiple vendors across the agentic ecosystem have independently embraced a similar pattern referred to as code mode. Instead of thinking of Model Context Protocol (MCP) solely as a protocol for issuing JSON-RPC tool calls, the code mode pattern treats MCP schemas as a foundation for generating typed client libraries that a large language model (LLM) can use inside a controlled code-execution environment.

Code mode is an abstraction layer on top of MCP that treats tool definitions as a specification from which code-level APIs are generated. Its advocates say the approach improves accuracy when working with many MCP servers and reduces runaway token use. In this post, we take a brief look at code mode and consider some of the implications it has for the future of MCP.

What Is Code Mode?

While MCP offers strong architectural advantages, implementers have run into practical limitations when relying purely on tool calls: token overhead in multi-step workflows, formatting errors, and tool lists that become unwieldy at scale. That’s where code mode comes into play.

The Idea Behind Code Mode

Code mode is an integration pattern, not a protocol: it leverages MCP’s protocol and metadata but changes how tools are invoked. Instead of exposing a long list of discrete tools for the LLM to pick from and call, code mode converts MCP tool definitions into a TypeScript (or other programming-language) API, then asks the LLM to write code that calls that API.

Code mode treats tools as programming-language abstractions rather than special LLM tool calls. This plays to the fact that modern LLMs are trained on huge amounts of real-world code but comparatively few synthetic tool-call examples.
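To make the conversion concrete, here is a minimal sketch of the generation step. The tool definition, the `toTsType` helper, and the emitted interfaces are all illustrative, not any vendor’s actual schema or generator; real generators handle nested objects, arrays, and descriptions as well.

```typescript
// A hypothetical MCP tool definition, roughly as a server might
// advertise it via tools/list (names and fields are illustrative).
const toolDefinition = {
  name: "get_weather",
  description: "Look up current weather for a city",
  inputSchema: {
    type: "object",
    properties: {
      city: { type: "string" },
      units: { type: "string", enum: ["metric", "imperial"] },
    },
    required: ["city"],
  },
};

// One small piece of what a code-mode generator does: map a JSON
// Schema property to a TypeScript type annotation.
function toTsType(prop: { type: string; enum?: string[] }): string {
  if (prop.enum) return prop.enum.map((v) => `"${v}"`).join(" | ");
  const primitives: Record<string, string> = {
    string: "string",
    number: "number",
    boolean: "boolean",
  };
  return primitives[prop.type] ?? "unknown";
}

// The generated surface the LLM actually programs against:
interface WeatherArgs {
  city: string;
  units?: "metric" | "imperial"; // derived from the enum above
}

interface WeatherClient {
  getWeather(args: WeatherArgs): Promise<{ temperature: number; summary: string }>;
}
```

The model then writes ordinary TypeScript against `WeatherClient` instead of emitting a raw JSON-RPC tool call.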

How Code Mode Works (Overview)

Instead of exposing dozens of discrete MCP tools directly to an LLM, code mode generates a fully typed client library from the MCP server’s schema and then allows the model to interact with that API through a single “write-and-run code” entry point.

That code is executed inside a tightly controlled environment like a sandbox. Because the LLM writes code against a pre-authorized client, credentials never have to be exposed to the model itself.

This code-as-interface pattern does not replace MCP — it builds on its schema-first foundation. The model gains a more natural programming environment, while the agent gains tighter control over what is allowed to run, making real-world agent behavior more predictable, expressive, and robust.
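The single write-and-run entry point can be sketched as a host harness that injects a pre-authorized client into the sandbox. Everything below (the names, the in-memory file store, the client shape) is an illustrative assumption, not any vendor’s actual API:

```typescript
// Minimal sketch of the write-and-run pattern. The host constructs the
// client with the credential; the model-written code receives only the
// client's methods and never sees the key itself.
type FileStore = Record<string, string>;

function makeClient(apiKey: string, store: FileStore) {
  // apiKey is captured in this closure. A real client would use it to
  // authenticate calls to an MCP server; generated code cannot read it.
  void apiKey;
  return {
    readFile: async (path: string): Promise<string | null> => store[path] ?? null,
  };
}

type Client = ReturnType<typeof makeClient>;

async function runGeneratedCode<T>(
  generated: (client: Client) => Promise<T>,
  client: Client
): Promise<T> {
  // A production sandbox (V8 isolate, microVM, container) would also
  // enforce time and memory limits and block ambient network access.
  return generated(client);
}

// What the LLM might emit for "read the config file":
const example = async (c: Client) => c.readFile("config.json");
```

The key property is that the credential lives on the host side of the boundary, while the generated code sees only the client’s methods.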

Comparing MCP vs. Code Mode: Strengths, Trade-offs, and Use Cases

Both native MCP tool calls and code mode aim to solve the same problem, connecting agents to tools, but they approach it differently. The differences manifest most clearly when we consider concrete MCP use cases and trade-offs.

When MCP Is All You Need

  • Simplicity and uniform tool exposure: For agents that need a handful of simple tools, MCP’s native tool-call interface may suffice. Developers do not need to generate APIs or manage a sandbox. The agent just needs to call a tool, receive JSON, and continue.
  • Language-agnostic adoption: Because MCP is a protocol and not tied to a particular language, it supports clients and servers written in many languages.
  • Rapid prototyping and integration: For lightweight agents or quick MVPs, using existing MCP servers lets developers integrate tools quickly, without building custom language APIs or sandbox environments.
  • Interoperability and ecosystem leverage: With MCP’s growing ecosystem of servers, clients, and integrations, developers can mix and match components with minimal customization. This matters particularly in heterogeneous or cross-platform deployments.

When Code Mode Becomes Useful

  • Complex workflows and tool-chaining: When tasks require multiple sequential tool calls, code mode excels. By letting the LLM write code that glues together multiple tool calls internally, you reduce token overhead, avoid repeated handoffs, and simplify data flow.
  • Better reliability and naturalness: Real-world LLM training data tends to include much more real code than synthetic tool-call JSON examples. By leveraging that, code mode can produce more fluent, semantically correct calls, with fewer formatting errors, fewer failed tool calls, and better handling of edge cases.
  • Security and sandboxing: Because code mode runs generated code in a sandbox (for example, V8 isolates or containers) with only controlled bindings to MCP servers, it allows fine-grained access control. Credentials stay hidden from the LLM, and external access is limited to authorized MCP bindings.
  • Scalable for large toolsets: If you want to expose many tools, it becomes unwieldy to present all as distinct tool calls. Code mode’s API approach scales better: one clean namespace, typed interfaces, fewer special tokens, and behavior that looks more like traditional programming.
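As an illustration of the tool-chaining point above, this is the kind of glue code the model might write against a hypothetical generated client. The intermediate ticket list never re-enters the model’s context:

```typescript
// Hypothetical generated client for a ticketing MCP server.
interface Ticket {
  id: string;
  status: "open" | "resolved";
}

interface TicketClient {
  listTickets(): Promise<Ticket[]>;
  closeTicket(id: string): Promise<void>;
}

// With native tool calls, each step below is a separate model round
// trip, with the full ticket list flowing through the context window.
// In code mode the loop runs inside the sandbox and only the final
// count returns to the model.
async function closeResolved(client: TicketClient): Promise<number> {
  const tickets = await client.listTickets();
  const resolved = tickets.filter((t) => t.status === "resolved");
  for (const t of resolved) {
    await client.closeTicket(t.id);
  }
  return resolved.length;
}
```

One code block replaces what would otherwise be 2 + N tool-call handoffs for N resolved tickets.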

Trade-offs and Risks

Neither approach is a silver bullet. There are trade-offs and risks to weigh.

  • MCP security issues: Because MCP exposes tools as actions the LLM can call, misconfigured or malicious MCP servers can pose security risks, such as exposing sensitive data or enabling unintended actions. A recent study showed that when MCP is used in large agentic workflows, there are opportunities for credential leaks, prompt injection, or malicious tool calls.
  • Complexity of the sandbox environment: Code mode requires a sandbox plus safe bindings. That adds overhead, including building or using a secure sandbox runtime, managing bindings, ensuring the sandbox cannot leak sensitive data, and controlling resource usage. Not all deployment environments may easily support this.
  • Debugging complexity: With code mode, the LLM generates code, which might be buggy, inefficient, or nonsensical. Developers may need to debug LLM-generated code, handle runtime exceptions, and ensure that the sandboxed code interacts correctly with MCP-backed services.
  • Loss of protocol-only simplicity: Part of MCP’s appeal is its protocol-agnostic, language-agnostic minimalism. Code mode introduces a dependency on a particular language and execution environment, which may reduce cross-language flexibility or add constraints in polyglot systems.
  • Trust and compliance concerns: Even with sandboxing, giving an LLM the power to generate and execute code means you must trust the sandboxing environment. For high-security or compliance-heavy contexts, this may require stringent auditing, resource limits, or manual reviews. The burden is lighter on platforms such as Anthropic’s and Cloudflare’s, which ship code mode with managed sandboxes.
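One concrete mitigation for the sandbox and trust concerns above is to hand the sandbox only an explicit allowlist of bindings. A minimal sketch, with hypothetical tool names:

```typescript
// Bindings are just named functions the sandbox can call.
type Bindings = Record<string, (...args: unknown[]) => unknown>;

// Expose only the bindings an agent was authorized to use; anything
// else simply does not exist inside the sandbox.
function restrictBindings(all: Bindings, allowed: string[]): Bindings {
  const exposed: Bindings = {};
  for (const key of allowed) {
    if (key in all) exposed[key] = all[key];
  }
  return exposed;
}

// Hypothetical full binding set vs. what this agent may touch:
const allBindings: Bindings = {
  readFile: () => "file contents",
  deleteRepo: () => "irreversible!",
};
const sandboxBindings = restrictBindings(allBindings, ["readFile"]);
```

A compromised or confused model cannot call what was never injected, which shrinks the audit surface to the allowlist itself.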

Use Case Scenarios: Which to Use When

When you are building a simple chatbot that only needs to read from a configuration file and return its contents, native MCP tool calls are generally the best fit. This kind of task is low risk, easy to implement, and does not require the overhead of generating or executing sandboxed code.

An enterprise assistant that queries databases, interacts with several cloud services, generates reports, and enforces audit logging or permission controls can still use native MCP effectively, especially when it is combined with strong authentication and logging layers.

If the system involves long, multi-step sequences or complex procedural logic, pairing MCP with a code-generation-based workflow in a sandbox can be much more flexible.

To build highly capable developer assistants that can edit code, work with repositories, run analysis tools, and chain actions like fetching a file, analyzing it, and committing changes, code mode tends to be the stronger choice. The model can generate structured code coordinating many tools, while the sandbox ensures safety and predictability.

For research work or rapid prototyping, native MCP is usually the better choice because it’s lightweight, fast to wire up, and avoids unnecessary complexity when trying out different tools.

Finally, for scenarios that require multiple agents to coordinate, call different tools, share context, and manage involved workflows, a hybrid architecture often works best. Code mode excels at handling heavier logical orchestration within each agent, while MCP, either native or sandboxed, provides standardized, protocol-level access to tools across the entire system.

What Code Mode Changes — and What That Means for the Future

The rise of code mode across the agentic ecosystem signals a shift in how developers think about MCP. Rather than treating MCP purely as a protocol for issuing discrete tool calls, code mode reframes it as a schema and metadata layer that LLMs can use to generate real code. The model then executes that code inside a secure sandbox, giving developers both the standardization MCP provides and the expressiveness of conventional programming.

This matters because LLMs are trained on code, not on custom tool-call syntaxes. Allowing them to interact with tools through auto-generated APIs written in familiar languages like TypeScript takes advantage of their inherent strengths. It also scales better for complex, multi-step operations. The code can manage branching logic, loops, retries, and orchestration in ways that native tool calls struggle to express cleanly. Sandboxed execution adds safety and determinism while avoiding the overhead of full containerization.
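For instance, a retry loop, trivial in generated code but awkward to express as a sequence of discrete tool calls, might look like this (the flaky binding in the test is hypothetical):

```typescript
// Retry an async tool call up to `attempts` times. In native MCP, each
// retry would be another model round trip; here the control flow lives
// entirely in the generated code.
async function withRetry<T>(fn: () => Promise<T>, attempts = 3): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // A real implementation would await an exponential backoff here.
    }
  }
  throw lastError;
}
```

Branching, loops, and error handling all compose the same way, using language constructs the model has seen billions of times in training data.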

MCP remains essential. Its language-agnostic protocol, interoperability, and consistent schema model make it the foundation on which code mode is built. Native tool calls still excel in simple, low-risk, or prototype scenarios where minimal overhead is desirable. Code mode thrives in workflows requiring rich logic, multiple tools, or procedural control.

Looking ahead, neither approach will dominate alone. The most capable agent architectures will likely blend both. MCP will be used for standardized connectivity and schemas, while code mode will be used for high-level orchestration and complex logic, all secured by sandboxed execution and pre-authorized bindings. This hybrid model points toward the future of agentic AI — interoperable, expressive, and reliably grounded in real-world development patterns.

AI Summary

This article explains how code mode builds on the Model Context Protocol (MCP) to improve how AI agents interact with tools, moving beyond native tool calls toward code-based execution models.

  • Code mode treats MCP schemas as a foundation for generating typed client libraries that AI agents can use inside controlled code-execution environments.
  • Instead of issuing discrete MCP tool calls, agents write and execute code that coordinates multiple tools through a single, sandboxed entry point.
  • This approach aligns better with how large language models are trained, improving reliability, reducing formatting errors, and lowering token overhead in complex workflows.
  • MCP remains the underlying protocol and schema layer, providing language-agnostic interoperability while code mode adds procedural logic, orchestration, and execution control.
  • Native MCP tool calls continue to work well for simple or low-risk tasks, while code mode excels in multi-step, security-sensitive, or large-scale agent workflows.

Intended for API architects, platform engineers, and AI developers evaluating MCP-based integrations and emerging agent tooling patterns.