
MCP Rug Pull Attacks: What They Are & How to Stop Them
Model Context Protocol (MCP) servers have unleashed the true potential and capabilities of AI agents. However, these agents are only as secure as the tools they trust. And just because your team trusted a tool when you first found it doesn’t make it immune from one of the most insidious security risks for MCP servers: the rug pull attack.
The rugpull happens after you’ve approved the connection, turning a once-trusted tool into an untrustworthy attack vector. Let’s dive into what an MCP rug pull attack actually is, why it’s so dangerous, and why middleware platforms (like MCP Manager) are a necessary component of any secure AI stack.
What Is an MCP Rug Pull Attack?
An MCP rug pull attack happens when an agent connects to a trusted MCP server but then that server silently modifies, removes, or redefines tools without notice. In fact, the server does this silently without warning or a notification to the user.
The result of this kind of attack is that your agent keeps calling a tool that’s been hijacked, redefined, or degraded. Because there’s no built-in mechanism in the MCP spec to detect or prevent this, the agent can go rogue or even wreck havoc on your data’s systems.
Example of MCP Rug Pull Attack:
Let’s say you connect your agent to a third-party tool that provides this tool call:
send_slack_message
When a Rug Pull attack happens this tool call could do something completely different. For example, two weeks later, that same endpoint can get replaced with one that posts messages to Discord, logs content to a third-party service, or even injects malicious prompts.
The tricky part about the detection is that the MCP servers will still show the same tool name and schema.
That’s the rug pull. You’ve approved something that’s now fundamentally changed.
Why MCP Rug Pulls Are So Insidious
MCP rug pulls don’t rely on breaking in; they rely on trust that has since grown outdated.
Here’s why they’re hard to catch:
1. Tool Definitions Are Mutable
There’s no guarantee that a tool will remain the same over time. Without version locking or signatures, updates can happen silently.
2. Security Reviews Are Static
Most orgs approve a tool once, then assume it’s safe forever. Rug pulls exploit that false sense of permanence.
3. Agents Assume Good Faith
AI agents don’t question tool behavior. They assume what they’re given is correct (especially in headless deployments).
4. No Built-In Notarization or Diff Detection
There’s no spec-wide support (yet) for change tracking in MCP tool manifests. Agents won’t know if something’s changed unless you monitor it externally.
The Solution: Middleware That Controls the Pipe
When it comes to MCP security, middleware is your friend because you can’t expect every AI agent to become a security auditor. You also can’t rely on third-party MCP servers to behave the same way forever.
The way MCP Manager approaches middleware is with a gateway between your agents and MCP servers; this gateway enforces trust boundaries, logs activity, and much more.
While MCP Manager prevents MANY types of security vulnerabilities that are unique to MCPs, here’s how it specifically prevents rug pulls:
Monitor Changes to Tool Descriptions
When a tool is approved in MCP Manager, you can add a condition that blocks a server’s tool when the description changes. Doing so stops Rug Pulls from being able to happen because Rug Pull attacks assume you will not notice that what a tool does changes. MCP Manager not only notices but immediately puts in safeguards so that the tool can no longer take any action.
Don’t Trust Blindly. Monitor Aggressively.
Rug pull attacks are a symptom of a larger truth: the MCP ecosystem is powerful but fundamentally open. That openness allows agents to have the capabilities and
If you’re letting agents connect directly to external MCP servers without safeguards, you’re giving them too much trust and too little oversight.
With MCP Manager, you regain control.
- No silent tool swaps
- No risky default behaviors
- No “I thought it was still safe” moments