MCP security: the risks of the Model Context Protocol nobody's talking about yet
In-depth analyses of real-world cyber incidents and emerging threat trends, authored exclusively by our analysts.
If your AI product uses the Model Context Protocol, or MCP, to connect your agents to tools and data sources, there is a set of security risks specific to that protocol that almost nobody outside a small circle of researchers is discussing yet. MCP has become the backbone connecting AI models to external tools, and its adoption has moved far faster than awareness of its security implications. This article covers what those risks actually are, using the real vulnerability classes researchers have identified, and what to do about them if MCP is part of your stack.
What MCP is and why it matters for security
MCP is a standard that lets an AI model discover and call external tools and data sources in a consistent way, rather than every integration being built bespoke. It has rapidly become one of the default ways AI agents get connected to the outside world, from internal company data to third party services.
The security problem is structural, not incidental. An MCP server exposes tool definitions, descriptions, and permissions to the AI model, and the model trusts that information in order to decide what to do. Because the protocol is built for AI to consume that information automatically, at machine speed, with limited human review of every tool and every call, it creates a genuinely new category of trust relationship. You are not just trusting the tools your product calls. You are trusting the metadata that describes those tools, and the model's interpretation of it.
Security researchers have started publishing the first serious classification of these risks, including an OWASP specific top ten for MCP, which is itself a signal of how quickly this has become a recognised concern. It is worth understanding the specific risks now, while comparatively few AI products have properly addressed them.
Tool poisoning, the risk that matters most
This is the vulnerability class that has drawn the most serious research attention, and for good reason. Tool poisoning happens when an attacker plants malicious instructions inside a tool's description or metadata, rather than in its actual function. The model reads that description in full, including any hidden instructions, but the user only sees a clean, simple tool name.
Concretely, this means a tool that appears to a user as something harmless, like "read file,” can contain hidden instructions in its underlying description that manipulate the model into misusing it, for example calling a delete function instead, or prioritising a weaker, attacker controlled tool over a safer one. Because the manipulation lives in metadata the AI processes but a human rarely inspects, it is a genuinely difficult attack to catch without deliberately looking for it. Research testing multiple AI agents against real MCP servers has found that most tested systems were vulnerable to this class of attack in some form.
Rug pulls, the risk that changes after you approve it
This is a specific and unsettling variant worth understanding on its own. A rug pull occurs when a tool you reviewed and approved, and which behaved safely at the time, is later silently updated by whoever controls it to include malicious behaviour, without triggering any re approval or alert on your side.
In practical terms, this means trusting an MCP tool once is not the same as trusting it permanently. If your product connects to third party MCP servers you do not control, the tool that was safe when you integrated it can change under you, and unless you have something watching for that, you may not notice until something goes wrong.
Over-permissioned tools and the weak human oversight requirement
A third recurring theme in the research is that MCP tools are often granted far more privilege than the task actually requires, and the protocol's own specification does not force better behaviour. The MCP specification currently only says human review "should” happen for consequential actions, not that it must, which leaves a meaningful gap that individual implementations either close properly or leave wide open.
Combined with over permissioned tools, this means an agent that only needed to read a piece of data can often also write, delete, or call other tools it never needed access to, and the guardrail that should catch a risky action before it happens is optional rather than required by the standard itself.
Why client side, not just server side
Much of the early attention on MCP security focused on securing the MCP server, the thing exposing the tools. More recent research has shown that the client, the software that connects your AI to those servers, is at least as important, and current implementations vary widely in how carefully they validate what they receive. Testing across several major MCP clients found meaningful gaps in how thoroughly they checked tool metadata before trusting it. If you are building or choosing an MCP client, the validation it performs before trusting a tool's description is a real security decision, not a minor implementation detail.
Confirmed vulnerabilities, not just theory
This is not speculative research. A publicly documented vulnerability has already been assigned an official identifier for a lack of authentication in a widely used MCP proxy component, and security researchers have published real world notifications of tool poisoning attacks found in the wild. The pattern is the same one seen across AI security generally, where a risk identified in research quickly turns up in production before most teams have caught up.
What to actually do if MCP is part of your stack
The defences here follow the same principles as wider AI agent security, applied specifically to how you handle MCP.
- Treat tool descriptions as untrusted input, not documentation. Do not assume a tool's description is safe just because its name looks harmless. Where possible, use or build validation that inspects the full metadata a model would see, not just the summary a human sees.
- Enforce least privilege on every tool, strictly. Do not rely on the protocol's own "should” language for oversight. Decide for yourself which actions genuinely require a human checkpoint, and build that in as a requirement, not an option.
- Pin and monitor the tools you depend on. If a tool you use can be updated by a third party without your involvement, treat any change as something that needs review before it is trusted again, rather than assuming continuity of safety.
- Scrutinise your MCP client's validation, not just your server's security. If you are choosing or building the client side of this relationship, ask specifically how it validates tool metadata before your model ever sees it.
- Keep an inventory of every MCP server and tool your product connects to. You cannot secure what you have not catalogued, and this is an area where AI teams often lose track quickly as integrations multiply.
Why this matters for an enterprise buyer conversation
MCP is new enough that most security questionnaires do not yet ask about it directly, but that is changing quickly, and sophisticated buyers are starting to ask how AI products manage tool and agent connections. Being able to speak clearly to how you validate tools, constrain their privileges, and monitor for changes will increasingly be a genuine differentiator, in the same way prompt injection awareness has become one over the past year. Getting ahead of a question before it becomes standard is exactly the position you want to be in.
The honest takeaway
MCP has become critical infrastructure for AI products remarkably quickly, and its security risks, tool poisoning, rug pulls, over permissioned tools, and weak client side validation, are real, documented, and still largely unaddressed across the industry. Being one of the few AI startups who has actually thought this through is currently a genuine advantage, both because it makes your product safer and because it is a differentiator very few competitors can currently claim.
If MCP is part of how your AI agents work, treat its specific risks as seriously as you treat prompt injection generally, because the researchers studying this closely are telling us it deserves exactly that level of attention.
Using MCP to connect your AI agents to tools and data?
Book a free review and we'll check how your MCP integrations are secured against tool poisoning and related risks.
AI Security Insights
MCP security: the risks of the Model Context Protocol nobody's talking about yet
If your AI product uses the Model Context Protocol, or MCP, to connect your agents to tools and data sources, there is…
Read articleAI security glossary: 30 terms every founder should know before an enterprise review
Enterprise security reviews come packed with terminology that nobody explains before you need it. Founders often encoun…
Read articleWhat is a security.txt file and does your AI startup need one
If you have never heard of a security.txt file, you are not alone, and yet it is one of the smallest, cheapest pieces o…
Read articleSub-processors explained: what they are and why enterprise buyers ask for your list
Somewhere in an enterprise security review, you will almost certainly be asked for your list of sub-processors. If you…
Read articleMore insights, delivered monthly
Get the latest insights on AI security and compliance.

