The Role of Model Context Protocol (MCP) in Generative AI Security and Red Teaming

Overview

Model Context Protocol (MCP) is an open, JSON-RPC-based standard in which servers expose three primitives (tools, resources, and prompts) over defined transports (primarily stdio for local and Streamable HTTP for remote). The value of MCP for security work is that it makes agent/tool interactions explicit and auditable, and it carries normative authorization requirements that teams can verify in code and tests. In practice, this enables tight blast-radius control for tool use, repeatable red-team scenarios at clear trust boundaries, and measurable policy enforcement, provided that organizations treat MCP servers as privileged connectors subject to supply-chain review.

What MCP standardizes

An MCP server publishes: (1) tools (schema-typed actions callable by the model), (2) resources (readable data objects the client can fetch and inject into context), and (3) prompts (reusable, parameterized message templates, typically user-initiated). Distinguishing these surfaces clarifies who is “in control” at each edge: tools are model-driven, resources are application-driven, and prompts are user-driven. These roles matter in threat modeling: for example, prompt injection typically targets model-controlled paths, while unsafe output handling typically occurs at application-controlled joins.
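To make the three surfaces concrete, here is a minimal Python sketch of the JSON-RPC 2.0 requests a client might send for each primitive. The method names (`tools/call`, `resources/read`, `prompts/get`) follow the MCP specification; the tool name, resource URI, prompt name, and all arguments are hypothetical placeholders chosen for illustration.

```python
import itertools
import json

_ids = itertools.count(1)  # JSON-RPC request ids, unique per connection


def jsonrpc_request(method: str, params: dict) -> dict:
    """Build a JSON-RPC 2.0 request envelope of the kind MCP uses."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


# Model-driven: invoke a tool by name (hypothetical tool and arguments).
call_tool = jsonrpc_request(
    "tools/call",
    {"name": "send_email", "arguments": {"to": "ops@example.com", "subject": "hi"}},
)

# Application-driven: read a resource by URI (hypothetical URI).
read_resource = jsonrpc_request("resources/read", {"uri": "file:///srv/docs/runbook.md"})

# User-driven: fetch a prompt template (hypothetical name and arguments).
get_prompt = jsonrpc_request(
    "prompts/get", {"name": "summarize_incident", "arguments": {"ticket": "INC-1"}}
)

for message in (call_tool, read_resource, get_prompt):
    print(json.dumps(message))
```

Because every interaction has this same envelope, a single interception point can tag each message by method and attribute it to the model, the application, or the user.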

Transports. The specification defines two standard transports, stdio (standard input/output) and Streamable HTTP, and leaves room for pluggable alternatives. Local stdio reduces network exposure; Streamable HTTP suits multi-client or web deployments and supports resumable streams. Treat transport selection as a security control: restrict network egress for local servers and apply standard web authN/authZ to remote servers.

Client/server lifecycle and discovery. MCP formalizes how clients discover server capabilities (tools/resources/prompts), negotiate sessions, and exchange messages. This uniformity is what lets security teams instrument call flows, capture structured logs, and assert pre/post-conditions without a custom adapter for each integration.

Normative authorization controls

The authorization approach is unusually prescriptive for an integration protocol and should be enforced as follows:

  • No token passthrough. “The MCP server MUST NOT pass through the token it received from the MCP client.” Servers are OAuth 2.1 resource servers; clients obtain tokens from an authorization server using RFC 8707 resource indicators, so each token is audience-bound to the intended server. This prevents confused-deputy paths and preserves upstream audit/limit controls.
  • Audience binding and validation. Servers MUST validate that an access token's audience matches themselves (resource binding) before serving a request. Operationally, this stops a client token minted for “Service A” from being replayed against “Service B.” Red teams should include explicit probes for this failure mode.
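The audience check above can be sketched as follows. This is a minimal illustration, not a full resource-server implementation: it assumes the token's signature and issuer were already verified by a JWT library, and the function name, exception class, and example URLs are hypothetical.

```python
import time
from typing import Optional


class TokenRejected(Exception):
    """Raised when a token must not be accepted by this server."""


def validate_audience(claims: dict, expected_resource: str,
                      now: Optional[float] = None) -> None:
    """Reject tokens not bound to this server, or already expired.

    Assumption: `claims` comes from a token whose signature and issuer
    were verified elsewhere; that step is out of scope for this sketch.
    """
    now = time.time() if now is None else now
    aud = claims.get("aud")
    # The `aud` claim may be a single string or a list of strings.
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_resource not in audiences:
        raise TokenRejected(f"audience mismatch: {audiences!r}")
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        raise TokenRejected("token expired or missing exp")


SERVICE_A = "https://mcp-a.example.com"  # hypothetical canonical server URI

# Accepted: the token was minted for this server.
validate_audience({"aud": SERVICE_A, "exp": 2000}, SERVICE_A, now=1000)

# Rejected: a token minted for Service B is replayed against Service A.
try:
    validate_audience({"aud": "https://mcp-b.example.com", "exp": 2000},
                      SERVICE_A, now=1000)
except TokenRejected as err:
    print("rejected:", err)
```

A red-team probe for this failure mode is the second call: if a server under test serves the request instead of rejecting it, the audience binding is not being enforced.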

This is the core of MCP's security structure: model-side capabilities are powerful, but the protocol insists that servers be first-class principals with their own credentials, scopes, and logs, rather than opaque pass-throughs for a user's global token.

How MCP supports security engineering in practice

Clear trust boundaries. The client↔server edge is an explicit, inspectable boundary. You can attach consent UIs, scope prompts, and structured logging at that edge. Many client implementations present permission prompts that enumerate a server's tools/resources before enabling them, which is useful at least for challenge and audit even where the specification does not prescribe the UX.

Containment and least privilege. Because a server is a separate principal, you can enforce minimal upstream scopes. For example, a secrets-broker server can mint short-lived credentials and expose only constrained tools (e.g., “fetch secret by policy label”) rather than handing broad vault tokens to the model. Public MCP servers from security vendors illustrate this model.

Deterministic attack surfaces for red teaming. With typed tool schemas and replayable transports, red teams can build fixtures that simulate adversarial inputs at tool boundaries and verify post-conditions across models/clients. This yields reproducible tests for classes of failure such as prompt injection, insecure output handling, and supply-chain abuse. Pair these tests with a recognized taxonomy.

Case Study: The First Malicious MCP Server

In late September 2025, researchers disclosed a trojanized postmark-mcp npm package that impersonated a Postmark email MCP server. Beginning with v1.0.16, the malicious build silently BCC'd every email sent through it to an attacker-controlled address/domain. The package was subsequently removed, but guidance urged uninstalling the affected version and rotating credentials. This appears to be the first publicly documented malicious MCP server in the wild, and it underscores that MCP servers usually run with high trust and should be vetted and version-pinned like any privileged connector.

Operational takeaways:

  • Maintain an allowlist of approved servers and pin versions/hashes.
  • Require code provenance (signed releases, SBOMs) for production servers.
  • Monitor for anomalous egress patterns consistent with BCC exfiltration.
  • Practice credential rotation and “bulk disconnect” drills for MCP integrations.
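As a rough sketch of the first takeaway, the check below denies any server that is not allowlisted at an exact version or whose artifact bytes do not match the pinned SHA-256. The package name, version, and artifact bytes are hypothetical; a real deployment would pin the hash of the published tarball or container image and pair this with signature verification.

```python
import hashlib
import hmac

# Hypothetical allowlist: (package name, version) -> pinned SHA-256 hex digest.
ALLOWLIST = {
    ("example-mcp-server", "1.2.3"): hashlib.sha256(b"trusted artifact bytes").hexdigest(),
}


def is_approved(name: str, version: str, artifact: bytes) -> bool:
    """Return True only if the server is allowlisted and its bytes match the pin."""
    pinned = ALLOWLIST.get((name, version))
    if pinned is None:
        return False  # unknown servers and unpinned versions are denied by default
    actual = hashlib.sha256(artifact).hexdigest()
    return hmac.compare_digest(pinned, actual)


print(is_approved("example-mcp-server", "1.2.3", b"trusted artifact bytes"))  # pinned build
print(is_approved("example-mcp-server", "1.2.3", b"tampered artifact bytes"))  # hash mismatch
print(is_approved("example-mcp-server", "1.2.4", b"trusted artifact bytes"))  # unpinned version
```

Pinning by exact version matters here: in the incident described above, the malicious behavior arrived in a later release of a package that had previously looked benign, so a name-only allowlist would not have helped.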

These are not theoretical controls; the incident's impact flowed directly from over-trusted server code in a routine developer workflow.

Using MCP to structure red-team exercises

1) Prompt-injection and unsafe-output drills at the tool boundary. Build adversarial corpora that enter via resources (application-controlled context) and attempt to coerce calls to dangerous tools. Assert that the client sanitizes injected output and that server post-conditions (e.g., allowed hostnames, file paths) hold. Map findings to LLM01 (Prompt Injection) and LLM02 (Insecure Output Handling).
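The server-side post-conditions in drill 1 can be expressed as small, replayable checks. The sketch below is one possible shape, assuming a hypothetical egress allowlist and sandbox root; the coerced tool arguments stand in for what a poisoned resource might induce a model to request.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

ALLOWED_HOSTS = {"api.example.com"}         # hypothetical egress allowlist
ALLOWED_ROOT = PurePosixPath("/srv/agent")  # hypothetical sandbox root


def host_allowed(url: str) -> bool:
    """Post-condition: a tool may only reach allowlisted hosts over HTTPS."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS


def path_allowed(path: str) -> bool:
    """Post-condition: a tool may only touch absolute paths under the sandbox root."""
    candidate = PurePosixPath(path)
    if not candidate.is_absolute() or ".." in candidate.parts:
        return False
    return candidate == ALLOWED_ROOT or ALLOWED_ROOT in candidate.parents


# Arguments a poisoned resource might coerce the model into requesting.
coerced_url = "https://api.example.com@evil.test/exfil"  # userinfo trick
coerced_path = "/srv/agent/../../etc/passwd"             # traversal attempt

assert not host_allowed(coerced_url)
assert not path_allowed(coerced_path)
assert host_allowed("https://api.example.com/v1/items")
assert path_allowed("/srv/agent/notes/today.txt")
```

Running the same fixtures against every model and client combination gives a pass/fail signal that does not depend on how the injection was phrased, only on whether the boundary held.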

2) Confused-deputy probes for token misuse. Craft tasks that try to induce a server to use a client-issued token upstream or to call an unintended upstream audience. A compliant server must reject foreign-audience tokens per the authorization specification; clients must request audience-correct tokens using the RFC 8707 resource parameter. Treat any success here as a P1.
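On the client side of drill 2, the audience is requested through the `resource` parameter defined by RFC 8707. The sketch below builds a form-encoded authorization-code token request; the parameter names are standard OAuth, while the function name and all values are hypothetical. A probe can then vary `resource` and confirm that the resulting token is refused by every server other than the one named.

```python
from urllib.parse import parse_qs, urlencode


def build_token_request(code: str, code_verifier: str, client_id: str,
                        redirect_uri: str, resource: str) -> str:
    """Form-encode a token request bound to a single MCP server."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "code_verifier": code_verifier,  # PKCE verifier
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "resource": resource,            # RFC 8707 resource indicator
    })


# All values below are placeholders for illustration.
body = build_token_request(
    code="example-auth-code",
    code_verifier="example-pkce-verifier",
    client_id="example-client",
    redirect_uri="http://localhost:8765/callback",
    resource="https://mcp-a.example.com",
)
print(parse_qs(body)["resource"])
```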

3) Session/stream resilience. For remote transports, exercise reconnect/resume flows and multi-client concurrency to probe for session fixation/hijacking risks. Verify non-deterministic session IDs and rapid expiry/rotation in load-balanced deployments. (Streamable HTTP supports resumable connections; use that to stress your session model.)
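The properties drill 3 looks for (unguessable IDs, expiry, rotation) can be illustrated with a small in-memory store. This is a sketch only: the class name and TTL are hypothetical, and a load-balanced deployment would need a shared store rather than a per-process dictionary.

```python
import secrets
import time
from typing import Dict, Optional

SESSION_TTL_SECONDS = 900  # hypothetical 15-minute lifetime


class SessionStore:
    """In-memory session table with random IDs, expiry, and rotation."""

    def __init__(self) -> None:
        self._expiry: Dict[str, float] = {}

    def create(self, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        session_id = secrets.token_urlsafe(32)  # CSPRNG-backed, not guessable
        self._expiry[session_id] = now + SESSION_TTL_SECONDS
        return session_id

    def is_valid(self, session_id: str, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        expires_at = self._expiry.get(session_id)
        if expires_at is None or expires_at <= now:
            self._expiry.pop(session_id, None)  # drop expired entries eagerly
            return False
        return True

    def rotate(self, session_id: str, now: Optional[float] = None) -> str:
        """Invalidate the old ID and issue a fresh one (e.g., after re-auth)."""
        if not self.is_valid(session_id, now):
            raise KeyError("unknown or expired session")
        del self._expiry[session_id]
        return self.create(now)


store = SessionStore()
old_id = store.create(now=0)
new_id = store.rotate(old_id, now=10)
print(store.is_valid(old_id, now=11), store.is_valid(new_id, now=11))
```

A fixation probe against a real server follows the same outline: present a client-chosen or previously rotated ID and confirm that the server refuses it.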

4) Supply-chain kill-chain drills. In a lab, introduce a trojanized server carrying benign markers and verify that your allowlist, signature checks, and egress detection catch it, mirroring the TTPs of the postmark-mcp incident. Measure time to detection and MTTR for credential rotation.

5) Baseline with trusted public servers. Use vetted servers to build deterministic tasks. Two practical examples: Google's Data Commons MCP exposes public datasets under a stable schema (well suited to fact-based tasks/replays), and Delinea's MCP demonstrates least-privilege secrets brokering for agent workflows. These are ideal substrates for repeatable jailbreak and policy-enforcement tests.

An implementation-focused security hardening checklist

Client side

  • Show the exact command or configuration used to start a local server; gate startup behind explicit user consent and enumerate the tools/resources being enabled. Persist approvals at scope granularity. (This is common practice in clients such as Claude Desktop.)
  • Maintain an allowlist of servers with pinned versions and checksums; deny unknown servers by default.
  • Log every tool call (name, argument metadata, principal, decision) and every resource fetch with identifiers so you can reconstruct attack paths after the fact.
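A minimal sketch of the logging item: one JSON line per tool-call decision, carrying the name, principal, decision, and argument metadata. The field names are illustrative rather than taken from any standard, and recording only argument names and types (not values) is a design assumption made here to keep secrets out of the log.

```python
import json
import time
from typing import Optional


def audit_record(principal: str, tool: str, arguments: dict, decision: str,
                 now: Optional[float] = None) -> str:
    """Serialize one tool-call decision as a JSON log line.

    Only argument names and value types are recorded, never raw values.
    """
    record = {
        "ts": time.time() if now is None else now,
        "principal": principal,
        "tool": tool,
        "arg_meta": {key: type(value).__name__ for key, value in arguments.items()},
        "decision": decision,
    }
    return json.dumps(record, sort_keys=True)


# Hypothetical tool call being recorded.
line = audit_record(
    principal="alice",
    tool="send_email",
    arguments={"to": "ops@example.com", "body": "quarterly numbers attached"},
    decision="allow",
    now=0,
)
print(line)
```

If an investigation needs argument values, a common compromise is to log a keyed hash of each value so that calls can be correlated without exposing their contents.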

Server side

  • Implement OAuth 2.1 resource-server behavior; validate tokens and audiences; never forward client-issued tokens upstream.
  • Minimize scopes; prefer short-lived credentials and capabilities that encode policy (e.g., “fetch secret by label” rather than free-form reads).
  • For local deployments, prefer stdio inside a container/sandbox with restricted filesystem/network capabilities; for remote deployments, use Streamable HTTP with TLS, rate limiting, and structured audit logs.

Detection and response

  • Alert on anomalous server egress (unexpected destinations, email BCC patterns) and on sudden capability changes between versions.
  • Prepare break-glass automation that revokes client approvals and rotates upstream secrets quickly when a server is flagged (your “disconnect & rotate” runbook). The postmark-mcp incident showed why time matters.
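One way to approximate the BCC alert is to inspect outbound messages before they are handed to the mail provider and flag any recipient outside the expected domains. The sketch below assumes the check sits at a point where the Bcc header is still present in the message; at the SMTP level the equivalent data would be the envelope recipients. The domain list and sample message are hypothetical.

```python
from email import message_from_string
from email.utils import getaddresses
from typing import List

EXPECTED_DOMAINS = {"example.com"}  # hypothetical set of legitimate recipient domains


def unexpected_recipients(raw_message: str) -> List[str]:
    """Return To/Cc/Bcc addresses whose domain is not on the expected list."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", []) + msg.get_all("Bcc", [])
    flagged = []
    for _display_name, address in getaddresses(headers):
        domain = address.rpartition("@")[2].lower()
        if domain not in EXPECTED_DOMAINS:
            flagged.append(address)
    return flagged


# Sample outbound message with a silently added Bcc (illustrative data).
suspicious = (
    "From: agent@example.com\n"
    "To: customer@example.com\n"
    "Bcc: drop@attacker.test\n"
    "Subject: Invoice\n"
    "\n"
    "Body text\n"
)
print(unexpected_recipients(suspicious))
```

A non-empty result is a candidate trigger for the break-glass path: revoke the server's approval first, then rotate the credentials it could reach.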

Governance alignment

MCP's separation of concerns (clients as orchestrators, servers as scoped principals with typed capabilities) aligns with NIST's AI RMF guidance on access control, logging, and red-team evaluation of generative systems, and with the OWASP LLM Top 10's emphasis on mitigating prompt injection, insecure output handling, and supply-chain vulnerabilities. Use these frameworks to justify controls in security reviews and to anchor acceptance criteria for MCP integrations.

Current adoption you can test against

  • Anthropic/Claude: Product documentation and ecosystem material position MCP as the way Claude connects to external tools and data; many community tutorials closely follow the specification's three-primitive model. This provides ready-made client surfaces for permissioning and logging.
  • Google's Data Commons MCP: Released September 24, 2025, it standardizes access to public datasets; its announcement and follow-up posts include production usage notes (e.g., a data agent). Useful as a stable “source of truth” in red-team tasks.
  • Delinea MCP: An open-source server that integrates with Secret Server and the Delinea Platform, emphasizing policy-mediated secret access and OAuth alignment with the MCP authorization specification. A practical example of least-privilege tool exposure.

Summary

MCP is not a silver-bullet “security product.” It is a protocol that gives security and red-team practitioners stable, enforceable levers: audience-bound tokens, explicit client↔server boundaries, typed tool schemas, and transports you can instrument. Use these levers to (1) constrain what agents can do, (2) observe what they actually did, and (3) replay adversarial scenarios reliably. Treat MCP servers as privileged connectors (vet, pin, and monitor them) because adversaries already do. With these practices in place, MCP becomes a practical foundation for secure agentic systems and a reliable substrate for red-team evaluation.


Michal Sutter is a data science professional with a master’s degree in data science from the University of Padua. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels in transforming complex data sets into actionable insights.
