No injection vectors in initialize.instructions

`initialize_instructions_no_injection` · medium · weight 5 · Post-handshake

Authored by Stanley Hong · AgentReserve (founder).

The `initialize.instructions` string returned by the server contains no markdown link with a `javascript:` or `data:` target, no HTML tag a renderer would execute or fetch (`<script>`, `<img>`, `<iframe>`, `<svg>`, `<object>`, `<embed>`), and no inline URL on a non-HTTPS scheme. The instructions field reaches the model verbatim, so it sits on the same trust boundary as a tool description and is exposed to the same vector classes.

When this rule runs

Requires a successful MCP `initialize` / `tools/list`. Skipped on perimeter-only scans where the server refused or failed the MCP handshake.
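
As a rough illustration of that gate, the sketch below pulls `instructions` out of a JSON-RPC `initialize` response and returns `None` when the handshake did not succeed, which is the case where the rule would be skipped. The function name `extract_instructions`, the choice to treat a missing field as an empty string, and the sample server values are assumptions made for this example, not the scanner's actual code.

```python
import json
from typing import Optional


def extract_instructions(raw_response: Optional[str]) -> Optional[str]:
    """Return the instructions string, or None if the rule should be skipped.

    Hypothetical helper: None means the handshake failed or was refused.
    """
    if raw_response is None:
        return None  # no response at all: perimeter-only scan
    try:
        message = json.loads(raw_response)
    except json.JSONDecodeError:
        return None  # not valid JSON, so no usable handshake
    if not isinstance(message, dict):
        return None
    result = message.get("result")
    if not isinstance(result, dict):
        return None  # JSON-RPC error or malformed result
    # Assumption: a server that omits instructions has nothing to scan.
    instructions = result.get("instructions", "")
    return instructions if isinstance(instructions, str) else ""


# Made-up sample response for illustration.
sample_ok = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2025-03-26",
        "capabilities": {},
        "serverInfo": {"name": "demo", "version": "0.1"},
        "instructions": "Use search first.",
    },
})
sample_error = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "error": {"code": -32600, "message": "Invalid Request"},
})
```

Returning `None` for "skip" and a string (possibly empty) for "scan" keeps the two outcomes distinct, so a skipped rule is not mistaken for a pass.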

Why it matters

MCP servers may return free-form `instructions` on `initialize` to guide the model. That string flows into the agent context with the same trust as a tool description, and the same injection classes apply: markdown links with executable targets (`javascript:` or `data:`), HTML tags a renderer would execute or fetch, and inline non-HTTPS URLs whose content an attacker on the network path can replace. None of those belong in instructions.

Pass condition

`initialize.instructions` contains no dangerous markdown link target, no renderable HTML tag, and no non-HTTPS inline URL.

Fail condition

`initialize.instructions` matches at least one of those patterns.
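
The pass and fail conditions can be approximated with three regular expressions, one per pattern class. This is a minimal sketch under stated assumptions: the regexes, the `scan_instructions` name, the `passed` key, and the `markdown_link` and `non_https_url` kind labels are illustrative guesses. Only the `html_tag` kind and the `hits` / `kind` / `snippet` shape appear in the evidence example on this page.

```python
import json
import re

# Illustrative patterns only; the scanner's real matching logic may differ.
PATTERNS = [
    # Markdown link whose target starts with javascript: or data:
    ("markdown_link",
     re.compile(r"\[[^\]]*\]\(\s*(?:javascript|data):", re.IGNORECASE)),
    # Opening tag of an element a renderer would execute or fetch
    ("html_tag",
     re.compile(r"<\s*(?:script|img|iframe|svg|object|embed)\b", re.IGNORECASE)),
    # Any scheme://... URL whose scheme is not https
    ("non_https_url",
     re.compile(r"\b(?!https://)[a-z][a-z0-9+.-]*://[^\s)>\"']+", re.IGNORECASE)),
]


def scan_instructions(instructions: str, context: int = 40) -> dict:
    """Return a pass flag plus one hit per pattern match (hypothetical shape)."""
    hits = []
    for kind, pattern in PATTERNS:
        for match in pattern.finditer(instructions):
            # Keep some surrounding text so a reviewer can see the match in place.
            start = max(0, match.start() - context)
            end = min(len(instructions), match.end() + context)
            hits.append({"kind": kind, "snippet": instructions[start:end]})
    return {"passed": not hits, "hits": hits}


if __name__ == "__main__":
    report = scan_instructions("before <script>alert(1)</script> after")
    print(json.dumps({"hits": report["hits"]}))
```

The URL pattern only looks at `scheme://` forms, which is a simplification. A stricter check might also flag bare `javascript:` or `data:` strings outside markdown links.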

Evidence examples

When the rule fails, the report records evidence in roughly this shape:

  • {"hits": [{"kind": "html_tag", "snippet": "before <script>alert(1)</script> after"}]}

Remediation

Treat `instructions` as a static plain-text help string. Strip HTML, replace plain-HTTP links with HTTPS, and remove `javascript:` / `data:` markdown links.
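
One way to apply those three steps on the server side is a small sanitizer run over the string before it is returned from `initialize`. The sketch below is an assumption-laden example, not a vetted sanitizer: `sanitize_instructions` is a hypothetical name, dangerous markdown links are reduced to their label text rather than deleted outright, and the `http://` to `https://` rewrite assumes the linked host actually serves HTTPS.

```python
import re

# Markdown link with a javascript:/data: target; allows one level of
# nested parentheses in the target, e.g. javascript:alert(1).
_DANGEROUS_LINK = re.compile(
    r"\[([^\]]*)\]\(\s*(?:javascript|data):(?:[^()]|\([^()]*\))*\)",
    re.IGNORECASE,
)
# Paired elements whose contents should be dropped along with the tags.
_PAIRED_ELEMENT = re.compile(
    r"<\s*(script|iframe|svg|object|embed)\b.*?<\s*/\s*\1\s*>",
    re.IGNORECASE | re.DOTALL,
)
# Any remaining tag that starts with a letter or a slash.
_ANY_TAG = re.compile(r"</?[a-zA-Z][^>]*>")
_PLAIN_HTTP = re.compile(r"\bhttp://", re.IGNORECASE)


def sanitize_instructions(text: str) -> str:
    """Best-effort cleanup of an instructions string (illustrative only)."""
    # 1. Replace dangerous markdown links with their label text.
    text = _DANGEROUS_LINK.sub(r"\1", text)
    # 2. Strip HTML: whole executable elements first, then leftover tags.
    text = _PAIRED_ELEMENT.sub("", text)
    text = _ANY_TAG.sub("", text)
    # 3. Upgrade plain-HTTP links to HTTPS.
    text = _PLAIN_HTTP.sub("https://", text)
    return text
```

Regex-based stripping is fragile against deliberately malformed markup, so keeping `instructions` as a hand-written static string remains the safer option. A sanitizer like this is more useful as a guard in tests or CI than as the only defence.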

Methodology

This rule belongs to the Tool surface risk dimension, which measures what an agent could do if it trusted every advertised tool. The dimension covers destructive actions, credential disclosure, code execution, filesystem mutation, PII handling, prompt-injection-shaped input fields, and injection-bearing tool descriptions: the agent-specific threat surface, not just generic verb risk.

Read the full methodology for how rules are aggregated into a score, how verdicts are decided, and how hard-fail rules override the aggregate.