Tool surface unchanged since the previous scan
Authored by Stanley Hong · AgentReserve (founder).
The advertised tool surface (each tool's `name`, `description`, `inputSchema`, and `annotations`) hashes to the same value as the most recent prior scan of this server. A drift in any of those fields is the rug-pull pattern documented by the `postmark-mcp` September 2025 incident: a server changes a tool's behavior or name post-install while the user's existing approval still grants the agent access. The rule is silent on first scans (no prior surface to compare against).
When this rule runs
Requires a successful MCP `initialize` / `tools/list`. Skipped on perimeter-only scans where the server refused or failed the MCP handshake.
Why it matters
Approvals to use a tool are granted against a snapshot of its surface. When the surface changes silently — a `delete_temporary_files` is renamed to `delete_files`, or a `summarize` description gains an exfiltration directive — the original approval no longer reflects what the agent will actually do. There is no in-band MCP signal for this drift today; comparing hashes across scans is the cheapest defensible mitigation.
Pass condition
A prior scan exists for this server and its tool-surface hash matches the current scan's hash exactly.
Fail condition
A prior scan exists and its tool-surface hash differs from the current scan's hash.
Evidence examples
When the rule fails, the report records evidence in roughly this shape:
{"changed": true, "currentHash": "ab12…", "priorHash": "98ef…"}
Remediation
Treat tool-surface drift as a re-review event. Diff the current `tools/list` response against the previous one (the report links the prior scan), validate the change against an operator-known release, and rotate any approvals that depended on the old shape.
Methodology
This rule belongs to the Metadata transparency dimension. Whether the server identifies itself and documents its tools — and whether the advertised identity matches the wire identity (cert CN/SAN, hostname). Operators need a stable name, a version, and an internally consistent identity claim to perform any kind of audit.
Read the full methodology for how rules are aggregated into a score, how verdicts are decided, and how hard-fail rules override the aggregate.