Tool annotations are consistent with the surface
Authored by Stanley Hong · AgentReserve (founder).
For every tool that declares spec-defined annotations (`readOnlyHint`, `destructiveHint`, etc.), the hints do not contradict each other and do not contradict the capability the scanner inferred from the tool's name, description, and schema. Misdeclared annotations are the canonical rug-pull camouflage: a tool that calls itself read-only but deletes records.
When this rule runs
Requires a successful MCP `initialize` / `tools/list`. Skipped on perimeter-only scans where the server refused or failed the MCP handshake.
Why it matters
Tool annotations exist so a client (or a reviewing operator) can decide whether to allow a call without invoking it. A tool that lies — `readOnlyHint:true` on a `delete_record`, or `destructiveHint:false` on a tool whose name and schema say `purge` — defeats that contract. The MCP spec calls hints advisory, but explicit hints that contradict the surface are misdeclaration, not absence.
Pass condition
No tool combines `readOnlyHint:true` with `destructiveHint:true`; no tool with `readOnlyHint:true` is classified as destructive (delete/write/financial); no tool with `destructiveHint:false` is classified as destructive.
Fail condition
At least one tool's annotations contradict each other or contradict the scanner's classification of the tool's surface.
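The pass and fail conditions above can be sketched as a small check over the tools returned by `tools/list`. This is a minimal illustration, not the scanner's implementation: the `classify` helper, its verb list, and the kind names other than `readonly_but_destructive_capability` are hypothetical, and the classifier here only recognizes delete-style verbs in the name and description, whereas the rule also treats write and financial capabilities as destructive.

```python
# Hypothetical verb list. The scanner's real classifier is not described here;
# it also inspects the schema and covers write and financial capabilities.
DESTRUCTIVE_VERBS = ("delete", "remove", "purge", "drop", "destroy", "wipe")


def classify(tool):
    """Return 'delete_data' if the name or description contains a destructive verb."""
    text = f"{tool.get('name', '')} {tool.get('description', '')}".lower()
    return "delete_data" if any(verb in text for verb in DESTRUCTIVE_VERBS) else None


def check_annotations(tools):
    """Collect contradictions between hints, and between hints and inferred capability."""
    hits = []
    for tool in tools:
        ann = tool.get("annotations") or {}
        capability = classify(tool)
        read_only = ann.get("readOnlyHint") is True

        # Hints that contradict each other.
        if read_only and ann.get("destructiveHint") is True:
            hits.append({
                "toolName": tool["name"],
                "kind": "readonly_and_destructive_hints",  # hypothetical kind name
                "annotation": {"readOnlyHint": True, "destructiveHint": True},
            })
        # Read-only claim on a tool classified as destructive.
        if read_only and capability:
            hits.append({
                "toolName": tool["name"],
                "kind": "readonly_but_destructive_capability",
                "annotation": {"readOnlyHint": True},
                "capabilityKind": capability,
            })
        # Explicit non-destructive claim on a tool classified as destructive.
        if ann.get("destructiveHint") is False and capability:
            hits.append({
                "toolName": tool["name"],
                "kind": "nondestructive_but_destructive_capability",  # hypothetical kind name
                "annotation": {"destructiveHint": False},
                "capabilityKind": capability,
            })
    return {"hits": hits}


tools = [
    {"name": "delete_record", "description": "Delete a record", "annotations": {"readOnlyHint": True}},
    {"name": "get_record", "description": "Fetch a record", "annotations": {"readOnlyHint": True}},
]
report = check_annotations(tools)
rule_passes = not report["hits"]
```

In this sketch a tool with no annotations never produces a hit, which mirrors the distinction the rule draws between an absent hint and an explicit hint that contradicts the surface.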
Evidence examples
When the rule fails, the report records evidence in roughly this shape:
{"hits": [{"toolName": "delete_record", "kind": "readonly_but_destructive_capability", "annotation": {"readOnlyHint": true}, "capabilityKind": "delete_data"}]}
Remediation
Make tool annotations match the tool's actual surface. If a tool declares `readOnlyHint:true`, it must not have a destructive verb in its name or schema. If a tool can delete, set `destructiveHint:true` and remove any `readOnlyHint:true` claim.
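As an illustration of the remediation above, the following sketch rewrites a misdeclared tool definition so that it admits it can delete. The `declare_destructive` helper is hypothetical and the tool definition is an assumed example; in practice the fix belongs wherever the server defines its tools.

```python
# Assumed example of a misdeclared tool: a delete operation claiming to be read-only.
before = {"name": "delete_record", "annotations": {"readOnlyHint": True}}


def declare_destructive(tool):
    """Return a copy of the tool whose annotations state that it is destructive."""
    annotations = dict(tool.get("annotations") or {})
    annotations.pop("readOnlyHint", None)  # drop the false read-only claim
    annotations["destructiveHint"] = True  # state the destructive capability explicitly
    return {**tool, "annotations": annotations}


after = declare_destructive(before)
```

The helper copies the annotations rather than mutating them, so the original definition stays available for comparison.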
Methodology
This rule belongs to the Tool surface risk dimension, which measures what an agent could do if it trusted every advertised tool. The dimension covers destructive actions, credential disclosure, code execution, filesystem mutation, PII handling, prompt-injection-shaped input fields, and injection-bearing tool descriptions, i.e. the agent-specific threat surface, not just generic verb risk.
Read the full methodology for how rules are aggregated into a score, how verdicts are decided, and how hard-fail rules override the aggregate.