Rule catalog · Tool surface risk

Active probe: read-only tools honor their declared schema

active_probe_read_tools_honor_their_schemamediumweight 4Post-handshake

Authored by Stanley Hong · AgentReserve (founder).

When the operator opts into a deep scan (`deepScan: true` request flag, plus the server-side `AGENTRESERVE_ENABLE_DEEP_SCAN=true` env gate), the scanner attempts up to 3 `tools/call` invocations against tools the classifier marks as READ. Each call uses strictly-typed arguments synthesized from the tool's own `inputSchema` (only enum / const / pattern-constrained string / numeric / boolean primitives — no free-form sensitive fields). The rule passes when every attempted call either succeeded or returned a transport-level error (network, 404, etc.); it fails when at least one tool rejected its own schema-compliant input with a JSON-RPC protocol error. The default scan stays passive — this rule is non-applicable when active probing did not run.

When this rule runs

Requires a successful MCP `initialize` / `tools/list`. Skipped on perimeter-only scans where the server refused or failed the MCP handshake.

Why it matters

A tool that advertises an `inputSchema` is making a contract: arguments matching that schema are valid invocations. A read-tool that rejects its own contract is either misimplemented or actively misleading reviewers — and the discrepancy is invisible to a passive scanner. The active probe is the cheapest defensible test of the contract that doesn't break the no-tools/call invariant on the default code path.

Pass condition

Active probe ran AND every attempted tool call either succeeded or hit a transport-level error.

Fail condition

Active probe ran AND at least one tool returned a JSON-RPC protocol error to schema-compliant input.

Evidence examples

When the rule fails, the report records evidence in roughly this shape:

  • {"hits": [{"toolName": "search", "outcome": "rejected_compliant_input", "detail": "Upstream JSON-RPC error -32602: invalid_params"}]}

Remediation

Either tighten the tool's `inputSchema` so the arguments the scanner synthesized are no longer valid, or fix the tool to accept the documented inputs. A tool that consistently rejects its own contract should be removed from `tools/list` until the contract is right.

Methodology

This rule belongs to the Tool surface risk dimension. What an agent could do if it trusted every advertised tool. Covers destructive actions, credential disclosure, code execution, filesystem mutation, PII handling, prompt-injection-shaped input fields, and injection-bearing tool descriptions — i.e. the agent-specific threat surface, not just generic verb risk.

Read the full methodology for how rules are aggregated into a score, how verdicts are decided, and how hard-fail rules override the aggregate.