No new tools added since the previous scan
Authored by Stanley Hong · AgentReserve (founder).
This rule checks that the set of tool names advertised by `tools/list` is a subset of the set advertised in the most recent prior scan of this server. A newly added tool name is capability creep: the existing operator approval was granted against a smaller surface. The rule is silent on first scans, where there is no prior surface to diff against. It complements `tool_surface_unchanged_since_last_scan`, which fires once on any drift; this rule isolates the additive subset and reports per-tool evidence.
When this rule runs
Requires a successful MCP `initialize` / `tools/list`. Skipped on perimeter-only scans where the server refused or failed the MCP handshake.
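A minimal sketch of that gating, assuming a hypothetical `ScanContext` record. The field names are illustrative and not the scanner's actual data model. It also covers the first-scan case, where there is no prior surface to compare against.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanContext:
    # Hypothetical fields; the real scanner's data model may differ.
    handshake_ok: bool                    # MCP initialize and tools/list both succeeded
    current_tools: Optional[list[str]]    # tool names advertised in this scan
    previous_tools: Optional[list[str]]   # tool names from the prior scan, if one exists


def rule_applies(ctx: ScanContext) -> bool:
    """Return True only when there are two tool surfaces to compare."""
    if not ctx.handshake_ok or ctx.current_tools is None:
        return False  # perimeter-only scan: the rule is skipped
    if ctx.previous_tools is None:
        return False  # first scan: the rule stays silent
    return True
```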
Why it matters
Approvals are granted against a snapshot of the tool surface. A server that adds `delete_records` next week offers a different trust contract from the one the operator approved this week, even if every previously known tool is unchanged. The hash-diff rule says 'something changed'; this rule says exactly which capabilities are new.
Pass condition
Every tool name advertised in this scan was also advertised in the previous scan.
Fail condition
At least one tool name appears in this scan that was not present in the previous scan.
Evidence examples
When the rule fails, the report records evidence in roughly this shape:
{"newTools": [{"toolName": "delete_records"}, {"toolName": "exfil_logs"}]}
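The pass and fail conditions reduce to a set difference over tool names. The sketch below is one way to compute it and emit evidence in the shape shown above. The function name `check_no_new_tools` and the `passed` / `evidence` result keys are assumptions for illustration, not the scanner's actual interface.

```python
import json


def check_no_new_tools(current: list[str], previous: list[str]) -> dict:
    """Pass when every current tool name was also in the previous scan."""
    prior = set(previous)
    # dict.fromkeys keeps the advertised order and drops duplicate names,
    # so the evidence list is stable across runs.
    new_tools = [name for name in dict.fromkeys(current) if name not in prior]
    result = {"passed": not new_tools}
    if new_tools:
        result["evidence"] = {"newTools": [{"toolName": n} for n in new_tools]}
    return result


previous = ["list_records", "get_record"]
current = ["list_records", "get_record", "delete_records", "exfil_logs"]
print(json.dumps(check_no_new_tools(current, previous)["evidence"]))
# {"newTools": [{"toolName": "delete_records"}, {"toolName": "exfil_logs"}]}
```

Removed tools do not affect the result: the check is one-directional, so a scan that advertises fewer tools than before still passes.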
Remediation
Treat new tools as a re-review event. Diff the current `tools/list` against the previous one (the report links the prior scan), classify each new tool against the same severity gates the original approval used, and rotate approvals if the new surface widens the trust contract.
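As a rough sketch of the diff step, the function below compares two saved `tools/list` results and builds a re-review worklist. It relies on the MCP result shape (a `tools` array whose entries carry `name` and `description`). The function name `new_tool_worklist` and the `status` field are hypothetical, and classifying each entry against severity gates is left to the operator's own process.

```python
def new_tool_worklist(current_listing: dict, previous_listing: dict) -> list[dict]:
    """List tools present in the current listing but absent from the previous one."""
    prior_names = {tool["name"] for tool in previous_listing.get("tools", [])}
    return [
        {
            "toolName": tool["name"],
            "description": tool.get("description", ""),
            "status": "needs_review",  # hypothetical marker for the re-review queue
        }
        for tool in current_listing.get("tools", [])
        if tool["name"] not in prior_names
    ]


# Illustrative data only, not output from a real scan.
previous_listing = {"tools": [{"name": "list_records", "description": "List records"}]}
current_listing = {
    "tools": [
        {"name": "list_records", "description": "List records"},
        {"name": "delete_records", "description": "Delete records by ID"},
    ]
}
for item in new_tool_worklist(current_listing, previous_listing):
    print(item["toolName"], "-", item["description"])
```

Carrying the description alongside the name gives the reviewer enough context to judge whether the new tool widens the trust contract without re-querying the server.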
Methodology
This rule belongs to the Metadata transparency dimension, which covers whether the server identifies itself and documents its tools, and whether the advertised identity matches the wire identity (cert CN/SAN, hostname). Operators need a stable name, a version, and an internally consistent identity claim to perform any kind of audit.
Read the full methodology for how rules are aggregated into a score, how verdicts are decided, and how hard-fail rules override the aggregate.