No credential-access tools in the public surface
Authored by Stanley Hong · AgentReserve (founder).
No advertised tool may reference credentials, secrets, API keys, tokens, private keys, or vault material. A public tool that returns or mutates authentication material lets any caller pivot to other systems. The check is a keyword scan over each tool's name, description, and input schema, so it is a heuristic: a benignly named utility like `validate_api_key_format` will trip it. The rule is a hard fail that forces `block` precisely so the operator must consciously override the heuristic by renaming the tool, moving it behind authentication, or treating the report as advisory and documenting the exception.
When this rule runs
Requires a successful MCP `initialize` / `tools/list`. Skipped on perimeter-only scans where the server refused or failed the MCP handshake.
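The run/skip decision above can be sketched as follows. This is a minimal illustration and not the scanner's actual code: the function names and the `"run"` / `"skipped"` labels are hypothetical. It assumes only that responses are parsed JSON-RPC 2.0 objects and that a successful `tools/list` result carries a `tools` array.

```python
def tools_from_response(response):
    """Return the advertised tool list, or None if the call was refused or failed."""
    # A JSON-RPC 2.0 failure carries an "error" member instead of "result".
    if not isinstance(response, dict) or "error" in response:
        return None
    result = response.get("result")
    if not isinstance(result, dict):
        return None
    tools = result.get("tools")
    return tools if isinstance(tools, list) else None


def rule_applicability(initialize_response, tools_list_response):
    """Decide whether tool-surface rules can run (labels are hypothetical)."""
    handshake_ok = (
        isinstance(initialize_response, dict)
        and "error" not in initialize_response
        and "result" in initialize_response
    )
    if not handshake_ok:
        return "skipped"  # perimeter-only scan: the MCP handshake failed
    if tools_from_response(tools_list_response) is None:
        return "skipped"  # no tool list to inspect
    return "run"
```

An empty `tools` array still counts as a successful listing here, so the rule would run and find nothing to flag.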
Why it matters
A public tool that returns or mutates credentials, API keys, tokens, or vault material lets any caller pivot to other systems. An unauthenticated MCP server has no benign reason to expose such a tool by default.
Pass condition
No tool name, description or schema field references credential, secret, token, key or vault material.
Fail condition
At least one tool surfaces credential-, secret-, token-, key- or vault-related vocabulary.
Evidence examples
When the rule fails, the report records evidence in roughly this shape:
{"matches": [{"toolName": "get_api_key", "keyword": "api_key", "source": "name"}]}
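A keyword scan that produces evidence in this shape might look like the sketch below. It rests on assumptions: the keyword list, the normalization step, and the `"description"` and `"inputSchema"` source labels are hypothetical, since this page shows only `"name"` as a source and does not publish the scanner's vocabulary. Tools are assumed to be dicts with `name`, `description`, and `inputSchema` fields as returned by `tools/list`.

```python
import re

# Hypothetical vocabulary; the scanner's real keyword list is not given here.
KEYWORDS = ("api_key", "credential", "secret", "token", "private_key", "vault")


def normalize(text):
    """Lowercase and turn spaces/hyphens into underscores so "API key" matches "api_key"."""
    return re.sub(r"[\s\-]+", "_", text.lower())


def schema_strings(node):
    """Yield every dict key and string value found anywhere in a JSON-like value."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield key
            yield from schema_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from schema_strings(item)
    elif isinstance(node, str):
        yield node


def scan_tools(tools):
    """Return evidence listing each keyword hit per tool and per source field."""
    matches = []
    for tool in tools:
        name = tool.get("name") or ""
        sources = {
            "name": [name],
            "description": [tool.get("description") or ""],
            "inputSchema": list(schema_strings(tool.get("inputSchema") or {})),
        }
        for source, texts in sources.items():
            haystack = " ".join(normalize(t) for t in texts)
            for keyword in KEYWORDS:
                # Plain substring match: deliberately crude, hence the false positives.
                if keyword in haystack:
                    matches.append({"toolName": name, "keyword": keyword, "source": source})
    return {"matches": matches}
```

Because the match is a plain substring test, `validate_api_key_format` is flagged just like `get_api_key`, which is the false-positive behavior the rule description warns about.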
Remediation
Remove credential-handling tools from the public surface. If credentials must be exchanged, do it through an authenticated, audited path that is not exposed via anonymous `tools/list`.
Methodology
This rule belongs to the Tool surface risk dimension, which measures what an agent could do if it trusted every advertised tool. The dimension covers destructive actions, credential disclosure, code execution, filesystem mutation, PII handling, prompt-injection-shaped input fields, and injection-bearing tool descriptions: the agent-specific threat surface, not just generic verb risk.
Read the full methodology for how rules are aggregated into a score, how verdicts are decided, and how hard-fail rules override the aggregate.