feat(fermata): add secret filtering engine — the security brain

Implement Goals 1–3 and 5 from the reveal-layer security brain goal.
fermata now detects, redacts, and scans for secrets in AI agent tool
output, filling the ecosystem gap where no coding agent filters secrets
post-read.

New core/secrets/ module:
- config.rs: .botsecrets TOML format with hierarchical merge and ~40
  built-in key patterns
- parser.rs: multi-format secret file parser (.env, TOML, YAML, JSON,
  Python assignments, Java properties)
- manifest.rs: file discovery + parsing → known-secrets set
- redactor.rs: Aho-Corasick multi-pattern replacement with 4 styles
- scanner.rs: RegexSet heuristic detection with 35 gitleaks-derived
  patterns (MIT) and Shannon entropy filtering
- patterns.rs: curated rules for AWS, GitHub, Stripe, Slack, JWT, etc.

Hook integration:
- fermata hook --event post-tool-use reads tool output, runs redactor +
  scanner, returns updatedToolOutput for Claude Code
- Backward compatible: --event pre-tool-use (default) unchanged
- Fail-open: errors produce {} and exit 0

Library API:
- Redactor::new(manifest, style).redact(text) → RedactedText
- Scanner::new(config).scan(text) → Vec<Finding>
- Compiles without CLI feature for embedding in other crates

195 tests (130 new), all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Gabor Körber
2026-05-25 17:29:07 +02:00
parent f77fd73966
commit 087429d275
22 changed files with 4557 additions and 172 deletions
+15 -6
View File
@@ -1,19 +1,26 @@
# Package: dirigent_fermata
Harness-agnostic policy gate for AI coding agents.
Harness-agnostic policy gate and secret filtering engine for AI coding agents.
## Quick Facts
- **Type**: Library + binary (`fermata`)
- **Main Entry**: `src/lib.rs`, `src/bin/fermata.rs`
- **Dependencies**: `ignore`, `toml`, `regex`, `globset`, `serde`, `clap` (cli feature)
- **Status**: v0.1library + CLI + Claude hook adapter
- **Dependencies**: `ignore`, `toml`, `regex`, `globset`, `serde`, `clap` (cli feature), `aho-corasick`, `serde_yaml`
- **Status**: v0.2policy gate + secret filtering engine
## Layering
Three concentric layers; nothing inner imports from anything outer.
- **`core/`** — harness-unaware, transport-unaware, sync. Types (`Op`, `Decision`), `.botignore` walker, `botignore.toml` parser, `Policy::check` / `check_command`, path extraction. Sync, no tokio.
- **`harness/`** — `HarnessAdapter` trait over a normalized `ToolCall`. Each adapter (Claude, future Codex, etc.) lives in its own submodule, feature-gated.
- **`core/secrets/`** — the secret filtering engine:
- `config.rs``.botsecrets` TOML parser and hierarchical resolution (user, project, local override).
- `manifest.rs` — discovers secret-containing files from `.botsecrets` patterns and loads their content for redaction.
- `parser.rs` — multi-format secret file parser (`.env`, TOML, YAML, JSON). Extracts key-value pairs where the value is a secret.
- `patterns.rs` — built-in key name patterns (~30 universal patterns like `*_KEY`, `*_SECRET`, `*_PASSWORD`) and gitleaks-derived regex patterns for heuristic scanning.
- `redactor.rs``Redactor` builds an Aho-Corasick automaton from known secret values and replaces them in arbitrary text. Sub-millisecond performance.
- `scanner.rs``Scanner` applies heuristic regex patterns to detect secrets not covered by the known-value manifest (entropy-based and format-based detection).
- **`harness/`** — `HarnessAdapter` trait over a normalized `ToolCall` (PreToolUse) and `PostToolUsePayload` (PostToolUse). Each adapter (Claude, future Codex, etc.) lives in its own submodule, feature-gated. PostToolUse enables output redaction via `updatedToolOutput` before content enters the LLM context.
- **`bin/fermata.rs`** — only place where `clap`, stdio, and exit codes appear.
## Release Model
@@ -24,11 +31,13 @@ Developed in this monorepo; planned to be exported as a standalone repo in the f
`dirigent_tools` depends on `dirigent_fermata`, never the reverse. Fermata must remain usable as a standalone hook/MCP without dragging in the in-process ACP tool runtime.
## Out of scope (v0.1)
## Out of scope (v0.2)
Codex / Gemini hook adapters, MCP server mode, PostToolUse envelope, `readonly_only` Bash mode, audit log, filesystem watcher. Each is a future task with its own plan.
Codex / Gemini hook adapters, MCP server mode, `readonly_only` Bash mode, audit log, filesystem watcher, context taint tracking. Each is a future task with its own plan.
## See also
- `docs/tools/fermata.md` — Dirigent integration plan
- `docs/workpad/brainstorm/fermata.md` — canonical product spec
- `docs/architecture/fermata-security-philosophy.md` — security philosophy and the reveal triangle
- `.botsecrets` format: `core/secrets/config.rs` — the `.gitignore` of AI agent secret protection