✨ feat(fermata): add secret filtering engine — the security brain

Implement Goals 1–3 and 5 from the reveal-layer security brain goal. fermata now detects, redacts, and scans for secrets in AI agent tool output, filling the ecosystem gap where no coding agent filters secrets post-read.

New core/secrets/ module:
- config.rs: .botsecrets TOML format with hierarchical merge and ~40 built-in key patterns
- parser.rs: multi-format secret file parser (.env, TOML, YAML, JSON, Python assignments, Java properties)
- manifest.rs: file discovery + parsing → known-secrets set
- redactor.rs: Aho-Corasick multi-pattern replacement with 4 styles
- scanner.rs: RegexSet heuristic detection with 35 gitleaks-derived patterns (MIT) and Shannon entropy filtering
- patterns.rs: curated rules for AWS, GitHub, Stripe, Slack, JWT, etc.

Hook integration:
- fermata hook --event post-tool-use reads tool output, runs redactor + scanner, returns updatedToolOutput for Claude Code
- Backward compatible: --event pre-tool-use (default) unchanged
- Fail-open: errors produce {} and exit 0

Library API:
- Redactor::new(manifest, style).redact(text) → RedactedText
- Scanner::new(config).scan(text) → Vec<Finding>
- Compiles without CLI feature for embedding in other crates

195 tests (130 new), all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-25 17:29:07 +02:00
parent f77fd73966
commit 087429d275
22 changed files with 4557 additions and 172 deletions
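
The commit message mentions Shannon entropy filtering in `scanner.rs` without showing how a candidate match is scored. The sketch below is one common way to do it, not the code from this commit. The names `shannon_entropy` and `looks_like_secret` and the threshold parameter are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Shannon entropy of `s`, in bits per byte.
/// Hypothetical helper; the real scanner.rs may compute this differently.
fn shannon_entropy(s: &str) -> f64 {
    if s.is_empty() {
        return 0.0;
    }
    // Count how often each byte value occurs.
    let mut counts: HashMap<u8, usize> = HashMap::new();
    for b in s.bytes() {
        *counts.entry(b).or_insert(0) += 1;
    }
    let len = s.len() as f64;
    // H = -sum(p * log2(p)) over the observed byte frequencies.
    counts
        .values()
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// Keep a regex candidate only if it looks random enough.
/// `min_entropy` is an assumed tuning knob, not a value taken from the commit.
fn looks_like_secret(candidate: &str, min_entropy: f64) -> bool {
    shannon_entropy(candidate) >= min_entropy
}
```

With an assumed threshold of 3.5 bits per byte, a repeated-character string such as `"aaaaaaaaaaaaaaaa"` is rejected, while a mixed-case key-like string passes. A filter of this kind is typically applied after the regex match, so that placeholder values matching a key format are not reported as findings.
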
@@ -1,19 +1,26 @@
 # Package: dirigent_fermata

-Harness-agnostic policy gate for AI coding agents.
+Harness-agnostic policy gate and secret filtering engine for AI coding agents.

 ## Quick Facts
 - **Type**: Library + binary (`fermata`)
 - **Main Entry**: `src/lib.rs`, `src/bin/fermata.rs`
- **Dependencies**: `ignore`, `toml`, `regex`, `globset`, `serde`, `clap` (cli feature)
- **Status**: v0.1 — library + CLI + Claude hook adapter
+- **Dependencies**: `ignore`, `toml`, `regex`, `globset`, `serde`, `clap` (cli feature), `aho-corasick`, `serde_yaml`
+- **Status**: v0.2 — policy gate + secret filtering engine

 ## Layering

 Three concentric layers; nothing inner imports from anything outer.

 - **`core/`** — harness-unaware, transport-unaware, sync. Types (`Op`, `Decision`), `.botignore` walker, `botignore.toml` parser, `Policy::check` / `check_command`, path extraction. Sync, no tokio.
- **`harness/`** — `HarnessAdapter` trait over a normalized `ToolCall`. Each adapter (Claude, future Codex, etc.) lives in its own submodule, feature-gated.
+  - **`core/secrets/`** — the secret filtering engine:
+    - `config.rs` — `.botsecrets` TOML parser and hierarchical resolution (user, project, local override).
+    - `manifest.rs` — discovers secret-containing files from `.botsecrets` patterns and loads their content for redaction.
+    - `parser.rs` — multi-format secret file parser (`.env`, TOML, YAML, JSON). Extracts key-value pairs where the value is a secret.
+    - `patterns.rs` — built-in key name patterns (~30 universal patterns like `*_KEY`, `*_SECRET`, `*_PASSWORD`) and gitleaks-derived regex patterns for heuristic scanning.
+    - `redactor.rs` — `Redactor` builds an Aho-Corasick automaton from known secret values and replaces them in arbitrary text. Sub-millisecond performance.
+    - `scanner.rs` — `Scanner` applies heuristic regex patterns to detect secrets not covered by the known-value manifest (entropy-based and format-based detection).
+- **`harness/`** — `HarnessAdapter` trait over a normalized `ToolCall` (PreToolUse) and `PostToolUsePayload` (PostToolUse). Each adapter (Claude, future Codex, etc.) lives in its own submodule, feature-gated. PostToolUse enables output redaction via `updatedToolOutput` before content enters the LLM context.
 - **`bin/fermata.rs`** — only place where `clap`, stdio, and exit codes appear.

 ## Release Model
@@ -24,11 +31,13 @@ Developed in this monorepo; planned to be exported as a standalone repo in the f

 `dirigent_tools` depends on `dirigent_fermata`, never the reverse. Fermata must remain usable as a standalone hook/MCP without dragging in the in-process ACP tool runtime.

-## Out of scope (v0.1)
+## Out of scope (v0.2)

-Codex / Gemini hook adapters, MCP server mode, PostToolUse envelope, `readonly_only` Bash mode, audit log, filesystem watcher. Each is a future task with its own plan.
+Codex / Gemini hook adapters, MCP server mode, `readonly_only` Bash mode, audit log, filesystem watcher, context taint tracking. Each is a future task with its own plan.

 ## See also

 - `docs/tools/fermata.md` — Dirigent integration plan
 - `docs/workpad/brainstorm/fermata.md` — canonical product spec
+- `docs/architecture/fermata-security-philosophy.md` — security philosophy and the reveal triangle
+- `.botsecrets` format: `core/secrets/config.rs` — the `.gitignore` of AI agent secret protection
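
The `redactor.rs` entry above describes replacing known secret values in arbitrary text. As a rough illustration of that idea only, the sketch below uses repeated `str::replace` from the standard library in place of the Aho-Corasick automaton the commit uses, so it scans the text once per secret rather than once overall. The function name `redact` and its signature are hypothetical and do not reflect the `Redactor` API.

```rust
/// Replace every occurrence of each known secret value with `placeholder`.
///
/// Naive stand-in for an Aho-Corasick automaton: one pass over the text per
/// secret. Assumes `placeholder` does not itself contain any secret value.
fn redact(text: &str, known_secrets: &[&str], placeholder: &str) -> String {
    // Skip empty values: replacing "" would insert the placeholder
    // between every character.
    let mut values: Vec<&str> = known_secrets
        .iter()
        .copied()
        .filter(|v| !v.is_empty())
        .collect();

    // Longest first, so a secret that contains a shorter secret as a
    // substring is removed whole instead of leaving its tail visible.
    values.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut out = text.to_string();
    for v in values {
        out = out.replace(v, placeholder);
    }
    out
}
```

For example, `redact("token=abc123", &["abc123"], "[REDACTED]")` yields `token=[REDACTED]`. An automaton-based matcher produces the same kind of result in a single pass, which matters once the known-secrets set grows large.
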