fermata/README.md
Gabor Körber 087429d275 feat(fermata): add secret filtering engine — the security brain
Implement Goals 1–3 and 5 from the reveal-layer security brain goal.
fermata now detects, redacts, and scans for secrets in AI agent tool
output, filling the ecosystem gap where no coding agent filters secrets
post-read.

New core/secrets/ module:
- config.rs: .botsecrets TOML format with hierarchical merge and ~40
  built-in key patterns
- parser.rs: multi-format secret file parser (.env, TOML, YAML, JSON,
  Python assignments, Java properties)
- manifest.rs: file discovery + parsing → known-secrets set
- redactor.rs: Aho-Corasick multi-pattern replacement with 4 styles
- scanner.rs: RegexSet heuristic detection with 35 gitleaks-derived
  patterns (MIT) and Shannon entropy filtering
- patterns.rs: curated rules for AWS, GitHub, Stripe, Slack, JWT, etc.

Hook integration:
- fermata hook --event post-tool-use reads tool output, runs redactor +
  scanner, returns updatedToolOutput for Claude Code
- Backward compatible: --event pre-tool-use (default) unchanged
- Fail-open: errors produce {} and exit 0

Library API:
- Redactor::new(manifest, style).redact(text) → RedactedText
- Scanner::new(config).scan(text) → Vec<Finding>
- Compiles without CLI feature for embedding in other crates

195 tests (130 new), all passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-25 17:29:07 +02:00


# dirigent_fermata
**A fast, harness-agnostic policy gate and secret filtering engine for AI coding agents.**
Drop a `.botignore` to control what your agent can touch. Drop a `.botsecrets` to control what secret values your agent can see. Fermata enforces both -- before and after tool calls happen.
---
## Why Fermata
AI coding agents don't have an innate sense of "don't touch `.env`" -- and even if you block the file, they can still see its contents through shell output, log files, and indirect reads. Fermata solves both problems:
- **Policy gate** -- `.botignore` blocks reads, writes, and dangerous commands before they execute (PreToolUse).
- **Secret filtering** -- `.botsecrets` redacts secret values from tool output before they enter the LLM context (PostToolUse).
- **Fast** -- Rust, Aho-Corasick automaton for redaction, ~1-5ms per call.
- **Familiar syntax** -- `.botignore` uses gitignore rules; `.botsecrets` uses TOML with glob patterns.
- **Harness-agnostic** -- hook adapters for Claude Code (shipped), Codex and Gemini (planned), MCP proxy (planned).
---
## Status: v0.2
| Component | Status |
|-----------|--------|
| Library (`Policy::check`, `Policy::check_command`) | Done |
| `.botignore` walker (gitignore semantics) | Done |
| `botignore.toml` parser (read / write / bash namespaces) | Done |
| CLI: `fermata check` / `fermata hook` | Done |
| Claude Code PreToolUse adapter | Done |
| Claude Code PostToolUse adapter (output redaction) | Done |
| `.botsecrets` config parser | Done |
| Secret manifest discovery and loading | Done |
| Multi-format secret file parser (.env, TOML, YAML, JSON, Python assignments, Java properties) | Done |
| `Redactor` (known-value Aho-Corasick replacement) | Done |
| `Scanner` (heuristic regex + gitleaks patterns) | Done |
Out of scope for v0.2: Codex / Gemini hook adapters, MCP proxy mode, audit log, filesystem watcher.
---
## Install
From source (this monorepo):
```bash
cargo install --path crates/dirigent_fermata --features cli
```
---
## Secret Filtering
Fermata's secret filtering operates in three layers:
1. **Policy gate** (PreToolUse) -- `.botignore` blocks direct access to sensitive files. Catches ~90% of accidental reads.
2. **Known-value redaction** (PostToolUse) -- `.botsecrets` declares which files contain secrets. Fermata parses them, extracts values, and replaces them in all tool output using an Aho-Corasick automaton. Every verbatim occurrence of a declared secret value is replaced.
3. **Heuristic scanning** (PostToolUse) -- regex patterns derived from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs). Safety net for secrets not covered by the manifest.
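The scanner is described as pairing its regex patterns with Shannon entropy filtering. The sketch below shows one way such a filter could work, using only the standard library. The function names and the 3.0 bits-per-character threshold are assumptions made for illustration, not fermata's actual API or settings.

```rust
use std::collections::HashMap;

/// Shannon entropy of `s` in bits per character (0.0 for an empty string).
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    // Sum -p * log2(p) over the observed character frequencies.
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical filter: keep a regex candidate only if it looks random enough.
/// The threshold is an illustrative assumption, not a fermata default.
fn looks_random(candidate: &str) -> bool {
    shannon_entropy(candidate) >= 3.0
}
```

A filter like this is typically applied after a pattern has matched, so that low-entropy strings such as placeholder values are dropped before they are reported as findings.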
### `.botsecrets` format
Create a `.botsecrets` file at your project root:
```toml
# Files that contain secrets -- fermata parses these and redacts values
[files]
patterns = [".env", ".env.*", "secrets.*"]
# Additional secret key names (built-in defaults cover *_KEY, *_SECRET, etc.)
[keys]
include = ["STRIPE_*", "MY_APP_SIGNING_*"]
# Heuristic scanning on all tool output
[heuristic]
enabled = true
```
That's the typical case. Built-in key patterns (`*_KEY`, `*_SECRET`, `*_PASSWORD`, `*_TOKEN`, `DATABASE_URL`, etc.) handle most projects without custom configuration.
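To illustrate how key patterns such as `*_KEY` or `STRIPE_*` could select key names, here is a small wildcard matcher. It is a sketch under stated assumptions: `*` matches any run of characters, matching is case-sensitive, and `glob_match` and `is_secret_key` are hypothetical names rather than functions exported by fermata.

```rust
/// Match `name` against `pattern`, where `*` stands for any run of
/// characters (including none). Case-sensitive in this sketch.
fn glob_match(pattern: &str, name: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let n: Vec<char> = name.chars().collect();
    let (mut pi, mut ni) = (0usize, 0usize);
    // Pattern index of the most recent `*` and the name index it covers up to.
    let mut star: Option<(usize, usize)> = None;

    while ni < n.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi, ni));
            pi += 1;
        } else if pi < p.len() && p[pi] == n[ni] {
            pi += 1;
            ni += 1;
        } else if let Some((sp, sn)) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = sp + 1;
            ni = sn + 1;
            star = Some((sp, sn + 1));
        } else {
            return false;
        }
    }
    // Trailing `*` characters can match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// True if `key` matches any of the given patterns.
fn is_secret_key(key: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|p| glob_match(p, key))
}
```

With the patterns `["*_KEY", "STRIPE_*", "DATABASE_URL"]`, this matcher accepts `API_KEY` and `STRIPE_WEBHOOK` and rejects `KEYBOARD`.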
---
## Usage
### Claude Code hook configuration
Add both PreToolUse and PostToolUse hooks in `.claude/settings.json`:
```json
{
"hooks": {
"PreToolUse": [
{
"matcher": "Bash|Read|Edit|Write",
"hooks": [
{ "type": "command", "command": "fermata hook --harness claude" }
]
}
],
"PostToolUse": [
{
"matcher": "Bash|Read|Edit|Write",
"hooks": [
{ "type": "command", "command": "fermata hook --harness claude --event post-tool-use" }
]
}
]
}
}
```
PreToolUse blocks forbidden operations. PostToolUse redacts secret values from tool output before they reach the LLM.
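The post-tool-use hook is described as fail-open: errors produce `{}` and exit 0, so a broken hook never blocks the agent. The wrapper below is a minimal sketch of that behaviour. `fail_open` is a hypothetical helper name, and the closure stands in for whatever parsing and redaction the real hook performs.

```rust
use std::error::Error;

/// Run `step` and return its JSON response. On any error, fall back to the
/// empty object `{}` so the harness proceeds as if no hook were installed.
fn fail_open<F>(step: F) -> String
where
    F: FnOnce() -> Result<String, Box<dyn Error>>,
{
    match step() {
        Ok(json) => json,
        Err(err) => {
            // Report on stderr only; stdout must remain valid JSON.
            eprintln!("hook error (failing open): {}", err);
            "{}".to_string()
        }
    }
}
```

A `main` that prints the returned string and then returns normally exits with status 0 in both cases. Note that this sketch handles `Err` values only; it does not catch panics.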
### Checking a path
```bash
fermata check --op read /path/to/.env
# exit 1 -- blocked
fermata check --op write /path/to/src/main.rs
# exit 0 -- allowed
```
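The same check can be driven from another program by inspecting the exit status. The sketch below shells out to `fermata check` with the flags shown above and assumes the binary is on `PATH`. Treating every non-zero status as "blocked" is an assumption of this sketch, since only exit codes 0 and 1 are documented here; both function names are hypothetical.

```rust
use std::io;
use std::process::Command;

/// Interpret an exit code from `fermata check`: 0 means allowed.
/// Any other outcome (including termination by signal) is treated as
/// blocked, which is a conservative assumption of this sketch.
fn allowed_from_exit(code: Option<i32>) -> bool {
    code == Some(0)
}

/// Run `fermata check --op <op> <path>` and report whether it was allowed.
/// Returns an I/O error if the `fermata` binary cannot be started.
fn fermata_allows(op: &str, path: &str) -> io::Result<bool> {
    let status = Command::new("fermata")
        .arg("check")
        .arg("--op")
        .arg(op)
        .arg(path)
        .status()?;
    Ok(allowed_from_exit(status.code()))
}
```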
### Library API
```rust
use dirigent_fermata::core::secrets::{Manifest, Redactor, Scanner, SecretsConfig};
// Load .botsecrets config and build the manifest
let config = SecretsConfig::load("/path/to/project")?;
let manifest = Manifest::discover(&config)?;
// Known-value redaction (Aho-Corasick, sub-millisecond)
let redactor = Redactor::from_manifest(&manifest);
let clean = redactor.redact("DB_PASSWORD=hunter2\nAPI_KEY=sk-abc123");
// -> "DB_PASSWORD=*****\nAPI_KEY=*****"
// Heuristic scanning (regex patterns)
let scanner = Scanner::new(&config);
let findings = scanner.scan("Found key: AKIA1234567890ABCDEF");
// -> [Finding { pattern: "AWS Access Key", confidence: High, .. }]
```
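For readers who want to see the idea behind known-value redaction without the crate, here is a standard-library sketch. It deliberately swaps the Aho-Corasick automaton for a simple prefix scan that tries the longest secret first at each position, which gives the same result on small inputs but scales worse. `redact_known` is a hypothetical name, not part of the fermata API.

```rust
/// Replace every verbatim occurrence of a known secret in `text` with `mask`.
/// At each position the longest matching secret wins, so a secret that
/// contains another secret as a prefix is masked as a whole.
fn redact_known(text: &str, secrets: &[&str], mask: &str) -> String {
    // Ignore empty values (they would match everywhere), longest first.
    let mut ordered: Vec<&str> = secrets
        .iter()
        .copied()
        .filter(|s| !s.is_empty())
        .collect();
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while !rest.is_empty() {
        if let Some(hit) = ordered.iter().find(|s| rest.starts_with(**s)) {
            // A secret starts here: emit the mask and skip past the value.
            out.push_str(mask);
            rest = &rest[hit.len()..];
        } else {
            // No secret starts here: copy one character and move on.
            let ch = rest.chars().next().unwrap();
            out.push(ch);
            rest = &rest[ch.len_utf8()..];
        }
    }
    out
}
```

This scan costs roughly the text length times the number of secrets, whereas an Aho-Corasick automaton matches all patterns in a single pass over the text.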
---
## Configuration
### `.botignore` -- access control
Gitignore syntax. Blocks both reads and writes.
```gitignore
.env
.env.*
secrets/**
```
### `botignore.toml` -- per-operation rules
```toml
[read]
patterns = [".env*", "secrets/**"]
[write]
patterns = ["vendor/**", "*.lock"]
[bash]
deny = ["rm -rf /", "curl * | sh"]
```
### `.botsecrets` -- secret value redaction
See the Secret Filtering section above.
---
## Architecture
Three concentric layers, with `core/secrets/` nested inside `core/`; nothing inner imports from anything outer:
- **`core/`** -- harness-unaware, sync. Policy types, `.botignore` walker, `botignore.toml` parser, `Policy::check`.
- **`core/secrets/`** -- `.botsecrets` config, manifest discovery, multi-format parser, Aho-Corasick redactor, heuristic scanner.
- **`harness/`** -- `HarnessAdapter` trait for PreToolUse (policy gate) and PostToolUse (output redaction). Each adapter is feature-gated.
- **`bin/fermata.rs`** -- `clap`, stdio, and exit codes.
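The adapter layer can be pictured as a trait with one method per hook event. The sketch below is purely illustrative: the `Decision` type, the method names and signatures, and `DemoAdapter` are all hypothetical and are not the crate's actual `HarnessAdapter` definition.

```rust
/// Outcome of the policy gate. Hypothetical type for this sketch.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny(String),
}

/// Hypothetical adapter shape: one method per hook event.
trait HarnessAdapter {
    /// Called before a tool runs; decides whether the call may proceed.
    fn pre_tool_use(&self, operation: &str, target: &str) -> Decision;
    /// Called after a tool runs; returns the output to hand to the model.
    fn post_tool_use(&self, output: &str) -> String;
}

/// Toy adapter holding an exact-match block list and known secret values.
struct DemoAdapter {
    blocked_paths: Vec<String>,
    secrets: Vec<String>,
}

impl HarnessAdapter for DemoAdapter {
    fn pre_tool_use(&self, operation: &str, target: &str) -> Decision {
        if self.blocked_paths.iter().any(|p| p.as_str() == target) {
            Decision::Deny(format!("{} of {} is blocked by policy", operation, target))
        } else {
            Decision::Allow
        }
    }

    fn post_tool_use(&self, output: &str) -> String {
        let mut clean = output.to_string();
        for secret in &self.secrets {
            if !secret.is_empty() {
                clean = clean.replace(secret.as_str(), "*****");
            }
        }
        clean
    }
}
```

In this arrangement the adapter depends on policy and redaction logic, never the other way round, which mirrors the rule that inner layers do not import from outer ones.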
---
## See also
- `docs/tools/fermata.md` -- Dirigent integration plan
- `docs/architecture/fermata-security-philosophy.md` -- security philosophy and the reveal triangle
- `docs/workpad/brainstorm/fermata.md` -- full product spec and field notes
- `docs/architecture/crates.md` -- crate dependency map