# fermata

**A fast, harness-agnostic security layer for AI coding agents.**

AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read `.env`, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls.

No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.

## Why

Traditional security blocks the file and hopes the agent doesn't find the data through another path. This is insufficient -- secrets appear in shell output, log files, error messages, and indirect reads that bypass any access-control list.

fermata operates on two independent levels:

- **Policy gate** (PreToolUse) -- `.botignore` blocks reads, writes, and dangerous commands before they execute. Catches ~90% of accidental secret access.
- **Secret filtering** (PostToolUse) -- `.botsecrets` redacts secret *values* from tool output before they enter the LLM context. Catches the remaining cases regardless of how secrets appear.

The key insight: blocking a file is necessary but not sufficient. The agent can have read access to `.env` without secret values being revealed -- if the output is redacted before it reaches the model.

## Quick Start

### Install

```bash
cargo install --path . --features cli
```

### Protect a project in 30 seconds

```bash
# Block direct access to secret files
echo ".env" > .botignore

# Declare where secrets live -- fermata parses them and redacts values
cat > .botsecrets << 'EOF'
[files]
patterns = [".env", ".env.*", "secrets.*"]
EOF
```

### Wire into Claude Code

Add both hooks in `.claude/settings.json`:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "fermata hook --harness claude"
          }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "fermata hook --harness claude --event post-tool-use"
          }
        ]
      }
    ]
  }
}
```

That's it. PreToolUse blocks forbidden operations. PostToolUse redacts secret values from tool output before they reach the LLM.

## How It Works

fermata interposes on every tool call in the agent's lifecycle:

```
Agent wants to run a tool
    |
PreToolUse ── fermata checks .botignore / botignore.toml
    |   blocked? → deny with reason
    |   allowed? ↓
Tool executes
    |
PostToolUse ── fermata scans output for secret values
    |   found? → replace with ***** before LLM sees it
    |
Clean output enters LLM context
```

Three layers of defense, each independent:

| Layer | Mechanism | What it catches |
|-------|-----------|-----------------|
| **Access control** | `.botignore` rules block tool calls by path | Direct reads/writes to sensitive files |
| **Known-value redaction** | `.botsecrets` declares secret files; fermata parses them and builds an Aho-Corasick automaton | Every occurrence of a declared secret value, in any tool output, regardless of source |
| **Heuristic detection** | Regex patterns from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs) | Secrets not covered by the manifest -- runtime-generated, unexpected locations |

Performance: ~1-5ms per tool call. Cold start (loading config + parsing secret files) is ~10-20ms.

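
To make the known-value redaction layer concrete, here is a minimal sketch using only the Rust standard library. It is not fermata's implementation: it swaps the Aho-Corasick automaton for repeated substring replacement (one pass per secret instead of one pass overall), handles only simple `KEY=value` lines, and hard-codes four of the built-in key suffixes. The function names `extract_env_secrets` and `redact_known_values` are hypothetical.

```rust
/// Collect values from `KEY=value` lines whose key ends with a sensitive suffix.
/// Assumption: a simplified .env dialect (no multi-line values, no escapes).
fn extract_env_secrets(env_text: &str) -> Vec<String> {
    // A subset of the built-in key patterns, expressed as plain suffixes.
    const SUFFIXES: [&str; 4] = ["_KEY", "_SECRET", "_PASSWORD", "_TOKEN"];
    env_text
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .filter(|(key, _)| SUFFIXES.iter().any(|s| key.trim().ends_with(*s)))
        .map(|(_, value)| value.trim().trim_matches('"').to_string())
        .filter(|value| !value.is_empty())
        .collect()
}

/// Replace every occurrence of each known secret value with a fixed mask.
/// Naive stand-in for an Aho-Corasick automaton: correct for small inputs,
/// but it rescans the text once per secret.
fn redact_known_values(output: &str, secrets: &[String]) -> String {
    let mut ordered: Vec<&str> = secrets
        .iter()
        .map(String::as_str)
        .filter(|s| !s.is_empty())
        .collect();
    // Longest first, so a secret that contains a shorter one is masked whole.
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut redacted = output.to_string();
    for secret in ordered {
        redacted = redacted.replace(secret, "*****");
    }
    redacted
}
```

Feeding it the contents of a `.env` file containing `DB_PASSWORD=hunter2` and then redacting the string `DB_PASSWORD=hunter2` yields `DB_PASSWORD=*****`. The important property is that the match is on the value, not the file, so the same secret is masked wherever it shows up in tool output.
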
## Configuration

Three files, each optional, each solving a different problem:

### `.botignore` -- the 80% case

Gitignore syntax. Blocks both reads and writes. Onboarding is one line.

```gitignore
.env
.env.*
secrets/**
```

### `botignore.toml` -- per-operation rules

Separate namespaces so the same file can be readable but not writable:

```toml
[read]
patterns = [".env*", "secrets/**"]

[write]
patterns = ["vendor/**", "*.lock"]

[bash]
deny = ["rm -rf /", "curl * | sh"]
```

### `.botsecrets` -- secret value redaction

Declares which files contain secrets. fermata parses them, extracts values, and redacts every occurrence in tool output.

```toml
[files]
patterns = [".env", ".env.*", "secrets.*"]

[keys]
include = ["STRIPE_*", "MY_APP_SIGNING_*"]

[heuristic]
enabled = true
```

Built-in key patterns (`*_KEY`, `*_SECRET`, `*_PASSWORD`, `*_TOKEN`, `DATABASE_URL`, etc.) handle most projects without custom configuration.

See [docs/configuration.md](docs/configuration.md) for the full reference.

## Commands

```bash
# Check if a path is allowed
fermata check --op read /path/to/.env    # exit 1 = blocked
fermata check --op write src/main.rs     # exit 0 = allowed

# Run as a hook (reads harness JSON from stdin)
fermata hook --harness claude
fermata hook --harness claude --event post-tool-use
```

See [docs/commands.md](docs/commands.md) for the full CLI reference.

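
The relationship between the per-operation rules in `botignore.toml` and the exit codes of `fermata check` can be sketched roughly as follows. This is an illustration under stated assumptions, not fermata's code: `Op`, `Policy`, `glob_match`, and `exit_code` are hypothetical names, the `[bash]` namespace is left out, and the matcher treats `*` as "any run of characters, including `/`", which is looser than real gitignore semantics.

```rust
/// Operation namespaces, mirroring the [read] and [write] tables.
#[derive(Clone, Copy)]
enum Op {
    Read,
    Write,
}

/// Simplified wildcard match: `*` matches any run of characters.
/// Assumption: no `?`, character classes, negation, or anchoring rules.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    let mut star: Option<usize> = None; // position of the last `*` seen
    let mut mark = 0; // text position that `*` currently extends to
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Hypothetical in-memory form of the per-operation pattern lists.
struct Policy {
    read: Vec<String>,
    write: Vec<String>,
}

impl Policy {
    /// A path is blocked if any pattern matches the whole path or its basename.
    fn is_blocked(&self, op: Op, path: &str) -> bool {
        let patterns = match op {
            Op::Read => &self.read,
            Op::Write => &self.write,
        };
        let basename = path.rsplit('/').next().unwrap_or(path);
        patterns
            .iter()
            .any(|p| glob_match(p, path) || glob_match(p, basename))
    }

    /// Same convention as the CLI examples: 1 = blocked, 0 = allowed.
    fn exit_code(&self, op: Op, path: &str) -> i32 {
        if self.is_blocked(op, path) { 1 } else { 0 }
    }
}
```

With the `[read]` and `[write]` patterns from the `botignore.toml` example loaded into a `Policy`, a read of `/path/to/.env` maps to exit code 1 and a write to `src/main.rs` maps to exit code 0. Keeping the namespaces separate is what lets `Cargo.lock` stay readable while writes to it are blocked.
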
## Library API

fermata is also a Rust library:

```rust
use dirigent_fermata::core::secrets::{Manifest, Redactor, Scanner, SecretsConfig};

// Load .botsecrets and build the redaction manifest
let config = SecretsConfig::load("/path/to/project")?;
let manifest = Manifest::discover(&config)?;

// Known-value redaction (Aho-Corasick, sub-millisecond)
let redactor = Redactor::from_manifest(&manifest);
let clean = redactor.redact("DB_PASSWORD=hunter2");
// -> "DB_PASSWORD=*****"

// Heuristic scanning (regex patterns)
let scanner = Scanner::new(&config);
let findings = scanner.scan("Found key: AKIA1234567890ABCDEF");
// -> [Finding { pattern: "AWS Access Key", confidence: High, .. }]
```

## Security Model

fermata addresses a novel security concern: **reveal** -- whether secret *values* enter the LLM context.

Traditional file-level access control operates on file identity (which file). Secret redaction operates on data content (which values). The reveal problem can only be solved at the data-content level.

Read [docs/security-model.md](docs/security-model.md) for the full analysis, including the Reveal Triangle and defense-in-depth architecture.

## Threat Model

fermata is a heuristic guard, not a sandbox. It defends against statistical agent behavior and prompt-driven mistakes -- not a deliberate adversary. This is a strength: the threat model is well-defined, and the boundaries are documented honestly.

Read [docs/threat-model.md](docs/threat-model.md) for what fermata catches, what it doesn't, and what to combine it with.

## Harness Support

| Harness | Status | Mechanism |
|---------|--------|-----------|
| Claude Code | Shipped | PreToolUse + PostToolUse hooks |
| Codex CLI | Planned | Pre-exec hook adapter |
| Gemini CLI | Planned | MCP server mode |
| Any MCP agent | Planned | MCP proxy wrapping existing servers |

The policy engine and redaction logic are identical across all modes. Only the I/O adapter changes.

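
One way to picture the "shared engine, swappable adapter" split is a trait that each harness implements, with the policy decision living outside it. The sketch below is purely illustrative: `ToolCall`, `Decision`, `HarnessAdapter`, `evaluate`, `handle`, and the tab-separated wire format are all invented for this example and do not describe fermata's types or any real harness protocol.

```rust
/// Harness-neutral description of a tool call (hypothetical type).
struct ToolCall {
    tool: String,
    path: String,
}

/// Outcome of the shared policy check (hypothetical type).
enum Decision {
    Allow,
    Deny(String),
}

/// Shared policy engine: identical no matter which harness is in use.
/// Assumption: an exact-match block list stands in for real pattern rules.
fn evaluate(call: &ToolCall, blocked_paths: &[&str]) -> Decision {
    if blocked_paths.iter().any(|p| *p == call.path.as_str()) {
        Decision::Deny(format!("{} on {} is blocked by policy", call.tool, call.path))
    } else {
        Decision::Allow
    }
}

/// The only part that would differ per harness: parsing input, rendering output.
trait HarnessAdapter {
    fn parse(&self, raw: &str) -> Option<ToolCall>;
    fn render(&self, decision: &Decision) -> String;
}

/// Toy adapter for an invented "tool<TAB>path" format (not a real harness).
struct TabSeparatedAdapter;

impl HarnessAdapter for TabSeparatedAdapter {
    fn parse(&self, raw: &str) -> Option<ToolCall> {
        let (tool, path) = raw.trim().split_once('\t')?;
        Some(ToolCall {
            tool: tool.to_string(),
            path: path.to_string(),
        })
    }

    fn render(&self, decision: &Decision) -> String {
        match decision {
            Decision::Allow => "allow".to_string(),
            Decision::Deny(reason) => format!("deny: {}", reason),
        }
    }
}

/// Glue: parse with the adapter, decide with the shared engine, render back.
/// Unparseable input is denied rather than allowed (fail closed).
fn handle(adapter: &dyn HarnessAdapter, raw: &str, blocked: &[&str]) -> String {
    match adapter.parse(raw) {
        Some(call) => adapter.render(&evaluate(&call, blocked)),
        None => adapter.render(&Decision::Deny("unparseable request".to_string())),
    }
}
```

Under this shape, adding a new harness means writing one more `HarnessAdapter` implementation; `evaluate` and the block list are untouched.
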
## Status

v0.2 -- policy gate and secret filtering engine are production-ready.

All core components are implemented and tested:

- `.botignore` walker with gitignore semantics
- `botignore.toml` with read/write/bash namespaces
- Claude Code PreToolUse and PostToolUse adapters
- `.botsecrets` config, manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
- Aho-Corasick known-value redactor
- Heuristic scanner with gitleaks-derived patterns

## The `.botsecrets` Vision

`.botsecrets` is designed to be the **`.gitignore` of AI agent security**: a simple, declarative, human-readable file that every project can drop in to protect its secrets from AI agents.

The format is harness-agnostic from day one. It declares *what* to protect, not *how*. The same `.botsecrets` works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.

## License

Licensed under either of [Apache License, Version 2.0](LICENSE-APACHE) or [MIT License](LICENSE-MIT) at your option.