🏗️ fermata: redaction-first security model, unified .botsecrets config

Realign fermata around redaction (PostToolUse) as the primary security layer, with access control (PreToolUse) as supplementary write/bash protection. Remove botignore.toml; policy rules now live in the .botsecrets [policy] section. Add fermata.toml as an alias for .botsecrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 01:10:07 +02:00
parent 77520819f6
commit 168aefd415
17 changed files with 571 additions and 423 deletions
@@ -2,18 +2,20 @@

 **A fast, harness-agnostic security layer for AI coding agents.**

-AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read `.env`, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls. No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.
+AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read `.env`, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls. The solution is not blocking the read -- the agent needs to see config structure and key names to reason about your project. The solution is **redacting secret values from the output before they reach the model**. No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.

 ## Why

-Traditional security blocks the file and hopes the agent doesn't find the data through another path. This is insufficient -- secrets appear in shell output, log files, error messages, and indirect reads that bypass any access-control list.
+Blocking reads is the wrong approach. The agent needs to see file structure. It needs to know which keys exist in `.env`, what your database config looks like, how your secrets are organized. What it does *not* need to see is the actual secret values. An agent can have full read access to `.env` without secret values being revealed -- if the output is redacted before it reaches the model.

 fermata operates on two independent levels:

- **Policy gate** (PreToolUse) -- `.botignore` blocks reads, writes, and dangerous commands before they execute. Catches ~90% of accidental secret access.
- **Secret filtering** (PostToolUse) -- `.botsecrets` redacts secret *values* from tool output before they enter the LLM context. Catches the remaining cases regardless of how secrets appear.
+- **Secret filtering** (PostToolUse) -- `.botsecrets` declares where secrets live; fermata parses them, builds an Aho-Corasick automaton, and redacts secret *values* from tool output before they enter the LLM context. This is the primary defense. It catches secrets regardless of how they appear -- direct reads, shell output, log files, error messages.
+- **Policy gate** (PreToolUse) -- `.botsecrets [policy]` / `.botignore` blocks dangerous writes and destructive commands before they execute. Supplementary protection for write safety and anti-jailbreak.

-The key insight: blocking a file is necessary but not sufficient. The agent can have read access to `.env` without secret values being revealed -- if the output is redacted before it reaches the model.
+The key insight: file-level access control operates on file identity (*which file*). Secret redaction operates on data content (*which values*). The reveal problem can only be solved at the data-content level.
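
To make the data-content idea concrete, here is a minimal sketch of known-value redaction. It is not fermata's implementation: the README says fermata builds an Aho-Corasick automaton, while this sketch uses a naive position-by-position scan with only the standard library so that it stays self-contained. The function name `redact_known_values` and the `mask` parameter are hypothetical.

```rust
/// Replace every occurrence of any known secret value in `output` with `mask`.
/// Naive multi-pattern scan: a stand-in for an Aho-Corasick automaton,
/// which finds the same matches in a single pass over the text.
fn redact_known_values(output: &str, secrets: &[&str], mask: &str) -> String {
    // Skip empty values, and try longer secrets first so that a secret which
    // starts with another, shorter secret is masked as a whole.
    let mut ordered: Vec<&str> = secrets
        .iter()
        .copied()
        .filter(|s| !s.is_empty())
        .collect();
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut result = String::with_capacity(output.len());
    let mut rest = output;
    while !rest.is_empty() {
        // Does any secret start at the current position?
        if let Some(hit) = ordered.iter().find(|s| rest.starts_with(**s)) {
            result.push_str(mask);
            rest = &rest[hit.len()..];
        } else {
            // No match here: copy one character and advance past it.
            let ch = rest.chars().next().unwrap();
            result.push(ch);
            rest = &rest[ch.len_utf8()..];
        }
    }
    result
}
```

The function never looks at where the text came from. The same call covers a direct file read, shell output, or an error message, which is why redaction works where file-level blocking cannot.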
+
+> **Note:** fermata also accepts `fermata.toml` as an alias for `.botsecrets` (same format, `.botsecrets` takes priority when both exist).

 ## Quick Start

@@ -26,16 +28,21 @@ cargo install --path . --features cli
 ### Protect a project in 30 seconds

 ```bash
-# Block direct access to secret files
-echo ".env" > .botignore
-
-# Declare where secrets live -- fermata parses them and redacts values
+# Declare where secrets live -- fermata parses them and redacts values from agent output
 cat > .botsecrets << 'EOF'
 [files]
 patterns = [".env", ".env.*", "secrets.*"]
+
+[policy.write]
+patterns = [".claude/**", "vendor/**", "*.lock"]
+
+[policy.bash]
+deny = ["rm -rf /", "curl * | sh"]
 EOF
 ```

+One file. The agent can read `.env` freely -- fermata redacts the secret values from the output before they reach the model. Write protection and bash safety rules live in the same `.botsecrets` under `[policy]`.
+
 ### Wire into Claude Code

 Add both hooks in `.claude/settings.json`:
@@ -63,7 +70,7 @@ Add both hooks in `.claude/settings.json`:
 }
 ```

-That's it. PreToolUse blocks forbidden operations. PostToolUse redacts secret values from tool output before they reach the LLM.
+That's it. PostToolUse redacts secret values from tool output before they reach the LLM. PreToolUse blocks forbidden writes and dangerous commands.

 ## How It Works

@@ -72,13 +79,15 @@ fermata interposes on every tool call in the agent's lifecycle:
 ```
 Agent wants to run a tool
        |
-   PreToolUse ── fermata checks .botignore / botignore.toml
-        |            blocked? → deny with reason
-        |            allowed? ↓
+   PreToolUse ── .botsecrets [policy] / .botignore
+        |            write blocked? → deny
+        |            bash denied? → deny
+        |            otherwise → allow (including reads of .env!)
+        |
   Tool executes
        |
-   PostToolUse ── fermata scans output for secret values
-        |            found? → replace with ***** before LLM sees it
+   PostToolUse ── .botsecrets [files] + [keys] + [heuristic]
+        |            secret values found? → redact before LLM sees it
        |
   Clean output enters LLM context
 ```
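
The PreToolUse branch of the diagram can be sketched as a small decision function. This is an illustration under stated assumptions, not fermata's code: the types `ToolCall`, `Policy`, and `Decision`, and the functions `pre_tool_use` and `wildcard_match`, are hypothetical names, and the matcher treats `*` as "any run of characters", which is simpler than real gitignore semantics.

```rust
/// Outcome of the PreToolUse check.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny(String),
}

/// The tool call the agent wants to make (hypothetical shape).
enum ToolCall<'a> {
    Read { path: &'a str },
    Write { path: &'a str },
    Bash { command: &'a str },
}

/// Rules corresponding to `[policy.write]` and `[policy.bash]`.
struct Policy<'a> {
    write_patterns: Vec<&'a str>,
    bash_deny: Vec<&'a str>,
}

/// Match `text` against `pattern`, where `*` matches any run of characters
/// (including `/`). Simplified stand-in for gitignore-style matching.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    let mut star: Option<usize> = None; // index of the last `*` seen
    let mut mark = 0; // text position where that `*` began matching
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Backtrack: let the last `*` absorb one more character.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Trailing `*`s may match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Deny protected writes and denied commands; allow everything else,
/// including reads of secret files (redaction happens after the tool runs).
fn pre_tool_use(policy: &Policy, call: &ToolCall) -> Decision {
    match call {
        ToolCall::Write { path } => {
            match policy.write_patterns.iter().find(|p| wildcard_match(p, path)) {
                Some(p) => Decision::Deny(format!("write blocked by pattern {}", p)),
                None => Decision::Allow,
            }
        }
        ToolCall::Bash { command } => {
            match policy.bash_deny.iter().find(|p| wildcard_match(p, command)) {
                Some(p) => Decision::Deny(format!("command denied by rule {}", p)),
                None => Decision::Allow,
            }
        }
        ToolCall::Read { .. } => Decision::Allow,
    }
}
```

Note that the `Read` arm allows the call unconditionally. In this model the gate only guards writes and shell commands, and leaves secret values to the PostToolUse step.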
@@ -87,44 +96,17 @@ Three layers of defense, each independent:

 | Layer | Mechanism | What it catches |
 |-------|-----------|-----------------|
-| **Access control** | `.botignore` rules block tool calls by path | Direct reads/writes to sensitive files |
 | **Known-value redaction** | `.botsecrets` declares secret files; fermata parses them and builds an Aho-Corasick automaton | Every occurrence of a declared secret value, in any tool output, regardless of source |
 | **Heuristic detection** | Regex patterns from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs) | Secrets not covered by the manifest -- runtime-generated, unexpected locations |
+| **Access control** | `.botsecrets [policy]` / `.botignore` rules block writes and dangerous commands | Destructive writes, anti-jailbreak (agent modifying its own hooks), dangerous shell commands |

 Performance: ~1-5ms per tool call. Cold start (loading config + parsing secret files) is ~10-20ms.

 ## Configuration

-Three files, each optional, each solving a different problem:
+### `.botsecrets` -- the primary (and usually only) config

-### `.botignore` -- the 80% case
-
-Gitignore syntax. Blocks both reads and writes. Onboarding is one line.
-
-```gitignore
-.env
-.env.*
-secrets/**
-```
-
-### `botignore.toml` -- per-operation rules
-
-Separate namespaces so the same file can be readable but not writable:
-
-```toml
-[read]
-patterns = [".env*", "secrets/**"]
-
-[write]
-patterns = ["vendor/**", "*.lock"]
-
-[bash]
-deny = ["rm -rf /", "curl * | sh"]
-```
-
-### `.botsecrets` -- secret value redaction
-
-Declares which files contain secrets. fermata parses them, extracts values, and redacts every occurrence in tool output.
+`.botsecrets` is the unified configuration file. It declares both what to redact and what to restrict:

 ```toml
 [files]
@@ -135,10 +117,28 @@ include = ["STRIPE_*", "MY_APP_SIGNING_*"]

 [heuristic]
 enabled = true
+
+# Access control: write protection and bash safety.
+# Reading secret-containing files is allowed -- the redaction layer (PostToolUse) removes the values.
+
+[policy.write]
+patterns = [".claude/**", "vendor/**", "*.lock"]
+
+[policy.bash]
+deny = ["rm -rf /", "curl * | sh"]
 ```

 Built-in key patterns (`*_KEY`, `*_SECRET`, `*_PASSWORD`, `*_TOKEN`, `DATABASE_URL`, etc.) handle most projects without custom configuration.
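
As a rough illustration of how key patterns could select which values to redact, the sketch below scans `.env`-style text and collects the values whose keys match a pattern. It rests on several assumptions: `key_matches` and `extract_secret_values` are hypothetical names, only a single leading or trailing `*` is supported, and only the simple `KEY=value` format is handled (the README lists TOML, YAML, and JSON parsers as well).

```rust
/// Match a key against a pattern with at most one leading or trailing `*`,
/// e.g. `*_KEY`, `STRIPE_*`, or the literal `DATABASE_URL`.
fn key_matches(pattern: &str, key: &str) -> bool {
    match pattern.strip_prefix('*') {
        Some(suffix) => key.ends_with(suffix),
        None => match pattern.strip_suffix('*') {
            Some(prefix) => key.starts_with(prefix),
            None => pattern == key,
        },
    }
}

/// Collect the values of all `KEY=value` lines whose key matches a pattern.
/// Blank lines, `#` comments, and lines without `=` are skipped.
fn extract_secret_values(dotenv: &str, patterns: &[&str]) -> Vec<String> {
    let mut values = Vec::new();
    for line in dotenv.lines() {
        let line = line.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            let key = key.trim();
            // Strip surrounding double quotes, if any.
            let value = value.trim().trim_matches('"');
            if !value.is_empty() && patterns.iter().any(|p| key_matches(p, key)) {
                values.push(value.to_string());
            }
        }
    }
    values
}
```

The resulting list is what a redactor would search for in tool output. Keys such as `DEBUG` match no pattern, so their values stay visible to the agent.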

+### `.botignore` -- optional simple layer
+
+Gitignore syntax. For projects that want a minimal, familiar format for write protection. Complements `.botsecrets` but is not required.
+
+```gitignore
+vendor/**
+*.lock
+```
+
 See [docs/configuration.md](docs/configuration.md) for the full reference.

 ## Commands
@@ -202,20 +202,20 @@ The policy engine and redaction logic are identical across all modes. Only the I

 ## Status

-v0.2 -- policy gate and secret filtering engine are production-ready. All core components are implemented and tested:
+v0.2 -- secret filtering engine and policy gate are production-ready. All core components are implemented and tested:

- `.botignore` walker with gitignore semantics
- `botignore.toml` with read/write/bash namespaces
- Claude Code PreToolUse and PostToolUse adapters
- `.botsecrets` config, manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
+- `.botsecrets` config with `[files]`, `[keys]`, `[heuristic]`, and `[policy]` sections
 - Aho-Corasick known-value redactor
 - Heuristic scanner with gitleaks-derived patterns
+- Manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
+- Claude Code PreToolUse and PostToolUse adapters
+- `.botignore` walker with gitignore semantics

 ## The `.botsecrets` Vision

 `.botsecrets` is designed to be the **`.gitignore` of AI agent security**: a simple, declarative, human-readable file that every project can drop in to protect its secrets from AI agents.

-The format is harness-agnostic from day one. It declares *what* to protect, not *how*. The same `.botsecrets` works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.
+The format is harness-agnostic from day one. It declares *what* to protect, not *how*. One file covers both redaction (`[files]`, `[keys]`, `[heuristic]`) and access control (`[policy]`). The same `.botsecrets` works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.

 ## License