🏗️ fermata: redaction-first security model, unified .botsecrets config

Realign fermata around redaction (PostToolUse) as the primary security
layer, with access control (PreToolUse) as supplementary write/bash
protection. Remove botignore.toml — policy rules now live in .botsecrets
[policy] section. Add fermata.toml as an alias for .botsecrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 01:10:07 +02:00
parent 77520819f6
commit 168aefd415
17 changed files with 571 additions and 423 deletions
+51 -51
@@ -2,18 +2,20 @@
**A fast, harness-agnostic security layer for AI coding agents.**
AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read `.env`, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls. No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.
AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read `.env`, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls. The solution is not blocking the read -- the agent needs to see config structure and key names to reason about your project. The solution is **redacting secret values from the output before they reach the model**. No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.
## Why
Traditional security blocks the file and hopes the agent doesn't find the data through another path. This is insufficient -- secrets appear in shell output, log files, error messages, and indirect reads that bypass any access-control list.
Blocking reads is the wrong approach. The agent needs to see file structure. It needs to know which keys exist in `.env`, what your database config looks like, how your secrets are organized. What it does *not* need to see is the actual secret values. An agent can have full read access to `.env` without secret values being revealed -- if the output is redacted before it reaches the model.
fermata operates on two independent levels:
- **Policy gate** (PreToolUse) -- `.botignore` blocks reads, writes, and dangerous commands before they execute. Catches ~90% of accidental secret access.
- **Secret filtering** (PostToolUse) -- `.botsecrets` redacts secret *values* from tool output before they enter the LLM context. Catches the remaining cases regardless of how secrets appear.
- **Secret filtering** (PostToolUse) -- `.botsecrets` declares where secrets live; fermata parses them, builds an Aho-Corasick automaton, and redacts secret *values* from tool output before they enter the LLM context. This is the primary defense. It catches secrets regardless of how they appear -- direct reads, shell output, log files, error messages.
- **Policy gate** (PreToolUse) -- `.botsecrets [policy]` (or `.botignore`) blocks dangerous writes and destructive commands before they execute. Supplementary protection for write safety and anti-jailbreak (stopping the agent from modifying its own hooks).
The key insight: blocking a file is necessary but not sufficient. The agent can have read access to `.env` without secret values being revealed -- if the output is redacted before it reaches the model.
The key insight: file-level access control operates on file identity (*which file*). Secret redaction operates on data content (*which values*). The reveal problem can only be solved at the data-content level.
> **Note:** fermata also accepts `fermata.toml` as an alias for `.botsecrets` (same format, `.botsecrets` takes priority when both exist).
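The priority rule in the note can be sketched as a small lookup. This is an illustrative sketch only, not fermata's actual loader: `resolve_config`, `CONFIG_CANDIDATES`, and the injected `exists` check are hypothetical names introduced here.

```rust
use std::path::{Path, PathBuf};

/// Candidate config names in priority order (assumption: only these two
/// names are considered, and `.botsecrets` wins when both exist).
const CONFIG_CANDIDATES: [&str; 2] = [".botsecrets", "fermata.toml"];

/// Returns the first candidate under `root` that `exists` reports as present.
/// `exists` is injected so the rule can be exercised without touching disk;
/// a real caller could pass `|p| p.is_file()`.
fn resolve_config(root: &Path, exists: impl Fn(&Path) -> bool) -> Option<PathBuf> {
    CONFIG_CANDIDATES
        .iter()
        .map(|name| root.join(name))
        .find(|candidate| exists(candidate.as_path()))
}
```

With both files present the lookup returns the `.botsecrets` path; with only `fermata.toml` present it falls back to the alias.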
## Quick Start
@@ -26,16 +28,21 @@ cargo install --path . --features cli
### Protect a project in 30 seconds
```bash
# Block direct access to secret files
echo ".env" > .botignore
# Declare where secrets live -- fermata parses them and redacts values
# Declare where secrets live -- fermata parses them and redacts values from agent output
cat > .botsecrets << 'EOF'
[files]
patterns = [".env", ".env.*", "secrets.*"]
[policy.write]
patterns = [".claude/**", "vendor/**", "*.lock"]
[policy.bash]
deny = ["rm -rf /", "curl * | sh"]
EOF
```
One file. The agent can read `.env` freely -- fermata redacts the secret values from the output before they reach the model. Write protection and bash safety rules live in the same `.botsecrets` under `[policy]`.
### Wire into Claude Code
Add both hooks in `.claude/settings.json`:
@@ -63,7 +70,7 @@ Add both hooks in `.claude/settings.json`:
}
```
That's it. PreToolUse blocks forbidden operations. PostToolUse redacts secret values from tool output before they reach the LLM.
That's it. PostToolUse redacts secret values from tool output before they reach the LLM. PreToolUse blocks forbidden writes and dangerous commands.
## How It Works
@@ -72,13 +79,15 @@ fermata interposes on every tool call in the agent's lifecycle:
```
Agent wants to run a tool
|
PreToolUse ── fermata checks .botignore / botignore.toml
| blocked? → deny with reason
| allowed? ↓
PreToolUse ── .botsecrets [policy] / .botignore
| write blocked? → deny
| bash denied? → deny
| otherwise → allow (including reads of .env!)
|
Tool executes
|
PostToolUse ── fermata scans output for secret values
| found? → replace with ***** before LLM sees it
PostToolUse ── .botsecrets [files] + [keys] + [heuristic]
| secret values found? → redact before LLM sees it
|
Clean output enters LLM context
```
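The lifecycle in the diagram can be sketched as two functions. This is a simplified model under stated assumptions, not fermata's implementation: `Decision`, `ToolCall`, `Policy`, `pre_tool_use`, and `handle_tool_call` are hypothetical names, write rules are exact paths rather than globs, and bash deny rules are plain substrings.

```rust
/// Outcome of the PreToolUse check.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny(String),
}

/// A tool call as the hook might see it (illustrative shape).
enum ToolCall<'a> {
    Read { path: &'a str },
    Write { path: &'a str },
    Bash { command: &'a str },
}

/// Hypothetical policy: exact write-protected paths and denied substrings.
struct Policy<'a> {
    write_protected: &'a [&'a str],
    bash_deny: &'a [&'a str],
}

fn pre_tool_use(policy: &Policy<'_>, call: &ToolCall<'_>) -> Decision {
    match call {
        // Reads always pass the gate; redaction happens after the tool runs.
        ToolCall::Read { .. } => Decision::Allow,
        ToolCall::Write { path } if policy.write_protected.contains(path) => {
            Decision::Deny(format!("write to {} is blocked", path))
        }
        ToolCall::Bash { command } if policy.bash_deny.iter().any(|d| command.contains(*d)) => {
            Decision::Deny("command matches a deny rule".to_string())
        }
        _ => Decision::Allow,
    }
}

/// Runs the gate, then the tool, then redaction on whatever the tool printed.
fn handle_tool_call(
    policy: &Policy<'_>,
    call: &ToolCall<'_>,
    execute: impl FnOnce() -> String,
    redact: impl Fn(&str) -> String,
) -> Result<String, String> {
    match pre_tool_use(policy, call) {
        Decision::Deny(reason) => Err(reason),
        Decision::Allow => Ok(redact(&execute())),
    }
}
```

The point of the split is that a read of `.env` never hits the deny branch; only the `redact` step stands between the tool output and the model.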
@@ -87,44 +96,17 @@ Three layers of defense, each independent:
| Layer | Mechanism | What it catches |
|-------|-----------|-----------------|
| **Access control** | `.botignore` rules block tool calls by path | Direct reads/writes to sensitive files |
| **Known-value redaction** | `.botsecrets` declares secret files; fermata parses them and builds an Aho-Corasick automaton | Every occurrence of a declared secret value, in any tool output, regardless of source |
| **Heuristic detection** | Regex patterns from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs) | Secrets not covered by the manifest -- runtime-generated, unexpected locations |
| **Access control** | `.botsecrets [policy]` / `.botignore` rules block writes and dangerous commands | Destructive writes, anti-jailbreak (agent modifying its own hooks), dangerous shell commands |
Performance: ~1-5ms per tool call. Cold start (loading config + parsing secret files) is ~10-20ms.
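To make the known-value redaction row concrete, here is a minimal sketch of the idea using only the standard library. It deliberately swaps the Aho-Corasick automaton for a naive scan (so it does not reflect the timings above), and `redact` and `MASK` are hypothetical names, not fermata's API.

```rust
/// Replaces every occurrence of any known secret value with a fixed mask.
/// Naive O(n * k) scan standing in for an Aho-Corasick automaton.
fn redact(output: &str, secrets: &[&str]) -> String {
    const MASK: &str = "*****";

    // Try longer secrets first so a short secret that is a prefix of a
    // longer one does not leave the tail of the longer value exposed.
    let mut ordered: Vec<&str> = secrets.iter().copied().filter(|s| !s.is_empty()).collect();
    ordered.sort_by_key(|s| std::cmp::Reverse(s.len()));

    let mut redacted = String::with_capacity(output.len());
    let mut rest = output;
    while !rest.is_empty() {
        if let Some(hit) = ordered.iter().find(|s| rest.starts_with(**s)) {
            redacted.push_str(MASK);
            rest = &rest[hit.len()..];
        } else {
            // Advance by one whole character to stay on a UTF-8 boundary.
            let ch = rest.chars().next().unwrap();
            redacted.push(ch);
            rest = &rest[ch.len_utf8()..];
        }
    }
    redacted
}
```

Because matching is on values rather than file paths, the same call covers a direct read of `.env`, shell output, or a log line that happens to echo the secret.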
## Configuration
Three files, each optional, each solving a different problem:
### `.botsecrets` -- the primary (and usually only) config
### `.botignore` -- the 80% case
Gitignore syntax. Blocks both reads and writes. Onboarding is one line.
```gitignore
.env
.env.*
secrets/**
```
### `botignore.toml` -- per-operation rules
Separate namespaces so the same file can be readable but not writable:
```toml
[read]
patterns = [".env*", "secrets/**"]
[write]
patterns = ["vendor/**", "*.lock"]
[bash]
deny = ["rm -rf /", "curl * | sh"]
```
### `.botsecrets` -- secret value redaction
Declares which files contain secrets. fermata parses them, extracts values, and redacts every occurrence in tool output.
`.botsecrets` is the unified configuration file. It declares both what to redact and what to restrict:
```toml
[files]
@@ -135,10 +117,28 @@ include = ["STRIPE_*", "MY_APP_SIGNING_*"]
[heuristic]
enabled = true
# Access control: write protection and bash safety.
# Reading secret-containing files is allowed -- PostToolUse redaction removes the values.
[policy.write]
patterns = [".claude/**", "vendor/**", "*.lock"]
[policy.bash]
deny = ["rm -rf /", "curl * | sh"]
```
Built-in key patterns (`*_KEY`, `*_SECRET`, `*_PASSWORD`, `*_TOKEN`, `DATABASE_URL`, etc.) handle most projects without custom configuration.
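As a rough illustration of how such key patterns could be applied, the sketch below matches a key name against patterns with at most one `*`. It is an assumption-laden sketch: `key_matches`, `is_secret_key`, and `BUILT_IN` are hypothetical names, the list holds only the patterns named above (the real built-in set is larger), and matching is assumed to be case-sensitive.

```rust
/// Patterns named in this README; the actual built-in list is longer.
const BUILT_IN: [&str; 5] = ["*_KEY", "*_SECRET", "*_PASSWORD", "*_TOKEN", "DATABASE_URL"];

/// Matches `key` against a pattern containing at most one `*` wildcard.
fn key_matches(pattern: &str, key: &str) -> bool {
    match pattern.split_once('*') {
        Some((prefix, suffix)) => {
            // The length check stops prefix and suffix from overlapping.
            key.len() >= prefix.len() + suffix.len()
                && key.starts_with(prefix)
                && key.ends_with(suffix)
        }
        None => pattern == key,
    }
}

/// True if `key` matches a built-in pattern or one of the `extra` patterns
/// (for example the `include` list under `[keys]`).
fn is_secret_key(key: &str, extra: &[&str]) -> bool {
    BUILT_IN.iter().any(|p| key_matches(p, key)) || extra.iter().any(|p| key_matches(p, key))
}
```

For instance, `STRIPE_API_KEY` matches the built-in `*_KEY`, while `STRIPE_WEBHOOK` only matches once `STRIPE_*` is supplied as an extra pattern.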
### `.botignore` -- optional simple layer
Gitignore syntax. For projects that want a minimal, familiar format for write protection. Complements `.botsecrets` but is not required.
```gitignore
vendor/**
*.lock
```
See [docs/configuration.md](docs/configuration.md) for the full reference.
## Commands
@@ -202,20 +202,20 @@ The policy engine and redaction logic are identical across all modes. Only the I
## Status
v0.2 -- policy gate and secret filtering engine are production-ready. All core components are implemented and tested:
v0.2 -- secret filtering engine and policy gate are production-ready. All core components are implemented and tested:
- `.botignore` walker with gitignore semantics
- `botignore.toml` with read/write/bash namespaces
- Claude Code PreToolUse and PostToolUse adapters
- `.botsecrets` config, manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
- `.botsecrets` config with `[files]`, `[keys]`, `[heuristic]`, and `[policy]` sections
- Aho-Corasick known-value redactor
- Heuristic scanner with gitleaks-derived patterns
- Manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
- Claude Code PreToolUse and PostToolUse adapters
- `.botignore` walker with gitignore semantics
## The `.botsecrets` Vision
`.botsecrets` is designed to be the **`.gitignore` of AI agent security**: a simple, declarative, human-readable file that every project can drop in to protect its secrets from AI agents.
The format is harness-agnostic from day one. It declares *what* to protect, not *how*. The same `.botsecrets` works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.
The format is harness-agnostic from day one. It declares *what* to protect, not *how*. One file covers both redaction (`[files]`, `[keys]`, `[heuristic]`) and access control (`[policy]`). The same `.botsecrets` works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.
## License