🏗️ fermata: redaction-first security model, unified .botsecrets config

Realign fermata around redaction (PostToolUse) as the primary security layer, with access control (PreToolUse) as supplementary write/bash protection. Remove botignore.toml; policy rules now live in the .botsecrets [policy] section. Add fermata.toml as an alias for .botsecrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 01:10:07 +02:00
parent 77520819f6
commit 168aefd415
17 changed files with 571 additions and 423 deletions
@@ -16,19 +16,34 @@ This shifts the design centre from "what can a smart attacker hide?" to "what do

 Think of fermata as a guardrail on a mountain road, not an armoured wall. It catches the statistical case -- the distracted driver drifting toward the edge -- not the determined off-roader who drives around it.

+Fermata's primary defence is not path blocking but **value redaction**. Even when a path-level check misses a read (for example via L4 shell indirection or L5 derived names), known secret values are scrubbed from tool output before the LLM sees them. The path-level checks (L0--L3) provide supplementary protection -- they prevent unnecessary file access and dangerous commands, but they are not the last line of defence against secret leakage.
+
 ---

 ## What fermata catches

-Fermata's detection is organised by how the agent names the path or secret it is about to access. Lower levels are easier to detect statically; higher levels require progressively more runtime information.
+Fermata's detection operates on two independent tracks: **value-level redaction** (primary) and **path-level access control** (supplementary).
+
+### Secret redaction (PostToolUse) -- primary defence
+
+Independent of path-level detection, fermata filters tool output before it enters the LLM context:
+
+- **Known-value redaction** -- `.botsecrets` declares which files contain secrets. Fermata parses those files, extracts the secret values, and replaces them in all tool output using an Aho-Corasick automaton. Sub-millisecond performance. Zero false negatives for declared secrets.
+- **Heuristic scanning** -- regex patterns (derived from gitleaks) detect undeclared secrets in tool output: AWS access keys, JWTs, GitHub PATs, database connection strings, and similar high-entropy tokens. This is a safety net for secrets not covered by the manifest.
+
+This is the layer that makes fermata's security model resilient. Even if every path-level check fails -- the agent reaches `.env` through a route fermata didn't parse -- the secret values themselves are scrubbed from the output before the LLM sees them.
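
As an illustration of known-value redaction, here is a minimal Python sketch of multi-pattern replacement with an Aho-Corasick automaton. It is not fermata's implementation: the function names `build_automaton` and `redact`, the `[REDACTED]` placeholder, and the sample secrets are all assumptions made for this example.

```python
from collections import deque


def build_automaton(secrets):
    """Build Aho-Corasick tables (hypothetical helper, not fermata's API)."""
    goto = [{}]   # goto[state][char] -> next state (the trie)
    fail = [0]    # failure link for each state
    out = [0]     # length of the longest secret ending at each state
    for secret in secrets:
        if not secret:
            continue  # an empty secret would match everywhere
        state = 0
        for ch in secret:
            nxt = goto[state].get(ch)
            if nxt is None:
                nxt = len(goto)
                goto.append({})
                fail.append(0)
                out.append(0)
                goto[state][ch] = nxt
            state = nxt
        out[state] = max(out[state], len(secret))
    # Breadth-first pass: compute failure links and inherit suffix matches.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = max(out[nxt], out[fail[nxt]])
    return goto, fail, out


def redact(text, automaton, placeholder="[REDACTED]"):
    """Replace every occurrence of any declared secret in a single pass."""
    goto, fail, out = automaton
    spans = []  # merged (start, end) ranges to replace, in text order
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        if out[state]:
            start, end = i - out[state] + 1, i + 1
            # Merge with earlier spans that overlap this match.
            while spans and spans[-1][1] > start:
                start = min(start, spans.pop()[0])
            spans.append((start, end))
    pieces, cursor = [], 0
    for start, end in spans:
        pieces.append(text[cursor:start])
        pieces.append(placeholder)
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


# Sample values standing in for secrets parsed from declared files.
automaton = build_automaton(["hunter2", "sk-test-123"])
print(redact("DB_PASSWORD=hunter2\nAPI_KEY=sk-test-123\n", automaton))
```

The scan cost grows with the length of the output rather than with the number of declared secrets, which is why this family of algorithms suits a filter that runs on every tool call. Note that a sketch like this only matches secrets that appear literally in the text.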
+
+### Path-level access control (PreToolUse) -- supplementary
+
+The path-level checks below are organised by how the agent names the path it is about to access. Lower levels are easier to detect statically; higher levels require progressively more runtime information. These checks prevent unnecessary file access and reduce noise, but they are not the last line of defence.

 ### L0: Direct path arguments -- fully covered

-The agent calls a path-typed tool (Read, Write, Edit) with an explicit file path. This is fermata's home turf.
+The agent calls a path-typed tool (Read, Write, Edit) with an explicit file path.

 > `Read({"file_path": "/home/user/.env"})` -- `.botignore` matches -- deny.

-Both `.botignore` (gitignore semantics, deepest directory wins, negation patterns work as in git) and `botignore.toml` `[read]`/`[write]` glob lists are evaluated against the resolved path. This covers the overwhelming majority of accidental secret-file reads, because most agents use direct path tools as their primary file access method.
+Both `.botignore` (gitignore semantics, deepest directory wins, negation patterns work as in git) and `.botsecrets [policy]` rules are evaluated against the resolved path. This covers the overwhelming majority of accidental secret-file reads, because most agents use direct path tools as their primary file access method.

 ### L1: Absolute paths inside Bash commands

@@ -56,15 +71,6 @@ Fermata handles this by checking whether the glob pattern in the command **overl

 The limitation: fermata can tell whether a glob *could* hit a protected file, but not whether it *actually will* in the current directory state. For the heuristic threat model, "could hit" is the right bar.
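
The "could hit" test can be sketched in a few lines of Python. This is a simplified illustration, not fermata's algorithm: it assumes the protected entries are concrete file names rather than patterns, and the function name `glob_could_hit` is hypothetical.

```python
from fnmatch import fnmatchcase


def glob_could_hit(command_glob, protected_names):
    """Return the protected names that the glob is able to match.

    Nothing on disk is consulted: the question is whether the pattern
    could match a protected name, not whether such a file exists now.
    fnmatch lets `*` match a leading dot, which is stricter than most
    shells and therefore errs on the side of denying.
    """
    return [name for name in protected_names if fnmatchcase(name, command_glob)]


protected = [".env", "prod.env", "id_rsa"]
print(glob_could_hit("*.env", protected))     # names the glob may reach
print(glob_could_hit("src/*.py", protected))  # no overlap with protected names
```

Because the check never looks at the directory, it gives the same answer whether or not the protected file currently exists, which matches the "could hit" bar described above.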

-### Secret redaction (PostToolUse)
-
-Independent of path-level detection, fermata filters tool output before it enters the LLM context:
-
- **Known-value redaction** -- `.botsecrets` declares which files contain secrets. Fermata parses those files, extracts the secret values, and replaces them in all tool output using an Aho-Corasick automaton. Sub-millisecond performance. Zero false negatives for declared secrets.
- **Heuristic scanning** -- regex patterns (derived from gitleaks) detect undeclared secrets in tool output: AWS access keys, JWTs, GitHub PATs, database connection strings, and similar high-entropy tokens. This is a safety net for secrets not covered by the manifest.
-
-Together these mean that even if a path-level check misses a read (the agent reaches `.env` through a route fermata didn't parse), the secret values themselves are scrubbed from the output before the LLM sees them.
-
 ---

 ## What fermata does NOT catch
@@ -105,6 +111,7 @@ This is not a failure -- it is a design boundary. Trying to solve L6 within a st

 | Level | Description | Detection | Confidence |
 |-------|-------------|-----------|------------|
+| Secret redaction | Known-value + heuristic scanning | Aho-Corasick + regex | High (zero false negatives for declared secrets) |
 | L0 | Direct path argument | Full policy check | High |
 | L1 | Absolute path in Bash | Path extraction + policy | High |
 | L2 | Bare filename in CWD | CWD resolution + policy | Medium |
@@ -113,7 +120,7 @@ This is not a failure -- it is a design boundary. Trying to solve L6 within a st
 | L5 | Derived names | Collapses to L0 at call time | Depends on final call |
 | L6 | Protocol exfiltration | Out of scope | N/A |

-Secret redaction (PostToolUse) operates as a separate, independent layer. Even when path-level detection at L1-L4 misses a read, known-value redaction catches the secret values in the output.
+Secret redaction is the primary defence layer. Path-level detection (L0--L3) provides supplementary access control; even when it misses, redaction catches the secret values in the output.

 ---

@@ -123,10 +130,10 @@ Fermata is one layer in a defence-in-depth stack. Here is what it is good at and

 ### Fermata alone gives you

- **Accidental secret exposure prevention.** The most common case: agent reads `.env`, `secrets.toml`, or a credential file because it matched a directory listing. Fermata blocks this at L0 with zero configuration beyond `.botignore`.
- **Known-secret scrubbing.** Even if a secret leaks through an indirect read, declared secrets are redacted from LLM context. The agent never "learns" the value.
- **Heuristic secret detection.** Undeclared secrets matching common formats (AWS keys, tokens, connection strings) are flagged in tool output.
+- **Known-secret scrubbing.** Declared secrets (via `.botsecrets`) are redacted from all tool output, always. The agent never "learns" the value, regardless of how it reached the file.
+- **Heuristic secret detection.** Undeclared secrets matching common formats (AWS keys, tokens, connection strings) are flagged in tool output. Built-in patterns handle most projects without any configuration.
 - **Command guardrails.** Dangerous shell patterns (`rm -rf /`, `curl | sh`) are caught by configurable deny lists.
+- **Write protection and access control.** Path-level checks (`.botignore`, `.botsecrets [policy]`) prevent unnecessary file access and unauthorised modifications at L0--L3.
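
To make the heuristic detection item above concrete, here is a minimal Python sketch of a regex-based scan over tool output. The patterns are deliberately simplified approximations of well-known token formats, not the gitleaks-derived rules fermata ships, and the names `PATTERNS` and `scan` are assumptions made for this example.

```python
import re

# Simplified, illustrative patterns (assumptions, not fermata's rule set).
PATTERNS = {
    "aws-access-key-id": re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b"),
    "github-pat": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "jwt": re.compile(
        r"\beyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_-]{10,}"
    ),
    "db-connection-string": re.compile(
        r"\b(?:postgres(?:ql)?|mysql|mongodb)://[^\s:/@]+:[^\s@]+@\S+"
    ),
}


def scan(text):
    """Return (rule_name, start, end) for each suspected secret, in text order."""
    findings = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((name, match.start(), match.end()))
    return sorted(findings, key=lambda finding: finding[1])


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
output = "aws_key=AKIAIOSFODNN7EXAMPLE\nurl=postgres://app:s3cret@db.internal:5432/app\n"
for rule, start, end in scan(output):
    print(rule, start, end)
```

A scan like this trades precision for coverage: it can flag strings that only look like secrets, and it misses formats it has no pattern for, which is why it works best as a safety net behind declared-value redaction.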

 ### Combine fermata with