fermata/docs/security-model.md
g4borg 168aefd415 🏗️ fermata: redaction-first security model, unified .botsecrets config
Realign fermata around redaction (PostToolUse) as the primary security
layer, with access control (PreToolUse) as supplementary write/bash
protection. Remove botignore.toml — policy rules now live in .botsecrets
[policy] section. Add fermata.toml as an alias for .botsecrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 01:10:07 +02:00


fermata Security Model

The Problem

AI coding agents see secrets. Not because they try to -- because secrets are everywhere in a working codebase, and agents read files, run commands, and inspect output as part of their normal workflow.

When an agent reads .env, the secret values get tokenized into the LLM's context window. Once there, they can leak into commits, PR descriptions, log messages, or API calls the agent makes. The secret is irrecoverably revealed.

No AI coding agent ships built-in post-read secret filtering today. The entire industry relies on pre-read access controls -- block the file and hope the agent doesn't find the data through another path. This is insufficient.

The Reveal Triangle

Most security models think about secrets in terms of files: can the agent read this file? Can it write to it? fermata introduces a third dimension that traditional models miss entirely.

Read -- Can the agent open a file? Handled by policy rules and file access controls. A blunt instrument: blocking .env also blocks legitimate tooling that needs configuration values. In fermata's model, read access to secret-containing files is allowed by default -- the agent needs to see file structure and key names to reason about configuration.

Write -- Can the agent modify a file? Less concerning than it appears, because version control provides full recovery. Write restrictions matter primarily for anti-jailbreak (preventing the agent from modifying its own hooks or policy files).

Reveal -- Do secret values enter the LLM context? This is the novel concern. No traditional security model addresses it because the concept did not exist before AI agents. A file read that configures a database is not a security event. The same file read that feeds credentials to an LLM is.

These three dimensions are independent. An agent can have read access to .env without the secret values being revealed -- the output is redacted before it reaches the model. This is the default state, not a special configuration: the agent sees file structure, knows which keys exist, can reason about configuration, but never sees actual secret values.

The key insight is that Read and Write operate on file identity (which file), while Reveal operates on data content (which values). The reveal problem can therefore only be solved at the data-content level: file-level controls are necessary but not sufficient.
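The independence of Read and Reveal can be illustrated with a minimal sketch. It assumes a hypothetical helper `view_for_model` and a `[REDACTED]` marker string; neither name is taken from fermata itself. The file is read in full, but declared values are masked before the text is handed to the model.

```python
# Hypothetical illustration: the read succeeds, the reveal does not.
REDACTED = "[REDACTED]"  # marker text is an assumption, not fermata's format


def view_for_model(file_text: str, secret_values: set[str]) -> str:
    """Return the text the model would see: structure intact, values masked."""
    # Mask longer values first so a secret containing another secret
    # is replaced as a whole.
    for value in sorted(secret_values, key=len, reverse=True):
        if value:
            file_text = file_text.replace(value, REDACTED)
    return file_text


env_text = "DB_HOST=localhost\nDB_PASSWORD=hunter2\nDEBUG=true\n"
shown = view_for_model(env_text, {"hunter2"})
print(shown)
```

The key name `DB_PASSWORD` stays visible, so the agent can still reason about configuration, while the value itself never enters the context.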

Defense in Depth

fermata implements a layered security stack. Layers 1 and 2 are the primary defense -- they operate on data content (the actual secret values). Layers 3 and 4 are supplementary -- they operate on file and system identity.

Layer 1: Known-Value Redaction (.botsecrets + Aho-Corasick)

Parse secret-containing files at startup, extract actual secret values, build an Aho-Corasick automaton. Scan all tool output for those exact byte strings and replace them with redaction markers.

This is the primary defense layer. It catches declared secrets regardless of how they reach the output -- direct file reads, shell command output, log files, error messages. If the value hunter2 is a declared secret, every verbatim occurrence in every tool output is redacted before it reaches the model.

Guarantees: zero false negatives for verbatim occurrences of declared secrets, and sub-millisecond time per scan. The Aho-Corasick automaton finds all occurrences in a single linear pass over the output.
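To make the single-pass claim concrete, here is a compact pure-Python sketch of an Aho-Corasick matcher used for redaction. It is not fermata's implementation: the names `SecretMatcher` and `redact` and the `[REDACTED]` marker are assumptions, and it works on characters of a `str` rather than on raw bytes.

```python
from collections import deque

REDACTED = "[REDACTED]"  # marker text is an assumption


class SecretMatcher:
    """Minimal Aho-Corasick automaton over the characters of a str."""

    def __init__(self, secrets):
        self.goto = [{}]     # per-node transitions: char -> node id
        self.fail = [0]      # per-node failure link
        self.longest = [0]   # longest pattern ending at this node (0 = none)
        for secret in secrets:
            if secret:
                self._add(secret)
        self._link()

    def _add(self, pattern):
        node = 0
        for ch in pattern:
            nxt = self.goto[node].get(ch)
            if nxt is None:
                nxt = len(self.goto)
                self.goto.append({})
                self.fail.append(0)
                self.longest.append(0)
                self.goto[node][ch] = nxt
            node = nxt
        self.longest[node] = max(self.longest[node], len(pattern))

    def _link(self):
        # Breadth-first pass: depth-1 nodes keep the root as failure link.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                # Inherit any pattern that ends at the failure target.
                self.longest[child] = max(
                    self.longest[child], self.longest[self.fail[child]]
                )
                queue.append(child)

    def spans(self, text):
        """Yield (start, end) of the longest match ending at each position."""
        node = 0
        for i, ch in enumerate(text):
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            if self.longest[node]:
                yield (i + 1 - self.longest[node], i + 1)


def redact(text, matcher):
    """Replace every matched span with the marker, merging overlaps."""
    merged = []
    for start, end in sorted(matcher.spans(text)):
        if merged and start < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    out, cursor = [], 0
    for start, end in merged:
        out.append(text[cursor:start])
        out.append(REDACTED)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


matcher = SecretMatcher(["hunter2", "sk_live_abc"])
print(redact("pw=hunter2 key=sk_live_abc retry pw=hunter2", matcher))
```

The automaton is built once from all declared values, and each scan then walks the output a single time no matter how many secrets are declared. Overlapping matches are merged so that no fragment of a secret survives between two markers.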

Layer 2: Heuristic Detection (Scanner + gitleaks patterns)

Regex patterns for known secret formats: AWS access keys (AKIA...), GitHub PATs (ghp_...), JWTs (eyJ...), database URLs with embedded passwords, and dozens more derived from industry-standard pattern sets.

This is the safety net for secrets not covered by the manifest -- secrets in files that .botsecrets does not know about, secrets generated at runtime, secrets that appear in unexpected places.

Higher false-positive rate than Layer 1, but as a secondary safety net, that tradeoff is acceptable.
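A rough sketch of what such a heuristic pass might look like follows. The patterns are simplified approximations of the formats named above, not the rule set that fermata or gitleaks actually ships, and the names `RULES` and `scan_and_redact` are hypothetical.

```python
import re

# Simplified, illustrative rules: (name, pattern, replacement).
RULES = [
    ("aws-access-key-id",
     re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
     "[REDACTED:aws-access-key-id]"),
    ("github-pat",
     re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
     "[REDACTED:github-pat]"),
    ("jwt",
     re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+"),
     "[REDACTED:jwt]"),
    # Keep scheme, user, and host; mask only the password part of the URL.
    ("db-url-password",
     re.compile(r"(\b[a-z][a-z0-9+.-]*://[^\s:/@]+:)[^\s@/]+(@)"),
     r"\1[REDACTED:db-url-password]\2"),
]


def scan_and_redact(text):
    """Apply every rule in turn; return the redacted text and hit counts."""
    findings = []
    for name, pattern, replacement in RULES:
        text, count = pattern.subn(replacement, text)
        if count:
            findings.append((name, count))
    return text, findings


sample = (
    "key=AKIAIOSFODNN7EXAMPLE "
    "url=postgres://app:s3cr3t@db.internal:5432/app"
)
redacted, findings = scan_and_redact(sample)
print(redacted)
print(findings)
```

Because these rules match on shape rather than on known values, any string that happens to look like a key is masked as well, which is the false-positive cost described above.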

Layer 3: Access Control (.botignore / .botsecrets [policy])

Block dangerous write operations and destructive commands. For example, a write to .claude/settings.json is denied, and so is the bash command rm -rf /. Write protection uses gitignore-style patterns; bash safety uses command deny-lists.

This layer is supplementary -- it protects against destructive writes and dangerous commands, not secret reads. Reading .env is allowed; the secret values never reach the model because Layer 1 redacts them. Write restrictions remain important for anti-jailbreak (preventing the agent from modifying its own hooks or policy files) and for protecting vendored or generated files.

Limitation: Cannot catch indirect access to secret values. An agent running source .env && echo $DB_PASSWORD bypasses file-level controls entirely. This is exactly why access control is supplementary and redaction (Layer 1) is primary.
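The behaviour of this layer, including its blind spot, can be sketched in a few lines. This is an illustration under assumptions: `decide` is a hypothetical function, Python's `fnmatchcase` only approximates gitignore semantics, and treating each deny entry as a glob matched against the whole whitespace-normalized command is a guess at how a deny-list could work, not a description of fermata's matcher.

```python
from fnmatch import fnmatchcase

# Values mirror the [policy] example used in this document.
WRITE_PROTECTED = [".claude/**", "vendor/**", "*.lock"]
BASH_DENY = ["rm -rf /", "curl * | sh"]


def decide(tool: str, target: str) -> str:
    """Return "deny" or "allow" for a tool call (hypothetical interface)."""
    if tool == "Write":
        if any(fnmatchcase(target, pat) for pat in WRITE_PROTECTED):
            return "deny"
    elif tool == "Bash":
        command = " ".join(target.split())  # collapse runs of whitespace
        if any(fnmatchcase(command, pat) for pat in BASH_DENY):
            return "deny"
    # Reads are never blocked here; redaction handles secret values.
    return "allow"


print(decide("Write", ".claude/settings.json"))
print(decide("Read", ".env"))
print(decide("Bash", "source .env && echo $DB_PASSWORD"))
```

The last call is allowed because nothing in it names a protected file or a denied command, which is the indirect-access gap that only value-level redaction closes.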

Layer 4: Structural Containment (External)

Container-level isolation, filesystem restrictions, dropped capabilities, no-new-privileges. Prevents system modification, privilege escalation, and escape from the execution environment. This layer has no concept of secret values -- it operates on structural boundaries.

fermata does not own this layer. It is provided by your container runtime, VM, or sandboxing tool. fermata's design assumes that structural containment exists as the outermost boundary, and focuses on the data-content layers (1 and 2) that containment cannot address.

Design Principles

Fail-open for availability. A parse error in .botsecrets, an unrecognized file format, or a scanner timeout does not block the agent from working. Redaction failures are logged, not fatal. The agent's productivity is not sacrificed for edge-case security failures.
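One possible reading of the fail-open principle, as a minimal sketch: a wrapper runs the redactor and, if it raises, logs the failure and passes the tool output through unchanged. The function name `redact_fail_open` and the logger name are assumptions.

```python
import logging

logger = logging.getLogger("fermata.sketch")  # logger name is an assumption


def redact_fail_open(output: str, redactor) -> str:
    """Apply the redactor; on any error, log it and return the output as-is."""
    try:
        return redactor(output)
    except Exception:
        # Logged, not fatal: the agent keeps working.
        logger.exception("redaction failed; passing tool output through")
        return output


def broken_redactor(_text):
    raise ValueError("unrecognized file format")


print(redact_fail_open("build ok", broken_redactor))
print(redact_fail_open("pw=hunter2", lambda t: t.replace("hunter2", "[REDACTED]")))
```

The cost of this choice is that a failing redactor lets unredacted output through for that call, so the log entry is the only signal that protection was skipped.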

Zero false negatives for declared secrets. If a value appears in the secret manifest (loaded from files matched by .botsecrets), it will be redacted from every tool output, every time. The Aho-Corasick automaton guarantees this -- it finds all occurrences in a single pass at memory-bandwidth speed.

Sub-millisecond performance. Hooks fire on every tool call, so fermata must not introduce perceptible latency. Policy evaluation is a hashmap lookup, and the Aho-Corasick scan runs at memory-bandwidth speed. The sub-millisecond figure refers to the scan itself; a full hook call costs more. A cold start (loading secrets and parsing files) takes roughly 10--20ms, and subsequent calls take 1--3ms.

Harness-agnostic. fermata does not assume any specific AI coding agent. The policy engine is pure logic. Harness adapters are thin translation layers. The .botsecrets format works identically whether fermata runs as a hook script, an MCP proxy, or an in-process library.

Single policy file. .botsecrets is the unified configuration interface. One file declares both what to redact ([files], [keys], [heuristic]) and what to restrict ([policy]). How protection is delivered is a deployment concern, not a configuration concern.

The .botsecrets Vision

.botsecrets is designed to be the .gitignore of AI agent security: a simple, declarative, human-readable file that any project can drop in to protect its secrets from AI agents.

The format is harness-agnostic from day one. It declares what to protect, not how. The "how" is determined by the delivery mode -- hook script, MCP proxy, or library. This means the same .botsecrets works with Claude Code, Codex, Gemini CLI, Cursor, and any future tool that supports lifecycle hooks or MCP.

A typical .botsecrets looks like this:

# Declare files that contain secrets.
# fermata parses them, extracts the values, and redacts those values
# from all tool output before it reaches the model.

[files]
patterns = [".env", ".env.*", "config/credentials.toml", "secrets/*.yaml"]

[keys]
include = ["STRIPE_*", "MY_APP_SIGNING_*"]

[heuristic]
enabled = true

# Access control: write protection and bash safety.
# Reading secret-containing files is allowed -- Layer 1 redacts the values.

[policy.write]
patterns = [".claude/**", "vendor/**", "*.lock"]

[policy.bash]
deny = ["rm -rf /", "curl * | sh"]

No secret values appear in .botsecrets itself. It points to the files that contain them. The secret extraction, automaton construction, and output scanning happen automatically. Built-in key patterns (*_KEY, *_SECRET, *_PASSWORD, *_TOKEN, DATABASE_URL, etc.) handle most projects without custom [keys] configuration.

fermata is the reference implementation. The format is the standard -- fermata is one way to enforce it.