g4borg 168aefd415 🏗️ fermata: redaction-first security model, unified .botsecrets config
Realign fermata around redaction (PostToolUse) as the primary security
layer, with access control (PreToolUse) as supplementary write/bash
protection. Remove botignore.toml — policy rules now live in .botsecrets
[policy] section. Add fermata.toml as an alias for .botsecrets.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-26 01:10:07 +02:00


# fermata
**A fast, harness-agnostic security layer for AI coding agents.**

AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read `.env`, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls. The solution is not blocking the read -- the agent needs to see config structure and key names to reason about your project. The solution is **redacting secret values from the output before they reach the model**. No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.
## Why
Blocking reads is the wrong approach. The agent needs to see file structure. It needs to know which keys exist in `.env`, what your database config looks like, how your secrets are organized. What it does *not* need to see is the actual secret values. An agent can have full read access to `.env` without secret values being revealed -- if the output is redacted before it reaches the model.

fermata operates on two independent levels:
- **Secret filtering** (PostToolUse) -- `.botsecrets` declares where secrets live; fermata parses them, builds an Aho-Corasick automaton, and redacts secret *values* from tool output before they enter the LLM context. This is the primary defense. It catches secrets regardless of how they appear -- direct reads, shell output, log files, error messages.
- **Policy gate** (PreToolUse) -- `.botsecrets [policy]` / `.botignore` blocks dangerous writes and destructive commands before they execute. Supplementary protection for write safety and anti-jailbreak.

The key insight: file-level access control operates on file identity (*which file*). Secret redaction operates on data content (*which values*). The reveal problem can only be solved at the data-content level.
> **Note:** fermata also accepts `fermata.toml` as an alias for `.botsecrets` (same format, `.botsecrets` takes priority when both exist).
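
As a rough illustration of the secret-filtering level, the sketch below masks every occurrence of a declared secret value in a piece of tool output. It is a simplification under stated assumptions: it uses repeated `str::replace` from the standard library instead of the Aho-Corasick automaton fermata builds, and `redact_known_values` is a hypothetical helper name, not part of fermata's API.

```rust
/// Replace every occurrence of each known secret value with a fixed mask.
/// Hypothetical helper for illustration; not fermata's actual API.
fn redact_known_values(output: &str, secrets: &[&str]) -> String {
    // Sort longest-first so that a secret containing a shorter secret is
    // masked as a whole instead of being partially replaced. Empty strings
    // are dropped because replacing "" would match between every character.
    let mut ordered: Vec<&str> = secrets
        .iter()
        .copied()
        .filter(|s| !s.is_empty())
        .collect();
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut clean = output.to_string();
    for secret in ordered {
        // One full scan of the output per secret (naive, but std-only).
        clean = clean.replace(secret, "*****");
    }
    clean
}
```

For example, `redact_known_values("DB_PASSWORD=hunter2", &["hunter2"])` returns `DB_PASSWORD=*****`: the key name stays visible while the value is masked. A single-pass multi-pattern matcher such as Aho-Corasick avoids rescanning the output once per secret, which matters as the number of declared values grows.
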
## Quick Start
### Install
```bash
cargo install --path . --features cli
```
### Protect a project in 30 seconds
```bash
# Declare where secrets live -- fermata parses them and redacts values from agent output
cat > .botsecrets << 'EOF'
[files]
patterns = [".env", ".env.*", "secrets.*"]
[policy.write]
patterns = [".claude/**", "vendor/**", "*.lock"]
[policy.bash]
deny = ["rm -rf /", "curl * | sh"]
EOF
```
One file. The agent can read `.env` freely -- fermata redacts the secret values from the output before they reach the model. Write protection and bash safety rules live in the same `.botsecrets` under `[policy]`.
### Wire into Claude Code
Add both hooks in `.claude/settings.json`:
```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude --event post-tool-use" }
        ]
      }
    ]
  }
}
```
That's it. PostToolUse redacts secret values from tool output before they reach the LLM. PreToolUse blocks forbidden writes and dangerous commands.
## How It Works
fermata interposes on every tool call in the agent's lifecycle:
```
Agent wants to run a tool
|
PreToolUse ── .botsecrets [policy] / .botignore
| write blocked? → deny
| bash denied? → deny
| otherwise → allow (including reads of .env!)
|
Tool executes
|
PostToolUse ── .botsecrets [files] + [keys] + [heuristic]
| secret values found? → redact before LLM sees it
|
Clean output enters LLM context
```
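
The PreToolUse half of this flow can be sketched as a function that denies a write or bash call when it matches a rule and allows everything else, including reads. This is a minimal sketch, not fermata's implementation: `Decision`, `ToolCall`, `wildcard_match`, and `pre_tool_use` are hypothetical names, and the matcher treats `*` as "any run of characters", which is looser than real gitignore or glob semantics.

```rust
/// Outcome of the pre-execution check.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny,
}

/// A tool call as seen by the gate (simplified, hypothetical shape).
enum ToolCall<'a> {
    Read { path: &'a str },
    Write { path: &'a str },
    Bash { command: &'a str },
}

/// Match `text` against `pattern`, where `*` matches any run of characters
/// (including none). A simplification of real glob semantics.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // Position of the last `*` seen and the text index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((star, tried)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            backtrack = Some((star, tried + 1));
            pi = star + 1;
            ti = tried + 1;
        } else {
            return false;
        }
    }
    // Any trailing `*` in the pattern can match the empty string.
    p[pi..].iter().all(|&c| c == '*')
}

/// Deny writes and bash commands that match a rule; allow everything else.
fn pre_tool_use(call: &ToolCall, write_patterns: &[&str], bash_deny: &[&str]) -> Decision {
    let denied = match call {
        // Reads are never blocked here; redaction handles secret values later.
        ToolCall::Read { .. } => false,
        ToolCall::Write { path } => write_patterns.iter().any(|p| wildcard_match(p, path)),
        ToolCall::Bash { command } => bash_deny.iter().any(|p| wildcard_match(p, command)),
    };
    if denied { Decision::Deny } else { Decision::Allow }
}
```

With the Quick Start rules, this sketch denies a write to `.claude/settings.json` and the command `curl https://example.com/install | sh`, while a read of `.env` and a write to `src/main.rs` are allowed.
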
Across these two hooks, fermata applies three layers of defense, each independent:
| Layer | Mechanism | What it catches |
|-------|-----------|-----------------|
| **Known-value redaction** | `.botsecrets` declares secret files; fermata parses them and builds an Aho-Corasick automaton | Every occurrence of a declared secret value, in any tool output, regardless of source |
| **Heuristic detection** | Regex patterns from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs) | Secrets not covered by the manifest -- runtime-generated, unexpected locations |
| **Access control** | `.botsecrets [policy]` / `.botignore` rules block writes and dangerous commands | Destructive writes, anti-jailbreak (agent modifying its own hooks), dangerous shell commands |

Performance: ~1-5ms per tool call. Cold start (loading config + parsing secret files) is ~10-20ms.
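
To make the heuristic layer more concrete, here is a sketch of a single detection rule: AWS access key IDs, taken here to be the literal prefix `AKIA` followed by 16 uppercase letters or digits. It is hand-rolled on the standard library so it stays self-contained, whereas fermata's scanner uses regex patterns derived from gitleaks; the function names are hypothetical.

```rust
/// Return the byte ranges of substrings that look like AWS access key IDs:
/// the literal prefix `AKIA` followed by exactly 16 characters in [A-Z0-9].
/// Hypothetical helper; a stand-in for one regex rule.
fn find_aws_access_key_ids(text: &str) -> Vec<(usize, usize)> {
    const PREFIX: &[u8] = b"AKIA";
    const BODY_LEN: usize = 16;
    let bytes = text.as_bytes();
    let is_body = |b: u8| b.is_ascii_uppercase() || b.is_ascii_digit();

    let mut hits = Vec::new();
    let mut i = 0;
    while i + PREFIX.len() + BODY_LEN <= bytes.len() {
        let end = i + PREFIX.len() + BODY_LEN;
        let prefix_ok = &bytes[i..i + PREFIX.len()] == PREFIX;
        let body_ok = bytes[i + PREFIX.len()..end].iter().all(|&b| is_body(b));
        // Require a non-key character (or a text boundary) on both sides so a
        // longer alphanumeric run is not reported as a key.
        let left_ok = i == 0 || !is_body(bytes[i - 1]);
        let right_ok = end == bytes.len() || !is_body(bytes[end]);
        if prefix_ok && body_ok && left_ok && right_ok {
            hits.push((i, end));
            i = end;
        } else {
            i += 1;
        }
    }
    hits
}

/// Mask every detected key in `text`.
fn redact_aws_access_key_ids(text: &str) -> String {
    let mut clean = String::with_capacity(text.len());
    let mut last = 0;
    for (start, end) in find_aws_access_key_ids(text) {
        // Matched ranges are pure ASCII, so these byte offsets are valid
        // UTF-8 slice boundaries.
        clean.push_str(&text[last..start]);
        clean.push_str("*****");
        last = end;
    }
    clean.push_str(&text[last..]);
    clean
}
```

Calling `redact_aws_access_key_ids("Found key: AKIA1234567890ABCDEF")` returns `Found key: *****`. Unlike known-value redaction, this rule needs no manifest entry, so it can catch a key that was never declared in `.botsecrets`.
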
## Configuration
### `.botsecrets` -- the primary (and usually only) config
`.botsecrets` is the unified configuration file. It declares both what to redact and what to restrict:
```toml
[files]
patterns = [".env", ".env.*", "secrets.*"]

[keys]
include = ["STRIPE_*", "MY_APP_SIGNING_*"]

[heuristic]
enabled = true

# Access control: write protection and bash safety.
# Reading secret-containing files is allowed -- Layer 1 redacts the values.
[policy.write]
patterns = [".claude/**", "vendor/**", "*.lock"]

[policy.bash]
deny = ["rm -rf /", "curl * | sh"]
```
Built-in key patterns (`*_KEY`, `*_SECRET`, `*_PASSWORD`, `*_TOKEN`, `DATABASE_URL`, etc.) handle most projects without custom configuration.
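
The sketch below shows one way key patterns could select which values from a `.env` file end up in the redaction set. It rests on assumptions that are not taken from fermata: a simplified `.env` grammar (one `KEY=VALUE` per line, optional surrounding quotes, no `export` or multi-line values), patterns limited to an exact name or a single leading or trailing `*`, and the hypothetical names `key_matches` and `secret_values`.

```rust
/// Does `key` match `pattern`? Supports an exact name, a leading `*`
/// (suffix match, e.g. `*_TOKEN`), or a trailing `*` (prefix match,
/// e.g. `STRIPE_*`). Hypothetical helper for illustration.
fn key_matches(pattern: &str, key: &str) -> bool {
    if let Some(suffix) = pattern.strip_prefix('*') {
        key.ends_with(suffix)
    } else if let Some(prefix) = pattern.strip_suffix('*') {
        key.starts_with(prefix)
    } else {
        pattern == key
    }
}

/// Parse simplified `.env` content and return the values of matching keys.
fn secret_values(dotenv: &str, patterns: &[&str]) -> Vec<String> {
    let mut values = Vec::new();
    for line in dotenv.lines() {
        let line = line.trim();
        // Skip blanks and comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Split on the first `=` only; values may contain `=` themselves.
        let Some((key, value)) = line.split_once('=') else { continue };
        let key = key.trim();
        let value = value.trim();
        // Drop one pair of matching surrounding quotes, if present.
        let value = ['"', '\'']
            .iter()
            .find_map(|&q| value.strip_prefix(q)?.strip_suffix(q))
            .unwrap_or(value);
        // Empty values are skipped: there is nothing to redact.
        if !value.is_empty() && patterns.iter().any(|p| key_matches(p, key)) {
            values.push(value.to_string());
        }
    }
    values
}
```

Given the lines `PORT=8080` and `DB_PASSWORD="hunter2"` and the pattern `*_PASSWORD`, `secret_values` returns only `hunter2`. The key names themselves are never collected, which matches the goal of keeping config structure visible to the agent.
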
### `.botignore` -- optional simple layer
Gitignore syntax. For projects that want a minimal, familiar format for write protection. Complements `.botsecrets` but is not required.
```gitignore
vendor/**
*.lock
```
See [docs/configuration.md](docs/configuration.md) for the full reference.
## Commands
```bash
# Check if a path is allowed (exit 0 = allowed, exit 1 = blocked)
fermata check --op read /path/to/.env
fermata check --op write src/main.rs

# Run as a hook (reads harness JSON from stdin)
fermata hook --harness claude
fermata hook --harness claude --event post-tool-use
```
See [docs/commands.md](docs/commands.md) for the full CLI reference.
## Library API
fermata is also a Rust library:
```rust
use dirigent_fermata::core::secrets::{Manifest, Redactor, Scanner, SecretsConfig};

// Load .botsecrets and build the redaction manifest
let config = SecretsConfig::load("/path/to/project")?;
let manifest = Manifest::discover(&config)?;

// Known-value redaction (Aho-Corasick, sub-millisecond)
let redactor = Redactor::from_manifest(&manifest);
let clean = redactor.redact("DB_PASSWORD=hunter2");
// -> "DB_PASSWORD=*****"

// Heuristic scanning (regex patterns)
let scanner = Scanner::new(&config);
let findings = scanner.scan("Found key: AKIA1234567890ABCDEF");
// -> [Finding { pattern: "AWS Access Key", confidence: High, .. }]
```
## Security Model
fermata addresses a novel security concern: **reveal** -- whether secret *values* enter the LLM context. Traditional file-level access control operates on file identity (which file). Secret redaction operates on data content (which values). The reveal problem can only be solved at the data-content level.

Read [docs/security-model.md](docs/security-model.md) for the full analysis, including the Reveal Triangle and defense-in-depth architecture.
## Threat Model
fermata is a heuristic guard, not a sandbox. It defends against statistical agent behavior and prompt-driven mistakes -- not a deliberate adversary. This is a strength: the threat model is well-defined, and the boundaries are documented honestly.

Read [docs/threat-model.md](docs/threat-model.md) for what fermata catches, what it doesn't, and what to combine it with.
## Harness Support
| Harness | Status | Mechanism |
|---------|--------|-----------|
| Claude Code | Shipped | PreToolUse + PostToolUse hooks |
| Codex CLI | Planned | Pre-exec hook adapter |
| Gemini CLI | Planned | MCP server mode |
| Any MCP agent | Planned | MCP proxy wrapping existing servers |

The policy engine and redaction logic are identical across all modes. Only the I/O adapter changes.
## Status
v0.2 -- secret filtering engine and policy gate are production-ready. All core components are implemented and tested:
- `.botsecrets` config with `[files]`, `[keys]`, `[heuristic]`, and `[policy]` sections
- Aho-Corasick known-value redactor
- Heuristic scanner with gitleaks-derived patterns
- Manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
- Claude Code PreToolUse and PostToolUse adapters
- `.botignore` walker with gitignore semantics
## The `.botsecrets` Vision
`.botsecrets` is designed to be the **`.gitignore` of AI agent security**: a simple, declarative, human-readable file that every project can drop in to protect its secrets from AI agents.

The format is harness-agnostic from day one. It declares *what* to protect, not *how*. One file covers both redaction (`[files]`, `[keys]`, `[heuristic]`) and access control (`[policy]`). The same `.botsecrets` works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.
## License
Licensed under either of [Apache License, Version 2.0](LICENSE-APACHE) or [MIT License](LICENSE-MIT) at your option.