fermata
A fast, harness-agnostic security layer for AI coding agents.
AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read .env, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls. No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.
Why
Traditional security blocks the file and hopes the agent doesn't find the data through another path. This is insufficient -- secrets appear in shell output, log files, error messages, and indirect reads that bypass any access-control list.
fermata operates on two independent levels:
- Policy gate (PreToolUse) -- .botignore blocks reads, writes, and dangerous commands before they execute. Catches ~90% of accidental secret access.
- Secret filtering (PostToolUse) -- .botsecrets redacts secret values from tool output before they enter the LLM context. Catches the remaining cases regardless of how secrets appear.
The key insight: blocking a file is necessary but not sufficient. The agent can have read access to .env without secret values being revealed -- if the output is redacted before it reaches the model.
Quick Start
Install
cargo install --path . --features cli
Protect a project in 30 seconds
# Block direct access to secret files
echo ".env" > .botignore
# Declare where secrets live -- fermata parses them and redacts values
cat > .botsecrets << 'EOF'
[files]
patterns = [".env", ".env.*", "secrets.*"]
EOF
Wire into Claude Code
Add both hooks in .claude/settings.json:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude --event post-tool-use" }
        ]
      }
    ]
  }
}
That's it. PreToolUse blocks forbidden operations. PostToolUse redacts secret values from tool output before they reach the LLM.
How It Works
fermata interposes on every tool call in the agent's lifecycle:
Agent wants to run a tool
|
PreToolUse ── fermata checks .botignore / botignore.toml
| blocked? → deny with reason
| allowed? ↓
Tool executes
|
PostToolUse ── fermata scans output for secret values
| found? → replace with ***** before LLM sees it
|
Clean output enters LLM context
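The flow above can be sketched in a few lines of Rust. This is a minimal illustration, not fermata's implementation: `Decision`, `pre_tool_use`, `post_tool_use`, and `guarded_call` are hypothetical names, the gate only does exact path comparison, and redaction is a plain string replace.

```rust
/// Outcome of the pre-execution policy check (hypothetical type).
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny(String),
}

/// Stand-in for the policy gate: deny any path listed verbatim.
fn pre_tool_use(blocked_paths: &[&str], path: &str) -> Decision {
    if blocked_paths.contains(&path) {
        Decision::Deny(format!("{path} is blocked by policy"))
    } else {
        Decision::Allow
    }
}

/// Stand-in for secret filtering: mask each known value in the output.
fn post_tool_use(known_secrets: &[&str], output: &str) -> String {
    let mut clean = output.to_string();
    for secret in known_secrets {
        if !secret.is_empty() {
            clean = clean.replace(*secret, "*****");
        }
    }
    clean
}

/// Runs one tool call through both stages. `run_tool` is whatever
/// actually executes the tool and returns its raw output.
fn guarded_call(
    blocked_paths: &[&str],
    known_secrets: &[&str],
    path: &str,
    run_tool: impl Fn(&str) -> String,
) -> Result<String, String> {
    match pre_tool_use(blocked_paths, path) {
        Decision::Deny(reason) => Err(reason),
        Decision::Allow => Ok(post_tool_use(known_secrets, &run_tool(path))),
    }
}
```

The point of the two-stage shape is that the second stage runs on every allowed call, so a secret that shows up in a log file or in command output is still masked even though no path rule mentioned it.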
Three layers of defense, each independent:
| Layer | Mechanism | What it catches |
|---|---|---|
| Access control | .botignore rules block tool calls by path | Direct reads/writes to sensitive files |
| Known-value redaction | .botsecrets declares secret files; fermata parses them and builds an Aho-Corasick automaton | Every occurrence of a declared secret value, in any tool output, regardless of source |
| Heuristic detection | Regex patterns from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs) | Secrets not covered by the manifest -- runtime-generated, unexpected locations |
Performance: ~1-5ms per tool call. Cold start (loading config + parsing secret files) is ~10-20ms.
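To make the known-value layer concrete, here is a small std-only sketch. It deliberately swaps the Aho-Corasick automaton for repeated `str::replace` calls, which gives the same result on small inputs but rescans the text once per secret; `redact_known_values` is a hypothetical name, not part of fermata's API.

```rust
/// Replaces every occurrence of each known secret value with a fixed mask.
/// Longer values go first so that a secret containing another secret as a
/// substring is masked as a whole rather than partially.
fn redact_known_values(output: &str, secrets: &[&str]) -> String {
    let mut ordered: Vec<&str> = secrets
        .iter()
        .copied()
        .filter(|s| !s.is_empty()) // an empty pattern would match everywhere
        .collect();
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut clean = output.to_string();
    for secret in ordered {
        clean = clean.replace(secret, "*****");
    }
    clean
}
```

A single-pass multi-pattern matcher avoids the per-secret rescan, which matters once a project declares many values.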
Configuration
Three files, each optional, each solving a different problem:
.botignore -- the 80% case
Gitignore syntax. Blocks both reads and writes. Onboarding is one line.
.env
.env.*
secrets/**
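As a rough illustration of how patterns like these can be evaluated, the sketch below implements a simplified subset of gitignore matching: `*` within one path component, `**` across directories, and slash-free patterns matching a name at any depth. It assumes relative, `/`-separated paths and ignores negation, anchoring with a leading `/`, and trailing-slash rules; the function names are hypothetical.

```rust
/// Matches one path component; `*` stands for any run of characters.
fn segment_matches(pattern: &[u8], name: &[u8]) -> bool {
    match pattern.split_first() {
        None => name.is_empty(),
        Some((&b'*', rest)) => (0..=name.len()).any(|i| segment_matches(rest, &name[i..])),
        Some((c, rest)) => name.first() == Some(c) && segment_matches(rest, &name[1..]),
    }
}

/// Matches component lists; a `**` component spans zero or more directories.
fn segments_match(pattern: &[&str], path: &[&str]) -> bool {
    match pattern.split_first() {
        None => path.is_empty(),
        Some((seg, rest)) if *seg == "**" => {
            (0..=path.len()).any(|i| segments_match(rest, &path[i..]))
        }
        Some((seg, rest)) => match path.split_first() {
            Some((name, tail)) => {
                segment_matches(seg.as_bytes(), name.as_bytes()) && segments_match(rest, tail)
            }
            None => false,
        },
    }
}

/// Simplified rule: a pattern without `/` matches any component of the
/// path; a pattern with `/` is matched against the path from its root.
fn is_ignored(pattern: &str, path: &str) -> bool {
    let parts: Vec<&str> = path.split('/').collect();
    if pattern.contains('/') {
        let pat: Vec<&str> = pattern.split('/').collect();
        segments_match(&pat, &parts)
    } else {
        parts
            .iter()
            .any(|name| segment_matches(pattern.as_bytes(), name.as_bytes()))
    }
}
```

With the three patterns above, `config/.env` and `secrets/prod/db.yaml` are blocked while `.envrc` and `src/main.rs` are not. Real gitignore semantics have more cases than this subset covers.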
botignore.toml -- per-operation rules
Separate namespaces so the same file can be readable but not writable:
[read]
patterns = [".env*", "secrets/**"]
[write]
patterns = ["vendor/**", "*.lock"]
[bash]
deny = ["rm -rf /", "curl * | sh"]
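The namespace separation can be pictured as one pattern list per operation, consulted independently. The sketch below is an assumption about the shape of such a policy, not fermata's data model: `Op`, `Policy`, and `simple_match` are hypothetical, and the matcher only understands `dir/**`, a leading `*`, and a trailing `*`.

```rust
#[derive(Clone, Copy)]
enum Op {
    Read,
    Write,
}

/// One deny list per operation, mirroring the [read] and [write] tables.
struct Policy {
    read: Vec<&'static str>,
    write: Vec<&'static str>,
}

/// Deliberately tiny matcher: `dir/**` matches anything under `dir/`,
/// `*suffix` matches by suffix, `prefix*` by prefix, anything else exactly.
fn simple_match(pattern: &str, path: &str) -> bool {
    if let Some(dir) = pattern.strip_suffix("/**") {
        path.strip_prefix(dir)
            .map_or(false, |rest| rest.starts_with('/'))
    } else if let Some(suffix) = pattern.strip_prefix('*') {
        path.ends_with(suffix)
    } else if let Some(prefix) = pattern.strip_suffix('*') {
        path.starts_with(prefix)
    } else {
        pattern == path
    }
}

impl Policy {
    /// An operation is allowed unless a pattern in *its own* list matches.
    fn allows(&self, op: Op, path: &str) -> bool {
        let patterns = match op {
            Op::Read => &self.read,
            Op::Write => &self.write,
        };
        !patterns.iter().any(|pattern| simple_match(pattern, path))
    }
}
```

Loaded with the patterns from the example, this policy lets the agent read `vendor/lib.rs` but not write it, and blocks reads of `.env.local` without affecting writes elsewhere.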
.botsecrets -- secret value redaction
Declares which files contain secrets. fermata parses them, extracts values, and redacts every occurrence in tool output.
[files]
patterns = [".env", ".env.*", "secrets.*"]
[keys]
include = ["STRIPE_*", "MY_APP_SIGNING_*"]
[heuristic]
enabled = true
Built-in key patterns (*_KEY, *_SECRET, *_PASSWORD, *_TOKEN, DATABASE_URL, etc.) handle most projects without custom configuration.
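To show how key patterns could select which values end up in the redaction set, here is a hedged sketch for the .env format only. It assumes plain `KEY=value` lines with optional double quotes and `#` comments, and it treats a single leading or trailing `*` as the only wildcard; `key_matches` and `secret_values` are hypothetical names.

```rust
/// True when `key` matches `pattern`; one leading or trailing `*` is a wildcard.
fn key_matches(pattern: &str, key: &str) -> bool {
    if let Some(suffix) = pattern.strip_prefix('*') {
        key.ends_with(suffix)
    } else if let Some(prefix) = pattern.strip_suffix('*') {
        key.starts_with(prefix)
    } else {
        pattern == key
    }
}

/// Parses `KEY=value` lines and returns the values whose keys match any
/// of the given patterns, in file order.
fn secret_values(dotenv: &str, key_patterns: &[&str]) -> Vec<String> {
    dotenv
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .filter(|(key, _)| key_patterns.iter().any(|p| key_matches(p, key.trim())))
        .map(|(_, value)| value.trim().trim_matches('"').to_string())
        .filter(|value| !value.is_empty())
        .collect()
}
```

Values for keys such as `PORT` stay out of the set, which keeps common non-secret strings like `8080` from being masked in unrelated output.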
See docs/configuration.md for the full reference.
Commands
# Check if a path is allowed
fermata check --op read /path/to/.env # exit 1 = blocked
fermata check --op write src/main.rs # exit 0 = allowed
# Run as a hook (reads harness JSON from stdin)
fermata hook --harness claude
fermata hook --harness claude --event post-tool-use
See docs/commands.md for the full CLI reference.
Library API
fermata is also a Rust library:
use dirigent_fermata::core::secrets::{Manifest, Redactor, Scanner, SecretsConfig};
// Load .botsecrets and build the redaction manifest
let config = SecretsConfig::load("/path/to/project")?;
let manifest = Manifest::discover(&config)?;
// Known-value redaction (Aho-Corasick, sub-millisecond)
let redactor = Redactor::from_manifest(&manifest);
let clean = redactor.redact("DB_PASSWORD=hunter2");
// -> "DB_PASSWORD=*****"
// Heuristic scanning (regex patterns)
let scanner = Scanner::new(&config);
let findings = scanner.scan("Found key: AKIA1234567890ABCDEF");
// -> [Finding { pattern: "AWS Access Key", confidence: High, .. }]
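For intuition about what a heuristic pattern checks, the following std-only sketch hand-codes one rule instead of using a regex engine: the literal `AKIA` followed by exactly 16 uppercase letters or digits, which is a simplified form of the usual AWS access key ID pattern. `find_aws_key_ids` is a hypothetical helper and is unrelated to the `Scanner` type above.

```rust
/// Returns the byte offsets at which an AWS-style access key ID starts.
fn find_aws_key_ids(text: &str) -> Vec<usize> {
    let bytes = text.as_bytes();
    let is_key_char = |b: u8| b.is_ascii_uppercase() || b.is_ascii_digit();
    let mut hits = Vec::new();

    for (start, _) in text.match_indices("AKIA") {
        let end = start + 20; // 4-byte prefix + 16-byte body
        if end > bytes.len() {
            continue;
        }
        let body_ok = bytes[start + 4..end].iter().all(|&b| is_key_char(b));
        // Reject candidates embedded in a longer alphanumeric run.
        let bounded_left = start == 0 || !is_key_char(bytes[start - 1]);
        let bounded_right = end == bytes.len() || !is_key_char(bytes[end]);
        if body_ok && bounded_left && bounded_right {
            hits.push(start);
        }
    }
    hits
}
```

The boundary checks are a design choice to cut false positives; a pattern set derived from gitleaks will typically cover more prefixes and more secret types than this single rule.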
Security Model
fermata addresses a novel security concern: reveal -- whether secret values enter the LLM context. Traditional file-level access control operates on file identity (which file). Secret redaction operates on data content (which values). The reveal problem can only be solved at the data-content level.
Read docs/security-model.md for the full analysis, including the Reveal Triangle and defense-in-depth architecture.
Threat Model
fermata is a heuristic guard, not a sandbox. It defends against statistical agent behavior and prompt-driven mistakes -- not a deliberate adversary. This is a strength: the threat model is well-defined, and the boundaries are documented honestly.
Read docs/threat-model.md for what fermata catches, what it doesn't, and what to combine it with.
Harness Support
| Harness | Status | Mechanism |
|---|---|---|
| Claude Code | Shipped | PreToolUse + PostToolUse hooks |
| Codex CLI | Planned | Pre-exec hook adapter |
| Gemini CLI | Planned | MCP server mode |
| Any MCP agent | Planned | MCP proxy wrapping existing servers |
The policy engine and redaction logic are identical across all modes. Only the I/O adapter changes.
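One way to picture this split is a small trait that each harness implements, with the decision logic living outside it. Everything in the sketch is hypothetical: the `HarnessAdapter` trait, the `ToolCall` fields, and the toy `tool path` line format are illustrations, and failing closed on unparseable input is a choice made for the sketch rather than documented fermata behavior.

```rust
/// Harness-neutral description of a tool call (illustrative fields).
struct ToolCall {
    tool: String,
    path: String,
}

/// One implementation per harness: translate its wire format in and out.
trait HarnessAdapter {
    fn parse(&self, raw: &str) -> Option<ToolCall>;
    fn render(&self, allowed: bool) -> String;
}

/// Toy adapter for a made-up line format: `<tool> <path>`.
struct LineAdapter;

impl HarnessAdapter for LineAdapter {
    fn parse(&self, raw: &str) -> Option<ToolCall> {
        let (tool, path) = raw.trim().split_once(' ')?;
        Some(ToolCall {
            tool: tool.to_string(),
            path: path.to_string(),
        })
    }

    fn render(&self, allowed: bool) -> String {
        if allowed { "allow".into() } else { "deny".into() }
    }
}

/// Shared engine: the same decision regardless of which adapter is used.
fn decide(call: &ToolCall, blocked: &[&str]) -> bool {
    let touches_files = matches!(call.tool.as_str(), "Read" | "Edit" | "Write");
    !(touches_files && blocked.contains(&call.path.as_str()))
}

/// Glue: parse with the adapter, decide with the engine, render the reply.
fn handle(adapter: &dyn HarnessAdapter, raw: &str, blocked: &[&str]) -> String {
    match adapter.parse(raw) {
        Some(call) => adapter.render(decide(&call, blocked)),
        None => adapter.render(false), // sketch choice: fail closed
    }
}
```

Adding a new harness then means writing another `parse`/`render` pair, while `decide` stays untouched.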
Status
v0.2 -- policy gate and secret filtering engine are production-ready. All core components are implemented and tested:
- .botignore walker with gitignore semantics
- botignore.toml with read/write/bash namespaces
- Claude Code PreToolUse and PostToolUse adapters
- .botsecrets config, manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
- Aho-Corasick known-value redactor
- Heuristic scanner with gitleaks-derived patterns
The .botsecrets Vision
.botsecrets is designed to be the .gitignore of AI agent security: a simple, declarative, human-readable file that every project can drop in to protect its secrets from AI agents.
The format is harness-agnostic from day one. It declares what to protect, not how. The same .botsecrets works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.
License
Licensed under either of Apache License, Version 2.0 or MIT License at your option.