g4borg 77520819f6 📝 fermata: rewrite docs for public-facing export

New user-friendly README modeled after sandcage's layout (Why / Quick Start /
How It Works), plus four focused docs under docs/:

- commands.md — full CLI reference with options, exit codes, examples
- configuration.md — .botignore, botignore.toml, .botsecrets reference
- security-model.md — the Reveal Triangle and defense-in-depth layers
- threat-model.md — L0-L6 coverage, honest limitations, pairing guidance

All Dirigent/monorepo internals stripped — ready for standalone export.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-05-25 18:27:51 +02:00

fermata

A fast, harness-agnostic security layer for AI coding agents.

AI coding agents read files, run commands, and inspect output as part of their normal workflow. When they read .env, secret values get tokenized into the LLM's context window -- and from there they can leak into commits, PR descriptions, log messages, or API calls. No AI coding agent ships built-in post-read secret filtering today. fermata fixes that.

Why

Traditional security blocks the file and hopes the agent doesn't find the data through another path. This is insufficient -- secrets appear in shell output, log files, error messages, and indirect reads that bypass any access-control list.

fermata operates on two independent levels:

Policy gate (PreToolUse) -- .botignore blocks reads, writes, and dangerous commands before they execute. Catches ~90% of accidental secret access.
Secret filtering (PostToolUse) -- .botsecrets redacts secret values from tool output before they enter the LLM context. Catches the remaining cases regardless of how secrets appear.

The key insight: blocking a file is not sufficient on its own. With redaction in place, the agent can even have read access to .env without secret values being revealed, because the output is redacted before it reaches the model.

Quick Start

Install

cargo install --path . --features cli

Protect a project in 30 seconds

# Block direct access to secret files
echo ".env" > .botignore

# Declare where secrets live -- fermata parses them and redacts values
cat > .botsecrets << 'EOF'
[files]
patterns = [".env", ".env.*", "secrets.*"]
EOF

Wire into Claude Code

Add both hooks in .claude/settings.json:

{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude --event post-tool-use" }
        ]
      }
    ]
  }
}

That's it. PreToolUse blocks forbidden operations. PostToolUse redacts secret values from tool output before they reach the LLM.

How It Works

fermata interposes on every tool call in the agent's lifecycle:

Agent wants to run a tool
        |
   PreToolUse ── fermata checks .botignore / botignore.toml
        |            blocked? → deny with reason
        |            allowed? ↓
   Tool executes
        |
   PostToolUse ── fermata scans output for secret values
        |            found? → replace with ***** before LLM sees it
        |
   Clean output enters LLM context
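
The flow above can be sketched in a few lines of Rust. This is a minimal illustration, not fermata's implementation: `Decision`, `pre_tool_use`, `post_tool_use`, and `run_tool` are hypothetical names, and exact-path comparison and plain string replacement stand in for the real matching and redaction engines.

```rust
// Hypothetical types and functions for illustration; not fermata's real API.
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny(String),
}

/// PreToolUse step: deny the call if the target path is on the blocklist.
/// Exact comparison stands in for gitignore-style matching.
fn pre_tool_use(path: &str, blocked: &[&str]) -> Decision {
    if blocked.iter().any(|b| *b == path) {
        Decision::Deny(format!("{} is blocked by policy", path))
    } else {
        Decision::Allow
    }
}

/// PostToolUse step: mask every known secret value in the tool output.
fn post_tool_use(output: &str, secrets: &[&str]) -> String {
    let mut clean = output.to_string();
    for s in secrets.iter().filter(|s| !s.is_empty()) {
        clean = clean.replace(*s, "*****");
    }
    clean
}

/// One full tool call: gate first, execute, then filter the output.
fn run_tool(
    path: &str,
    blocked: &[&str],
    secrets: &[&str],
    execute: impl Fn(&str) -> String,
) -> Result<String, String> {
    match pre_tool_use(path, blocked) {
        Decision::Deny(reason) => Err(reason),
        Decision::Allow => Ok(post_tool_use(&execute(path), secrets)),
    }
}
```

Note that the second step runs even when the first one allows the call, which is why a secret that shows up in an unblocked log file is still masked.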

Three layers of defense, each independent:

Layer	Mechanism	What it catches
Access control	`.botignore` rules block tool calls by path	Direct reads/writes to sensitive files
Known-value redaction	`.botsecrets` declares secret files; fermata parses them and builds an Aho-Corasick automaton	Every occurrence of a declared secret value, in any tool output, regardless of source
Heuristic detection	Regex patterns from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs)	Secrets not covered by the manifest -- runtime-generated, unexpected locations
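
To make the known-value layer concrete, here is a small sketch of multi-pattern redaction using only the standard library. It is a stand-in, not the Aho-Corasick automaton fermata uses: it checks every secret at every position, so it is slower, but it produces the same kind of result (the longest secret starting at a position wins). The function name `redact_known` is hypothetical.

```rust
/// Replace every occurrence of any secret in `text` with "*****".
/// At each position the longest matching secret wins. This naive scan is
/// O(text length x number of secrets); an Aho-Corasick automaton does the
/// same job in a single linear pass.
fn redact_known(text: &str, secrets: &[&str]) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while !rest.is_empty() {
        // Length of the longest secret that starts at the current position.
        let hit = secrets
            .iter()
            .filter(|s| !s.is_empty() && rest.starts_with(**s))
            .map(|s| s.len())
            .max();
        match hit {
            Some(len) => {
                out.push_str("*****");
                rest = &rest[len..];
            }
            None => {
                // Copy one whole character so UTF-8 boundaries are respected.
                let ch = rest.chars().next().unwrap();
                out.push(ch);
                rest = &rest[ch.len_utf8()..];
            }
        }
    }
    out
}
```

Preferring the longest match matters when one secret is a prefix of another: with `abc` and `abc123` both declared, `abc123` is masked as a whole instead of leaving `123` behind.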

Performance: ~1-5ms per tool call. Cold start (loading config + parsing secret files) is ~10-20ms.

Configuration

Three files, each optional, each solving a different problem:

`.botignore` -- the 80% case

Gitignore syntax. Blocks both reads and writes. Onboarding is one line.

.env
.env.*
secrets/**
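
As a rough picture of how such patterns are evaluated, the sketch below implements a simplified subset of gitignore matching: `*` matches within one path segment, `**` crosses segments, and a pattern without a slash is compared against the file name at any depth. Negation (`!`), character classes, anchoring, and directory-only patterns are left out, and `glob_match` and `is_blocked` are hypothetical names, not fermata functions.

```rust
/// Simplified glob: `*` matches any run of bytes except '/',
/// `**` matches any run of bytes including '/'.
fn glob_match(pattern: &str, path: &str) -> bool {
    fn go(p: &[u8], s: &[u8]) -> bool {
        match p {
            [] => s.is_empty(),
            // `**`: try every possible split point, slashes included.
            [b'*', b'*', rest @ ..] => (0..=s.len()).any(|i| go(rest, &s[i..])),
            // `*`: try split points until a '/' would be consumed.
            [b'*', rest @ ..] => (0..=s.len())
                .take_while(|&i| i == 0 || s[i - 1] != b'/')
                .any(|i| go(rest, &s[i..])),
            // Literal byte: must match exactly.
            [c, rest @ ..] => !s.is_empty() && s[0] == *c && go(rest, &s[1..]),
        }
    }
    go(pattern.as_bytes(), path.as_bytes())
}

/// A path is blocked if any pattern matches it. Patterns without '/'
/// are matched against the file name, so `.env` also catches `config/.env`.
fn is_blocked(patterns: &[&str], path: &str) -> bool {
    let basename = path.rsplit('/').next().unwrap_or(path);
    patterns.iter().any(|pat| {
        if pat.contains('/') {
            glob_match(pat, path)
        } else {
            glob_match(pat, basename)
        }
    })
}
```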

`botignore.toml` -- per-operation rules

Separate namespaces so the same file can be readable but not writable:

[read]
patterns = [".env*", "secrets/**"]

[write]
patterns = ["vendor/**", "*.lock"]

[bash]
deny = ["rm -rf /", "curl * | sh"]
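
The per-operation idea can be illustrated with a tiny policy type. This is a sketch under assumptions: `Op`, `Policy`, and `simple_match` are hypothetical, the matcher only understands a trailing `/**`, a leading `*`, or a trailing `*`, and the `[bash]` namespace is omitted.

```rust
// Hypothetical types for illustration; not fermata's real API.
#[derive(Clone, Copy)]
enum Op {
    Read,
    Write,
}

struct Policy {
    read: Vec<&'static str>,
    write: Vec<&'static str>,
}

/// Deliberately small matcher: `dir/**` is a directory prefix, `*suffix`
/// and `prefix*` are simple wildcards, anything else must match exactly.
fn simple_match(pattern: &str, path: &str) -> bool {
    if let Some(dir) = pattern.strip_suffix("/**") {
        path.starts_with(dir) && path[dir.len()..].starts_with('/')
    } else if let Some(suffix) = pattern.strip_prefix('*') {
        path.ends_with(suffix)
    } else if let Some(prefix) = pattern.strip_suffix('*') {
        path.starts_with(prefix)
    } else {
        pattern == path
    }
}

impl Policy {
    /// Each operation consults only its own namespace.
    fn allows(&self, op: Op, path: &str) -> bool {
        let patterns = match op {
            Op::Read => &self.read,
            Op::Write => &self.write,
        };
        !patterns.iter().any(|p| simple_match(p, path))
    }
}
```

With the rules from the example above, `Cargo.lock` can be read but not written, while `.env.local` cannot be read at all.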

`.botsecrets` -- secret value redaction

Declares which files contain secrets. fermata parses them, extracts values, and redacts every occurrence in tool output.

[files]
patterns = [".env", ".env.*", "secrets.*"]

[keys]
include = ["STRIPE_*", "MY_APP_SIGNING_*"]

[heuristic]
enabled = true

Built-in key patterns (*_KEY, *_SECRET, *_PASSWORD, *_TOKEN, DATABASE_URL, etc.) handle most projects without custom configuration.
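
A minimal sketch of how key-name patterns like these could be evaluated follows. It assumes that `[keys] include` entries extend the built-in list instead of replacing it, and it lists only the built-in patterns named above (the real list is longer). `key_matches` and `is_secret_key` are hypothetical names.

```rust
/// Match a key name against a pattern with one optional leading
/// or trailing `*` wildcard. Comparison is case-sensitive.
fn key_matches(pattern: &str, key: &str) -> bool {
    if let Some(suffix) = pattern.strip_prefix('*') {
        key.ends_with(suffix)
    } else if let Some(prefix) = pattern.strip_suffix('*') {
        key.starts_with(prefix)
    } else {
        pattern == key
    }
}

/// Decide whether the value stored under `key` should be treated as a secret.
/// Assumption: user-supplied `include` patterns are added to the built-ins.
fn is_secret_key(key: &str, include: &[&str]) -> bool {
    // Only the patterns named in this README; not the complete built-in list.
    const BUILT_IN: [&str; 5] = ["*_KEY", "*_SECRET", "*_PASSWORD", "*_TOKEN", "DATABASE_URL"];
    BUILT_IN.iter().any(|p| key_matches(p, key))
        || include.iter().any(|p| key_matches(p, key))
}
```

Under these assumptions, `STRIPE_API_KEY` is caught by the built-in `*_KEY` pattern, while `MY_APP_SIGNING_CERT` needs the custom `MY_APP_SIGNING_*` entry.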

See docs/configuration.md for the full reference.

Commands

# Check if a path is allowed
fermata check --op read /path/to/.env     # exit 1 = blocked
fermata check --op write src/main.rs       # exit 0 = allowed

# Run as a hook (reads harness JSON from stdin)
fermata hook --harness claude
fermata hook --harness claude --event post-tool-use

See docs/commands.md for the full CLI reference.
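
If you want to call `fermata check` from your own tooling, a thin wrapper around the exit code is enough. The sketch below relies only on what is shown above (exit 0 = allowed, exit 1 = blocked). Treating every other outcome as an error is an assumption; `CheckResult`, `interpret`, and `check` are hypothetical names.

```rust
use std::process::Command;

#[derive(Debug, PartialEq)]
enum CheckResult {
    Allowed,
    Blocked,
    /// Any other exit code, or termination by signal (None).
    Error(Option<i32>),
}

/// Map the exit code of `fermata check` to a result.
/// Assumption: codes other than 0 and 1 indicate a failure of the check itself.
fn interpret(code: Option<i32>) -> CheckResult {
    match code {
        Some(0) => CheckResult::Allowed,
        Some(1) => CheckResult::Blocked,
        other => CheckResult::Error(other),
    }
}

/// Run `fermata check --op <op> <path>`; requires fermata on PATH.
fn check(op: &str, path: &str) -> std::io::Result<CheckResult> {
    let status = Command::new("fermata")
        .args(&["check", "--op", op, path])
        .status()?;
    Ok(interpret(status.code()))
}
```

A caller that wants to fail closed can treat both `Blocked` and `Error` as a denial.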

Library API

fermata is also a Rust library:

use dirigent_fermata::core::secrets::{Manifest, Redactor, Scanner, SecretsConfig};

// Assumption: fermata's error types implement std::error::Error,
// so `?` can convert them into a boxed error.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load .botsecrets and build the redaction manifest
    let config = SecretsConfig::load("/path/to/project")?;
    let manifest = Manifest::discover(&config)?;

    // Known-value redaction (Aho-Corasick, sub-millisecond)
    let redactor = Redactor::from_manifest(&manifest);
    let clean = redactor.redact("DB_PASSWORD=hunter2");
    // -> "DB_PASSWORD=*****" (when hunter2 is a declared secret value)

    // Heuristic scanning (regex patterns)
    let scanner = Scanner::new(&config);
    let findings = scanner.scan("Found key: AKIA1234567890ABCDEF");
    // -> [Finding { pattern: "AWS Access Key", confidence: High, .. }]

    Ok(())
}
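
For intuition about what the heuristic scanner looks for, here is a dependency-free sketch that detects one shape only: `AKIA` followed by exactly 16 uppercase letters or digits. This is a simplified approximation written for this example, not the gitleaks rule and not fermata's scanner, and `find_aws_key_ids` is a hypothetical name.

```rust
/// Return the byte offsets of substrings shaped like an AWS access key ID:
/// "AKIA" followed by exactly 16 uppercase ASCII letters or digits.
/// Simplified approximation; real rule sets cover more prefixes and contexts.
fn find_aws_key_ids(text: &str) -> Vec<usize> {
    let bytes = text.as_bytes();
    let is_key_char = |b: u8| b.is_ascii_uppercase() || b.is_ascii_digit();
    let mut hits = Vec::new();
    for (start, _) in text.match_indices("AKIA") {
        let end = start + 20;
        if end > bytes.len() {
            continue;
        }
        let body_ok = bytes[start + 4..end].iter().all(|&b| is_key_char(b));
        // Require a boundary on both sides so longer identifiers are not flagged.
        let bounded_left = start == 0 || !is_key_char(bytes[start - 1]);
        let bounded_right = end == bytes.len() || !is_key_char(bytes[end]);
        if body_ok && bounded_left && bounded_right {
            hits.push(start);
        }
    }
    hits
}
```

Shape-based checks like this can flag values that were never declared in `.botsecrets`, at the cost of occasional false positives and misses.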

Security Model

fermata addresses a novel security concern: reveal -- whether secret values enter the LLM context. Traditional file-level access control operates on file identity (which file). Secret redaction operates on data content (which values). The reveal problem can only be solved at the data-content level.

Read docs/security-model.md for the full analysis, including the Reveal Triangle and defense-in-depth architecture.

Threat Model

fermata is a heuristic guard, not a sandbox. It defends against statistical agent behavior and prompt-driven mistakes -- not a deliberate adversary. This is a strength: the threat model is well-defined, and the boundaries are documented honestly.

Read docs/threat-model.md for what fermata catches, what it doesn't, and what to combine it with.

Harness Support

Harness	Status	Mechanism
Claude Code	Shipped	PreToolUse + PostToolUse hooks
Codex CLI	Planned	Pre-exec hook adapter
Gemini CLI	Planned	MCP server mode
Any MCP agent	Planned	MCP proxy wrapping existing servers

The policy engine and redaction logic are identical across all modes. Only the I/O adapter changes.
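
One way to picture that split is a small adapter trait in front of a shared decision function. Everything here is hypothetical: the trait, the `ToolCall` fields, and the toy `LineAdapter` wire format (`<tool> <path>` on one line) are invented for the example and do not reflect any real harness protocol or fermata's internals.

```rust
/// Harness-neutral view of a tool call. Field names are illustrative.
struct ToolCall {
    tool: String,
    path: String,
}

/// The only part that differs per harness: decoding input, encoding output.
trait HarnessAdapter {
    fn parse(&self, raw: &str) -> Option<ToolCall>;
    fn render_allow(&self) -> String;
    fn render_deny(&self, reason: &str) -> String;
}

/// Toy adapter with a made-up wire format: "<tool> <path>".
struct LineAdapter;

impl HarnessAdapter for LineAdapter {
    fn parse(&self, raw: &str) -> Option<ToolCall> {
        let (tool, path) = raw.trim().split_once(' ')?;
        Some(ToolCall { tool: tool.to_string(), path: path.to_string() })
    }
    fn render_allow(&self) -> String {
        "ALLOW".to_string()
    }
    fn render_deny(&self, reason: &str) -> String {
        format!("DENY {}", reason)
    }
}

/// Shared policy logic: knows nothing about any harness.
fn policy_denies(call: &ToolCall, blocked: &[&str]) -> Option<String> {
    if blocked.iter().any(|b| *b == call.path.as_str()) {
        Some(format!("{} may not touch {}", call.tool, call.path))
    } else {
        None
    }
}

/// Glue: decode with the adapter, decide with the shared engine, encode again.
/// Unparseable input is denied here, which is a design choice of this sketch.
fn handle(adapter: &dyn HarnessAdapter, raw: &str, blocked: &[&str]) -> String {
    match adapter.parse(raw) {
        None => adapter.render_deny("unparseable input"),
        Some(call) => match policy_denies(&call, blocked) {
            Some(reason) => adapter.render_deny(&reason),
            None => adapter.render_allow(),
        },
    }
}
```

Supporting another harness in this layout means writing one more `HarnessAdapter` implementation; `policy_denies` stays untouched.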

Status

v0.2 -- policy gate and secret filtering engine are production-ready. All core components are implemented and tested:

.botignore walker with gitignore semantics
botignore.toml with read/write/bash namespaces
Claude Code PreToolUse and PostToolUse adapters
.botsecrets config, manifest discovery, multi-format parser (.env, TOML, YAML, JSON)
Aho-Corasick known-value redactor
Heuristic scanner with gitleaks-derived patterns

The `.botsecrets` Vision

.botsecrets is designed to be the .gitignore of AI agent security: a simple, declarative, human-readable file that every project can drop in to protect its secrets from AI agents.

The format is harness-agnostic from day one. It declares what to protect, not how. The same .botsecrets works with Claude Code, Codex, Gemini, and any future harness that supports tool lifecycle hooks.

License

Licensed under either of Apache License, Version 2.0 or MIT License at your option.
