# 𝄐 fermata

**The security layer for AI coding agents.**

AI coding agents read files, run shell commands, and inspect output as part of normal work. When an agent reads `.env`, the secret values enter the LLM's context window. From there, they can leak into commits, PR descriptions, or API calls the agent makes. Once a value is in the context, the reveal cannot be undone.

fermata sits between the agent and its tools. It blocks operations that shouldn't happen, and scrubs secret values from the output of operations that should.

> [!CAUTION]
> **Alpha software.** Fermata is functional and in daily use by the author, but not widely tested across diverse environments. The core library and Claude Code hook adapters are production-grade; other features are earlier in maturity. Expect rough edges and breaking changes.

---

## The Problem

Traditional security blocks the file. But secrets also appear in shell output, log files, error messages, environment variable dumps, and indirect reads that bypass any access-control list.

<p align="center">
  <img src="threat-landscape.svg" alt="Where secrets leak from — blocking the file is necessary but not sufficient" width="720">
</p>

The actual concern is not "can the agent open this file?" but "do secret *values* enter the LLM context?" An agent can have read access to `.env` without the secret values being revealed — if the output is redacted before it reaches the model.

---

## How It Works

fermata interposes on the tool lifecycle at two points:

<p align="center">
  <img src="interception-flow.svg" alt="How fermata intercepts — PreToolUse blocks, PostToolUse redacts" width="720">
</p>

**PreToolUse:** Before the tool executes, fermata checks the operation against the `[policy]` section of `.botsecrets` and the rules in `.botignore`. A blocked write never happens. A blocked command never runs. Most harnesses already handle basic file blocking, but fermata catches what they miss and also works in permissive/yolo modes.
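As a rough illustration of the PreToolUse check, the sketch below matches a shell command against deny rules such as `curl * | sh`. It assumes that `*` matches any run of characters and that a rule must match the whole command; fermata's actual matching semantics are not specified in this README, and the names `wildcard_match` and `first_denied` are hypothetical.

```rust
/// Returns true if `text` matches `pattern`, where `*` in the pattern
/// matches any (possibly empty) run of characters.
/// Assumed semantics for illustration, not fermata's documented behavior.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    // Two-pointer wildcard match that backtracks to the most recent `*`.
    let (mut pi, mut ti) = (0usize, 0usize);
    let mut star: Option<usize> = None; // index of the last `*` seen in the pattern
    let mut mark = 0usize; // text position at which that `*` was seen
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Let the last `*` absorb one more character, then retry.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    // Any trailing `*` may match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// Returns the first deny rule that matches the command, if any.
/// A hook built on this would refuse to run the tool when it returns `Some`.
fn first_denied<'a>(deny: &[&'a str], command: &str) -> Option<&'a str> {
    deny.iter()
        .copied()
        .find(|rule| wildcard_match(rule, command.trim()))
}
```

With the deny list from the Quick Start example, `first_denied(&["rm -rf /", "curl * | sh"], "curl https://example.com/install.sh | sh")` returns the `curl * | sh` rule, while `rm -rf ./build` matches nothing. Whole-command matching is a deliberately strict choice for the sketch; a real policy engine may also need to consider substrings, pipelines, and quoting.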

**PostToolUse:** After the tool executes, fermata scans the output for secret values. Declared secrets (loaded from files matched by `.botsecrets`) are replaced using an Aho-Corasick automaton, which finds every verbatim occurrence of a declared value (zero false negatives for exact matches) in sub-millisecond time. A secondary heuristic scan catches undeclared secrets that match known formats (AWS keys, JWTs, GitHub PATs, database URLs). PostToolUse redaction is the primary defense layer.

This means `source .env && echo $DB_PASSWORD` is caught even though no file read was blocked — the secret value itself is scrubbed from the output before the LLM ever sees it.
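The idea of known-value redaction can be sketched in a few lines. This is not fermata's implementation: it swaps the Aho-Corasick automaton for one naive `str::replace` pass per secret so that the example needs only the standard library. The `[REDACTED]` placeholder and the function name `redact_known_values` are assumptions.

```rust
/// Replaces every verbatim occurrence of a declared secret value in `output`.
/// Naive stand-in for an Aho-Corasick automaton: one replace pass per secret
/// instead of a single pass over the text for all secrets at once.
fn redact_known_values(output: &str, secrets: &[&str]) -> String {
    // Process the longest values first, so that a secret containing another
    // secret as a prefix or substring is replaced whole rather than partially.
    let mut ordered: Vec<&str> = secrets
        .iter()
        .copied()
        .filter(|s| !s.is_empty()) // an empty pattern would match everywhere
        .collect();
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut redacted = output.to_string();
    for secret in ordered {
        redacted = redacted.replace(secret, "[REDACTED]");
    }
    redacted
}
```

If `DB_PASSWORD` were declared with the (made-up) value `hunter2-prod`, the output of `source .env && echo $DB_PASSWORD` would pass through this function as `[REDACTED]`, regardless of which command produced it. The sketch scans the text once per secret, which is why a multi-pattern automaton is the better fit when many values are declared.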

---

## Quick Start

### Install

```bash
cargo install --git https://git.g4b.org/dirigence/fermata --features cli
```

Requires a working [Rust toolchain](https://rustup.rs).

### Protect a project

Create a `.botsecrets` file at your project root — the primary (and usually only) config you need:

```toml
# .botsecrets
[files]
patterns = [".env", ".env.*", "secrets.*"]

[policy.write]
patterns = [".claude/**", "vendor/**", "*.lock"]

[policy.bash]
deny = ["rm -rf /", "curl * | sh"]
```

One file. The agent can read `.env` freely — fermata redacts the secret values from the output before they reach the model. Write protection and bash safety rules live in the same `.botsecrets` under `[policy]`.

fermata ships with built-in key patterns (`*_KEY`, `*_SECRET`, `*_PASSWORD`, `*_TOKEN`, `DATABASE_URL`, and ~25 more) that cover the common cases automatically.
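To make the key patterns concrete, here is a minimal sketch of how values might be collected from a `.env` file by key name. It assumes that a leading `*` means "ends with" and that a pattern without `*` must match exactly. The `.env` parsing is deliberately simplified, only five of the built-in patterns are listed, and the names `key_looks_secret` and `secret_values_from_dotenv` are hypothetical.

```rust
/// A subset of the key patterns named above; the full built-in list
/// is not reproduced here.
const KEY_PATTERNS: [&str; 5] = ["*_KEY", "*_SECRET", "*_PASSWORD", "*_TOKEN", "DATABASE_URL"];

/// Assumed semantics: a leading `*` means "ends with"; otherwise exact match.
fn key_looks_secret(key: &str) -> bool {
    KEY_PATTERNS.iter().any(|pat| match pat.strip_prefix('*') {
        Some(suffix) => key.ends_with(suffix),
        None => key == *pat,
    })
}

/// Parses simple `KEY=VALUE` lines and returns the values whose keys match.
/// Simplified: no multi-line values, escape sequences, or interpolation.
fn secret_values_from_dotenv(contents: &str) -> Vec<String> {
    let mut values = Vec::new();
    for line in contents.lines() {
        let line = line.trim();
        // Skip blank lines and comments.
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        // Tolerate the shell-style `export KEY=VALUE` form.
        let line = line.strip_prefix("export ").unwrap_or(line);
        if let Some((key, value)) = line.split_once('=') {
            // Drop surrounding single or double quotes from the value.
            let value = value.trim().trim_matches(|c: char| c == '"' || c == '\'');
            if key_looks_secret(key.trim()) && !value.is_empty() {
                values.push(value.to_string());
            }
        }
    }
    values
}
```

The returned values are what a redactor would then search for in tool output. Keys such as `PORT` or `DEBUG` are ignored, so their values remain visible to the agent.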

> **Note:** fermata also accepts `fermata.toml` as an alias for `.botsecrets`. The format is the same, and `.botsecrets` takes priority when both exist.

Optionally, add a `.botignore` for simple path blocking using gitignore syntax:

```gitignore
# .botignore (optional — complements .botsecrets)
vendor/
*.lock
```

### Wire into Claude Code

Add both hooks in `.claude/settings.json`:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Bash|Read|Edit|Write",
        "hooks": [
          { "type": "command", "command": "fermata hook --harness claude --event post-tool-use" }
        ]
      }
    ]
  }
}
```

PostToolUse redacts secret values from output before they reach the LLM. PreToolUse blocks forbidden writes and dangerous commands.

---

## What Fermata Does Not Do

fermata is a heuristic guard, not a sandbox. It defends against the typical, unintentional behavior of an agent: the unguided LLM reaching for `.env`, the overly broad glob, the stray `cat` of a credential file. It does not defend against a deliberate adversary trying to escape the box.

Things fermata cannot catch:

- **Network exfiltration** — an agent sending secrets via `curl` or `git push`. Use network-level controls (firewall, container networking) for this.
- **Kernel-level file access** — a process bypassing tool hooks entirely. Use container isolation or a sandbox for hard filesystem boundaries.
- **Character-by-character reconstruction** — an adversarial agent reassembling a secret across multiple tool calls.

These are honest boundaries, not future promises. See [docs/threat-model.md](docs/threat-model.md) for the full analysis.
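The reconstruction limitation follows directly from exact-match redaction, as this small sketch shows. The `redact` function is a naive stand-in for any verbatim redactor, not fermata's code, and the secret value is made up.

```rust
/// Naive verbatim redaction, standing in for any exact-match redactor.
fn redact(output: &str, secret: &str) -> String {
    output.replace(secret, "[REDACTED]")
}

/// Simulates an agent joining the outputs of several tool calls.
fn reassemble(chunks: &[String]) -> String {
    chunks.concat()
}
```

When the whole value appears in one output, `redact("pw=hunter2-prod", "hunter2-prod")` yields `pw=[REDACTED]`. If two separate tool calls each print half of the value (`hunter` and `2-prod`), neither output contains the secret verbatim, both pass through unchanged, and `reassemble` recovers the full value. No per-output scanner can close this gap on its own, which is why hard isolation is the recommended control against a deliberate adversary.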

---

## Configuration

**`.botsecrets`** is the primary configuration file. It declares which files contain secrets (`[files]`), how to redact them (`[redaction]`), and optionally embeds access-control policy (`[policy.write]`, `[policy.bash]`, `[policy.read]`). Most projects need only this file. `.botsecrets` can do everything `.botignore` can and more.

**`.botignore`** uses gitignore syntax to block reads and writes. Useful for monorepo subtree exclusion or teams that prefer gitignore syntax for simple path blocking. Complements `.botsecrets` but is not required.

See [docs/configuration.md](docs/configuration.md) for the full reference with examples.

---

## Status

v0.2 — secret filtering engine and policy gate are production-ready:

| Component | Status | Maturity |
|-----------|--------|----------|
| `.botsecrets` config + `[policy]` section | Done | production |
| `.botignore` walker (gitignore semantics) | Done | production |
| Known-value redactor (Aho-Corasick) | Done | production |
| Heuristic scanner (gitleaks-derived patterns) | Done | production |
| Multi-format secret parser (.env, TOML, YAML, JSON) | Done | production |
| Claude Code PreToolUse + PostToolUse adapters | Done | production |
| CLI: `fermata check` and `fermata hook` | Done | production |

Out of scope for v0.2: Codex / Gemini hook adapters, MCP server mode, audit log, filesystem watcher.

---

## Harness Support

| Harness | Status | Mechanism |
|---------|--------|-----------|
| Claude Code | Shipped | PreToolUse + PostToolUse hooks |
| Codex CLI | Planned | Hook adapter |
| Gemini CLI | Planned | MCP server mode |
| Any MCP agent | Planned | MCP proxy wrapping existing servers |
| Any shell-based hook | Supported | CLI exit codes |

The policy engine and redaction logic are identical across all modes. Only the I/O adapter changes.

---

## Background

fermata addresses a novel security concern — **reveal**: whether secret *values* enter the LLM context, independent of whether the agent can open a file. This distinction (file identity vs. data content) is explored in:

- [docs/security-model.md](docs/security-model.md) — the Reveal Triangle and defense-in-depth architecture
- [docs/threat-model.md](docs/threat-model.md) — what fermata catches at each detection level, and where it stops
- [docs/commands.md](docs/commands.md) — full CLI reference

The `.botsecrets` format is designed to be the **`.gitignore` of AI agent security**: a simple, declarative, harness-agnostic file that every project can drop in. The portable sections (`[files]`, `[keys]`, `[redaction]`, `[heuristic]`) declare *what* to protect; the `[policy]` section adds fermata-specific access control.

---

## Part of Dirigent

Fermata is the security subsystem of [Dirigent](https://git.g4b.org/dirigence/dirigent), a multi-agent orchestration platform. It is developed in the upstream monorepo and exported here for standalone use — no other Dirigent component is required.

---

## License

Licensed under either of

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE))
- MIT License ([LICENSE-MIT](LICENSE-MIT))

at your option.