# Fermata Threat Model

What fermata catches, what it doesn't, and why we tell you both.

---

## Framing: heuristic guard, not sandbox

Fermata is **not** a defence against a deliberate, optimising adversary trying to escape the box. It is a defence against:

- **Statistical agent behaviour** -- the playbook of moves an LLM-driven coding agent reaches for when solving a class of tasks, including its mistakes: overly-broad globs, "let me just `cat` everything", picking up a stray `.env` because it pattern-matched the rest of the directory.
- **Non-malicious harnesses** -- Claude Code, Codex, Gemini and similar agents that are not actively trying to circumvent policy.
- **Prompt-driven mistakes** -- the user's prompt or a tool's output nudges the model into touching something it shouldn't, without anyone meaning to.

This shifts the design centre from "what can a smart attacker hide?" to "what does an unguided LLM commonly do, and how do we catch the dangerous fraction of that?" Static analysis becomes tractable when grounded in **what is usual**, not what is adversarially possible.

Think of fermata as a guardrail on a mountain road, not an armoured wall. It catches the statistical case -- the distracted driver drifting toward the edge -- not the determined off-roader who drives around it.

---

## What fermata catches

Fermata's detection is organised by how the agent names the path or secret it is about to access. Lower levels are easier to detect statically; higher levels require progressively more runtime information.

### L0: Direct path arguments -- fully covered

The agent calls a path-typed tool (Read, Write, Edit) with an explicit file path. This is fermata's home turf.

> `Read({"file_path": "/home/user/.env"})` -- `.botignore` matches -- deny.

Both `.botignore` (gitignore semantics, deepest directory wins, negation patterns work as in git) and `botignore.toml` `[read]`/`[write]` glob lists are evaluated against the resolved path.
This covers the overwhelming majority of accidental secret-file reads, because most agents use direct path tools as their primary file access method.

### L1: Absolute paths inside Bash commands

The agent runs a shell command that mentions a full path -- one containing a directory separator or drive letter.

> `cat /home/user/.env`, `type C:\Users\me\secret.toml`

Fermata's path extraction engine recognises these as high-confidence path candidates and checks them against the same policy rules as L0. Windows paths are handled conservatively (backslashes appear in many non-path contexts), but the common cases are covered.

### L2: Bare filenames resolvable in the working directory

The agent names a file without any path separator, relying on the shell's working directory to resolve it.

> `cat .env`, `cat README.md`

These are flagged as low-confidence path candidates. When the agent's working directory is known, fermata resolves bare filenames against it and evaluates them through the policy. Low-confidence detections are biased toward asking the user rather than silently denying, because most bare tokens with file extensions are not actually sensitive paths -- false denials break the agent's workflow and are noticed immediately, while false allows may not be.

### L3: Static wildcards in commands

The agent uses shell glob patterns in a command that could expand to match sensitive files.

> `cat .env*`, `grep SECRET *.toml`, `find . -name '*.pem'`

Fermata handles this by checking whether the glob pattern in the command **overlaps** with any configured policy rule. For example, `.env*` in a command is checked against `.botignore` patterns -- if `.env` is protected, `.env*` is flagged. This does not require enumerating the filesystem; it is a pattern-vs-pattern overlap check.

The limitation: fermata can tell whether a glob *could* hit a protected file, but not whether it *actually will* in the current directory state.
For the heuristic threat model, "could hit" is the right bar.

### Secret redaction (PostToolUse)

Independent of path-level detection, fermata filters tool output before it enters the LLM context:

- **Known-value redaction** -- `.botsecrets` declares which files contain secrets. Fermata parses those files, extracts the secret values, and replaces them in all tool output using an Aho-Corasick automaton. Sub-millisecond performance. Zero false negatives for declared secrets.
- **Heuristic scanning** -- regex patterns (derived from gitleaks) detect undeclared secrets in tool output: AWS access keys, JWTs, GitHub PATs, database connection strings, and similar high-entropy tokens. This is a safety net for secrets not covered by the manifest.

Together these mean that even if a path-level check misses a read (the agent reaches `.env` through a route fermata didn't parse), the secret values themselves are scrubbed from the output before the LLM sees them.

---

## What fermata does NOT catch

These are honest boundaries, not future promises. Documenting them is part of the value -- it tells you where you still need other defences.

### L4: Shell indirection

Variable expansion (`cat $SECRET_FILE`), command substitution (`cat $(ls .env*)`), tool composition (`xargs cat < file_list`), and pipelines that route content through intermediate steps.

A real `bash` parser would handle the syntax, but cannot resolve runtime values. Fermata sees the command string, not the expanded result. It can detect the presence of substitution or variable-dereference syntax and escalate to an ask, but it cannot statically evaluate what the command will actually do.

**Why this is acceptable for the threat model:** LLM agents overwhelmingly prefer direct, readable commands. Shell indirection is rare in practice -- agents reach for `cat /path/to/file`, not `cat $(echo /path/to/file)`. The statistical frequency of L4 evasion in non-adversarial agent behaviour is low.
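
The "detect the syntax, then ask" behaviour described for L4 can be illustrated with a minimal Python sketch. This is not fermata's implementation: the pattern table, the function names `find_indirection` and `decide`, and the `"ask"`/`"continue"` return values are all hypothetical, and the regexes are deliberately naive (they ignore quoting, for instance).

```python
import re

# Hypothetical patterns for syntax whose result is only known at run time.
# Deliberately naive: quoting and escaping are not taken into account.
INDIRECTION_PATTERNS = {
    "command substitution": re.compile(r"\$\(|`"),
    "variable expansion": re.compile(r"\$\{?[A-Za-z_]"),
    "xargs composition": re.compile(r"(^|[\s|;&])xargs(\s|$)"),
}


def find_indirection(command: str) -> list[str]:
    """Return the names of the indirection constructs found in the command string."""
    return [
        name
        for name, pattern in INDIRECTION_PATTERNS.items()
        if pattern.search(command)
    ]


def decide(command: str) -> str:
    """Escalate to "ask" when indirection is present; otherwise let other checks run."""
    return "ask" if find_indirection(command) else "continue"
```

With these assumptions, `decide("cat $SECRET_FILE")` escalates to an ask, while `decide("cat /home/user/.env")` returns `"continue"` and leaves the command to the ordinary path checks. The sketch only reports that indirection syntax is present; it makes no attempt to work out what the command expands to, which is exactly the boundary described above.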

### L5: Derived or out-of-band names

The agent computes a path from sources fermata never sees: its own reasoning, an HTTP response, a previous tool's stdout that the harness reused as input.

> The agent reads a config file, learns about `/etc/app/signing.key`, then reads it in a subsequent call.

**Important:** most derived-name scenarios collapse back to L0 at the moment of the dangerous call. The second read is a direct path-typed tool call, and `.botignore` catches it normally. Fermata does not need to predict the derivation; it just needs the rule to match the resolved path when the call happens.

The genuinely difficult sub-case is when a derived name is used inside L4-style shell indirection -- which is no harder than L4 already is.

### L6: Application-protocol exfiltration -- out of scope by design

The agent uses a legitimate tool (HTTP client, git push, shell with network access) to move forbidden content to a destination fermata cannot inspect. The content travels as the *payload* of an allowed call, not as a path.

> The agent reads a secret (caught and redacted at L0), but then reconstructs it character-by-character and sends it via an HTTP request.

This is the network-firewall analogy: **you cannot block exfiltration over an allowed application protocol from outside the application.** A packet filter cannot inspect encrypted traffic without MITM; a static policy gate cannot inspect the semantic intent of an allowed tool's payload.

Fermata marks this as permanently out of scope. This is not a failure -- it is a design boundary. Trying to solve L6 within a static policy gate would require either breaking the tool's legitimate use (blocking all network access) or implementing deep content inspection that defeats the latency and side-effect-free constraints that make fermata practical.
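
To make the L6 boundary concrete, here is a small Python illustration of why value matching does not see a transformed payload. It is a sketch under stated assumptions, not fermata's redactor: `redact` is a hypothetical stand-in that uses plain `str.replace` instead of an Aho-Corasick automaton, and the secret value is made up.

```python
import base64


def redact(output: str, secrets: list[str]) -> str:
    """Naive known-value redaction: replace each declared secret verbatim."""
    for secret in secrets:
        output = output.replace(secret, "[REDACTED]")
    return output


secret = "hunter2-api-key"  # made-up value, for illustration only
encoded = base64.b64encode(secret.encode()).decode()

# The verbatim occurrence is caught and replaced.
verbatim = redact(f"API_KEY={secret}", [secret])

# The re-encoded form contains no verbatim match, so it passes through unchanged.
transformed = redact(f"payload={encoded}", [secret])
```

The first call yields `API_KEY=[REDACTED]`; the second returns its input untouched, even though the payload decodes straight back to the secret. Any matcher that works on literal values has this property, which is why the document treats payload-level exfiltration as a job for network controls and not for a static policy gate.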

---

## Coverage summary

| Level | Description | Detection | Confidence |
|-------|-------------|-----------|------------|
| L0 | Direct path argument | Full policy check | High |
| L1 | Absolute path in Bash | Path extraction + policy | High |
| L2 | Bare filename in CWD | CWD resolution + policy | Medium |
| L3 | Wildcards in commands | Pattern overlap check | Medium |
| L4 | Shell indirection | Syntax detection only | Low |
| L5 | Derived names | Collapses to L0 at call time | Depends on final call |
| L6 | Protocol exfiltration | Out of scope | N/A |

Secret redaction (PostToolUse) operates as a separate, independent layer. Even when path-level detection at L1-L4 misses a read, known-value redaction catches the secret values in the output.

---

## Practical implications

Fermata is one layer in a defence-in-depth stack. Here is what it is good at and what to pair it with.

### Fermata alone gives you

- **Accidental secret exposure prevention.** The most common case: agent reads `.env`, `secrets.toml`, or a credential file because it matched a directory listing. Fermata blocks this at L0 with zero configuration beyond `.botignore`.
- **Known-secret scrubbing.** Even if a secret leaks through an indirect read, declared secrets are redacted from LLM context. The agent never "learns" the value.
- **Heuristic secret detection.** Undeclared secrets matching common formats (AWS keys, tokens, connection strings) are flagged in tool output.
- **Command guardrails.** Dangerous shell patterns (`rm -rf /`, `curl | sh`) are caught by configurable deny lists.

### Combine fermata with

- **Network-level controls** for L6 scenarios. If your threat model includes data exfiltration by a compromised or misbehaving agent, restrict outbound network access at the OS or container level. Fermata cannot inspect what an HTTP client sends.
- **Sandboxing / containerisation** for hard isolation. Fermata is a policy gate, not a sandbox.
  If you need filesystem isolation guarantees (not just heuristic blocking), run the agent in a container with a restricted mount.
- **Secret rotation** as a backstop. If a secret does leak into LLM context through an uncovered path, rotating the secret limits the blast radius. Fermata's redaction makes this unlikely for declared secrets, but rotation is good practice regardless.
- **Audit logging** for visibility. Fermata's decisions can be logged for review. Pairing with an audit trail lets you detect patterns that individual checks might miss.

### Design constraints worth knowing

- **Latency budget.** Fermata fires on every tool call and targets single-digit milliseconds. Anything requiring filesystem enumeration or network calls at check time is either opt-in or out of scope.
- **Side-effect-free.** Fermata does not open, move, stat, or modify files outside of its policy resolution. It operates purely on the information the harness provides.
- **False deny vs false allow.** A false deny breaks the agent's workflow and is noticed immediately. A false allow may go undetected. Low-confidence detections bias toward asking the user, not silently denying.
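
The false-deny versus false-allow trade-off in the last bullet amounts to a small decision rule, sketched below in Python. The `Confidence` enum, the `decide` function, and the three outcome strings are hypothetical names chosen for illustration; the source only states that high-confidence policy matches are denied and low-confidence ones are turned into a question for the user.

```python
from enum import Enum


class Confidence(Enum):
    """How sure the path extractor is that a token really is a path (hypothetical)."""

    HIGH = "high"
    LOW = "low"


def decide(matches_policy: bool, confidence: Confidence) -> str:
    """Map a policy match and its confidence to an outcome."""
    if not matches_policy:
        return "allow"
    if confidence is Confidence.HIGH:
        # Explicit path that hits a protected rule, e.g. Read on /home/user/.env.
        return "deny"
    # Low-confidence hit: asking avoids both a silent false deny and a silent false allow.
    return "ask"
```

Under these assumptions, `decide(True, Confidence.HIGH)` gives `"deny"`, `decide(True, Confidence.LOW)` gives `"ask"`, and anything that matches no rule is allowed. The point of the middle branch is that a wrong guess on a bare filename costs the user one confirmation prompt instead of a broken workflow or an unnoticed leak.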