✨ feat(fermata): add secret filtering engine — the security brain
Implement Goals 1–3 and 5 from the reveal-layer security brain goal.
fermata now detects, redacts, and scans for secrets in AI agent tool
output, filling the ecosystem gap where no coding agent filters secrets
post-read.
New core/secrets/ module:
- config.rs: .botsecrets TOML format with hierarchical merge and ~40
built-in key patterns
- parser.rs: multi-format secret file parser (.env, TOML, YAML, JSON,
Python assignments, Java properties)
- manifest.rs: file discovery + parsing → known-secrets set
- redactor.rs: Aho-Corasick multi-pattern replacement with 4 styles
- scanner.rs: RegexSet heuristic detection with 35 gitleaks-derived
patterns (MIT) and Shannon entropy filtering
- patterns.rs: curated rules for AWS, GitHub, Stripe, Slack, JWT, etc.
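The Shannon entropy filter mentioned for scanner.rs can be pictured roughly as follows. This is a minimal sketch, not the code in this commit: the names `shannon_entropy` and `passes_entropy_filter` and the idea of a single `min_bits` threshold are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Shannon entropy of `s`, in bits per character.
/// Returns 0.0 for an empty string.
fn shannon_entropy(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0;
    }
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / total as f64;
            -p * p.log2()
        })
        .sum()
}

/// Hypothetical filter: keep a regex candidate only if it looks random enough.
/// `min_bits` is an assumed tuning knob, not a value taken from fermata.
fn passes_entropy_filter(candidate: &str, min_bits: f64) -> bool {
    shannon_entropy(candidate) >= min_bits
}
```

With a threshold around 3 bits, a repeated dictionary word would likely be dropped while a random-looking token would be kept. The threshold fermata actually uses is not stated here.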
Hook integration:
- fermata hook --event post-tool-use reads tool output, runs redactor +
scanner, returns updatedToolOutput for Claude Code
- Backward compatible: --event pre-tool-use (default) unchanged
- Fail-open: errors produce {} and exit 0
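The fail-open rule can be sketched as a small helper. This is an illustration only: `HookOutput` and `fail_open` are hypothetical names, and the binary in this commit inlines the same pattern at each error site rather than calling a helper. Per the diff, usage errors (unknown harness, unknown event, stdin read failure) still exit with code 2.

```rust
use std::fmt::Display;

/// What a hook run hands back to the harness: stdout bytes and an exit code.
/// Hypothetical type, used here only to make the behaviour testable.
struct HookOutput {
    stdout: String,
    exit_code: u8,
}

/// Fail-open wrapper: on success pass the rendered JSON through,
/// on any error log to stderr and emit an empty JSON object with exit 0
/// so the harness continues with the original tool output.
fn fail_open<E: Display>(result: Result<String, E>) -> HookOutput {
    match result {
        Ok(json) => HookOutput { stdout: json, exit_code: 0 },
        Err(e) => {
            eprintln!("fermata: {e}");
            HookOutput { stdout: "{}".to_string(), exit_code: 0 }
        }
    }
}
```

The trade-off is deliberate: a broken config never blocks the agent, but it also means a failure in the secrets pipeline lets unredacted output through, so stderr is the only signal.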
Library API:
- Redactor::new(manifest, style).redact(text) → RedactedText
- Scanner::new(config).scan(text) → Vec<Finding>
- Compiles without CLI feature for embedding in other crates
195 tests (130 new), all passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -1,19 +1,26 @@
|
|||||||
# Package: dirigent_fermata
|
# Package: dirigent_fermata
|
||||||
|
|
||||||
Harness-agnostic policy gate for AI coding agents.
|
Harness-agnostic policy gate and secret filtering engine for AI coding agents.
|
||||||
|
|
||||||
## Quick Facts
|
## Quick Facts
|
||||||
- **Type**: Library + binary (`fermata`)
|
- **Type**: Library + binary (`fermata`)
|
||||||
- **Main Entry**: `src/lib.rs`, `src/bin/fermata.rs`
|
- **Main Entry**: `src/lib.rs`, `src/bin/fermata.rs`
|
||||||
- **Dependencies**: `ignore`, `toml`, `regex`, `globset`, `serde`, `clap` (cli feature)
|
- **Dependencies**: `ignore`, `toml`, `regex`, `globset`, `serde`, `clap` (cli feature), `aho-corasick`, `serde_yaml`
|
||||||
- **Status**: v0.1 — library + CLI + Claude hook adapter
|
- **Status**: v0.2 — policy gate + secret filtering engine
|
||||||
|
|
||||||
## Layering
|
## Layering
|
||||||
|
|
||||||
Three concentric layers; nothing inner imports from anything outer.
|
Three concentric layers; nothing inner imports from anything outer.
|
||||||
|
|
||||||
- **`core/`** — harness-unaware, transport-unaware, sync. Types (`Op`, `Decision`), `.botignore` walker, `botignore.toml` parser, `Policy::check` / `check_command`, path extraction. Sync, no tokio.
|
- **`core/`** — harness-unaware, transport-unaware, sync. Types (`Op`, `Decision`), `.botignore` walker, `botignore.toml` parser, `Policy::check` / `check_command`, path extraction. Sync, no tokio.
|
||||||
- **`harness/`** — `HarnessAdapter` trait over a normalized `ToolCall`. Each adapter (Claude, future Codex, etc.) lives in its own submodule, feature-gated.
|
- **`core/secrets/`** — the secret filtering engine:
|
||||||
|
- `config.rs` — `.botsecrets` TOML parser and hierarchical resolution (user, project, local override).
|
||||||
|
- `manifest.rs` — discovers secret-containing files from `.botsecrets` patterns and loads their content for redaction.
|
||||||
|
- `parser.rs` — multi-format secret file parser (`.env`, TOML, YAML, JSON). Extracts key-value pairs where the value is a secret.
|
||||||
|
- `patterns.rs` — built-in key name patterns (~30 universal patterns like `*_KEY`, `*_SECRET`, `*_PASSWORD`) and gitleaks-derived regex patterns for heuristic scanning.
|
||||||
|
- `redactor.rs` — `Redactor` builds an Aho-Corasick automaton from known secret values and replaces them in arbitrary text. Sub-millisecond performance.
|
||||||
|
- `scanner.rs` — `Scanner` applies heuristic regex patterns to detect secrets not covered by the known-value manifest (entropy-based and format-based detection).
|
||||||
|
- **`harness/`** — `HarnessAdapter` trait over a normalized `ToolCall` (PreToolUse) and `PostToolUsePayload` (PostToolUse). Each adapter (Claude, future Codex, etc.) lives in its own submodule, feature-gated. PostToolUse enables output redaction via `updatedToolOutput` before content enters the LLM context.
|
||||||
- **`bin/fermata.rs`** — only place where `clap`, stdio, and exit codes appear.
|
- **`bin/fermata.rs`** — only place where `clap`, stdio, and exit codes appear.
|
||||||
|
|
||||||
## Release Model
|
## Release Model
|
||||||
@@ -24,11 +31,13 @@ Developed in this monorepo; planned to be exported as a standalone repo in the f
|
|||||||
|
|
||||||
`dirigent_tools` depends on `dirigent_fermata`, never the reverse. Fermata must remain usable as a standalone hook/MCP without dragging in the in-process ACP tool runtime.
|
`dirigent_tools` depends on `dirigent_fermata`, never the reverse. Fermata must remain usable as a standalone hook/MCP without dragging in the in-process ACP tool runtime.
|
||||||
|
|
||||||
## Out of scope (v0.1)
|
## Out of scope (v0.2)
|
||||||
|
|
||||||
Codex / Gemini hook adapters, MCP server mode, PostToolUse envelope, `readonly_only` Bash mode, audit log, filesystem watcher. Each is a future task with its own plan.
|
Codex / Gemini hook adapters, MCP server mode, `readonly_only` Bash mode, audit log, filesystem watcher, context taint tracking. Each is a future task with its own plan.
|
||||||
|
|
||||||
## See also
|
## See also
|
||||||
|
|
||||||
- `docs/tools/fermata.md` — Dirigent integration plan
|
- `docs/tools/fermata.md` — Dirigent integration plan
|
||||||
- `docs/workpad/brainstorm/fermata.md` — canonical product spec
|
- `docs/workpad/brainstorm/fermata.md` — canonical product spec
|
||||||
|
- `docs/architecture/fermata-security-philosophy.md` — security philosophy and the reveal triangle
|
||||||
|
- `.botsecrets` format: `core/secrets/config.rs` — the `.gitignore` of AI agent secret protection
|
||||||
|
|||||||
@@ -19,6 +19,7 @@ path = "src/bin/fermata.rs"
|
|||||||
required-features = ["cli"]
|
required-features = ["cli"]
|
||||||
|
|
||||||
[dependencies]
|
[dependencies]
|
||||||
|
aho-corasick = "1.1"
|
||||||
globset = "0.4"
|
globset = "0.4"
|
||||||
ignore = "0.4"
|
ignore = "0.4"
|
||||||
walkdir = "2"
|
walkdir = "2"
|
||||||
@@ -26,6 +27,7 @@ toml = "0.8"
|
|||||||
regex = "1.10"
|
regex = "1.10"
|
||||||
serde = { version = "1.0", features = ["derive"] }
|
serde = { version = "1.0", features = ["derive"] }
|
||||||
serde_json = "1.0"
|
serde_json = "1.0"
|
||||||
|
serde_yaml = "0.9"
|
||||||
thiserror = "2.0"
|
thiserror = "2.0"
|
||||||
clap = { version = "4.5", features = ["derive"], optional = true }
|
clap = { version = "4.5", features = ["derive"], optional = true }
|
||||||
|
|
||||||
|
|||||||
@@ -1,44 +1,40 @@
|
|||||||
# 𝄐 dirigent_fermata
|
# dirigent_fermata
|
||||||
|
|
||||||
**A fast, harness-agnostic policy gate for AI coding agents.**
|
**A fast, harness-agnostic policy gate and secret filtering engine for AI coding agents.**
|
||||||
|
|
||||||
Drop a `.botignore` file in your project root. Fermata reads it and blocks your agent from reading, writing, or running things it shouldn't — before the tool call happens.
|
Drop a `.botignore` to control what your agent can touch. Drop a `.botsecrets` to control what secret values your agent can see. Fermata enforces both -- before and after tool calls happen.
|
||||||
|
|
||||||
```
|
|
||||||
.env
|
|
||||||
.env.*
|
|
||||||
secrets/**
|
|
||||||
conf/settings.local.yaml
|
|
||||||
```
|
|
||||||
|
|
||||||
That's all it takes.
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Why Fermata
|
## Why Fermata
|
||||||
|
|
||||||
AI coding agents are powerful, but they don't have an innate sense of "don't touch `.env`." Native hook systems in tools like Claude Code let you intercept every file operation — but wiring up your own secure, fast hook for each project is friction. Fermata is that hook, ready to drop in.
|
AI coding agents don't have an innate sense of "don't touch `.env`" -- and even if you block the file, they can still see its contents through shell output, log files, and indirect reads. Fermata solves both problems:
|
||||||
|
|
||||||
- **Fast** — written in Rust; ~1–5ms per call. Hooks fire on every read, write, and bash operation. Python cold-start (~50–150ms) compounds fast. Fermata doesn't.
|
- **Policy gate** -- `.botignore` blocks reads, writes, and dangerous commands before they execute (PreToolUse).
|
||||||
- **Familiar syntax** — `.botignore` uses gitignore rules via the `ignore` crate (the same engine powering ripgrep).
|
- **Secret filtering** -- `.botsecrets` redacts secret values from tool output before they enter the LLM context (PostToolUse).
|
||||||
- **Per-operation control** — `botignore.toml` lets you block writes to `vendor/**` while still allowing reads, or deny specific bash patterns without touching path rules.
|
- **Fast** -- Rust, Aho-Corasick automaton for redaction, ~1-5ms per call.
|
||||||
- **Harness-agnostic** — plain CLI exit codes work from any shell wrapper; the hook adapter speaks Claude Code's JSON natively.
|
- **Familiar syntax** -- `.botignore` uses gitignore rules; `.botsecrets` uses TOML with glob patterns.
|
||||||
|
- **Harness-agnostic** -- hook adapters for Claude Code (shipped), Codex and Gemini (planned), MCP proxy (planned).
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Status: v0.1
|
## Status: v0.2
|
||||||
|
|
||||||
| Component | Status |
|
| Component | Status |
|
||||||
|-----------|--------|
|
|-----------|--------|
|
||||||
| Library (`Op`, `Decision`, `Policy::check`, `Policy::check_command`) | Done |
|
| Library (`Policy::check`, `Policy::check_command`) | Done |
|
||||||
| `.botignore` walker (project-root walk-up, gitignore semantics) | Done |
|
| `.botignore` walker (gitignore semantics) | Done |
|
||||||
| `botignore.toml` parser (read / write / bash namespaces) | Done |
|
| `botignore.toml` parser (read / write / bash namespaces) | Done |
|
||||||
| Path identification heuristics | Done |
|
| CLI: `fermata check` / `fermata hook` | Done |
|
||||||
| CLI: `fermata check <path>...` | Done |
|
|
||||||
| CLI: `fermata hook --harness claude` | Done |
|
|
||||||
| Claude Code PreToolUse adapter | Done |
|
| Claude Code PreToolUse adapter | Done |
|
||||||
|
| Claude Code PostToolUse adapter (output redaction) | Done |
|
||||||
|
| `.botsecrets` config parser | Done |
|
||||||
|
| Secret manifest discovery and loading | Done |
|
||||||
|
| Multi-format secret file parser (.env, TOML, YAML, JSON) | Done |
|
||||||
|
| `Redactor` (known-value Aho-Corasick replacement) | Done |
|
||||||
|
| `Scanner` (heuristic regex + gitleaks patterns) | Done |
|
||||||
|
|
||||||
Out of scope for v0.1: Codex / Gemini hook adapters, MCP server mode, audit log, filesystem watcher.
|
Out of scope for v0.2: Codex / Gemini hook adapters, MCP proxy mode, audit log, filesystem watcher.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -50,87 +46,43 @@ From source (this monorepo):
|
|||||||
cargo install --path crates/dirigent_fermata --features cli
|
cargo install --path crates/dirigent_fermata --features cli
|
||||||
```
|
```
|
||||||
|
|
||||||
This installs the `fermata` binary into `~/.cargo/bin/`.
|
---
|
||||||
|
|
||||||
|
## Secret Filtering
|
||||||
|
|
||||||
|
Fermata's secret filtering operates in three layers:
|
||||||
|
|
||||||
|
1. **Policy gate** (PreToolUse) -- `.botignore` blocks direct access to sensitive files. Catches ~90% of accidental reads.
|
||||||
|
2. **Known-value redaction** (PostToolUse) -- `.botsecrets` declares which files contain secrets. Fermata parses them, extracts values, and replaces them in all tool output using an Aho-Corasick automaton. No false negatives for declared secrets that appear verbatim in the output.
|
||||||
|
3. **Heuristic scanning** (PostToolUse) -- regex patterns derived from gitleaks detect undeclared secrets (AWS keys, JWTs, GitHub PATs, database URLs). Safety net for secrets not covered by the manifest.
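To make layer 2 concrete, here is a minimal sketch of known-value replacement. It is not fermata's `Redactor`: it swaps the Aho-Corasick automaton for repeated `str::replace` calls so that it needs only the standard library, and the name `redact_known` and its signature are assumptions for illustration.

```rust
/// Replace every occurrence of each known secret value with `placeholder`.
/// Returns the redacted text and the number of replacements made.
///
/// Naive stand-in for an Aho-Corasick automaton: one pass over the text
/// per secret instead of a single pass for all secrets.
fn redact_known(text: &str, secrets: &[&str], placeholder: &str) -> (String, usize) {
    // Longest first, so a secret that contains a shorter one is replaced whole.
    let mut ordered: Vec<&str> = secrets
        .iter()
        .copied()
        .filter(|s| !s.is_empty())
        .collect();
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut out = text.to_string();
    let mut count = 0;
    for s in ordered {
        // Note: if the placeholder itself contained a later secret,
        // it would be replaced again; a real implementation avoids this.
        count += out.matches(s).count();
        out = out.replace(s, placeholder);
    }
    (out, count)
}
```

A single-pass automaton avoids both the repeated scans and the placeholder re-matching caveat noted in the comment, which is presumably why the crate uses one.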
|
||||||
|
|
||||||
|
### `.botsecrets` format
|
||||||
|
|
||||||
|
Create a `.botsecrets` file at your project root:
|
||||||
|
|
||||||
|
```toml
|
||||||
|
# Files that contain secrets -- fermata parses these and redacts values
|
||||||
|
[files]
|
||||||
|
patterns = [".env", ".env.*", "secrets.*"]
|
||||||
|
|
||||||
|
# Additional secret key names (built-in defaults cover *_KEY, *_SECRET, etc.)
|
||||||
|
[keys]
|
||||||
|
include = ["STRIPE_*", "MY_APP_SIGNING_*"]
|
||||||
|
|
||||||
|
# Heuristic scanning on all tool output
|
||||||
|
[heuristic]
|
||||||
|
enabled = true
|
||||||
|
```
|
||||||
|
|
||||||
|
That's the typical case. Built-in key patterns (`*_KEY`, `*_SECRET`, `*_PASSWORD`, `*_TOKEN`, `DATABASE_URL`, etc.) handle most projects without custom configuration.
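As a rough picture of how key-name patterns such as `*_KEY` or `STRIPE_*` select entries, consider the sketch below. It is a simplification under stated assumptions: `key_matches` and `is_secret_key` are hypothetical names, only a single `*` wildcard is supported, and matching is case-sensitive. Fermata's own matching may differ.

```rust
/// Match `key` against a pattern containing at most one `*` wildcard.
/// A second `*` would be treated as a literal character (a limitation of this sketch).
fn key_matches(pattern: &str, key: &str) -> bool {
    match pattern.split_once('*') {
        // No wildcard: exact match only.
        None => pattern == key,
        // One wildcard: the key must start with the prefix, end with the
        // suffix, and be long enough that the two do not overlap.
        Some((prefix, suffix)) => {
            key.len() >= prefix.len() + suffix.len()
                && key.starts_with(prefix)
                && key.ends_with(suffix)
        }
    }
}

/// True if any pattern in `patterns` matches `key`.
fn is_secret_key(key: &str, patterns: &[&str]) -> bool {
    patterns.iter().any(|p| key_matches(p, key))
}
```

Only values whose keys match would be added to the known-secrets set; non-matching entries in the same file (for example `PORT=8080`) would be left alone.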
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
### Checking a path
|
### Claude Code hook configuration
|
||||||
|
|
||||||
```bash
|
Add both PreToolUse and PostToolUse hooks in `.claude/settings.json`:
|
||||||
fermata check --op read /path/to/.env
|
|
||||||
# exit 1 — blocked
|
|
||||||
# stderr: blocked by rule ".env" in /your/project/.botignore
|
|
||||||
|
|
||||||
fermata check --op write /path/to/src/main.rs
|
|
||||||
# exit 0 — allowed
|
|
||||||
```
|
|
||||||
|
|
||||||
### Claude Code hook adapter
|
|
||||||
|
|
||||||
```bash
|
|
||||||
fermata hook --harness claude < hook_payload.json
|
|
||||||
```
|
|
||||||
|
|
||||||
Reads the PreToolUse JSON from stdin, extracts the tool name and path or command, applies policy, and emits the Claude-shaped JSON response. The hook's exit code is always `0`; the verdict is in the JSON body.
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## Configuration
|
|
||||||
|
|
||||||
### `.botignore` — the 80% case
|
|
||||||
|
|
||||||
Create a `.botignore` at your project root. Gitignore syntax. Blocks both reads and writes.
|
|
||||||
|
|
||||||
```gitignore
|
|
||||||
# Secrets
|
|
||||||
.env
|
|
||||||
.env.*
|
|
||||||
secrets/**
|
|
||||||
|
|
||||||
# Local config overrides
|
|
||||||
conf/settings.local.yaml
|
|
||||||
conf/settings.test.yaml
|
|
||||||
|
|
||||||
# Generated files — let the tools rebuild them, not patch them
|
|
||||||
dist/**
|
|
||||||
*.lock
|
|
||||||
```
|
|
||||||
|
|
||||||
Fermata walks up from the target file to find the nearest `.botignore`, so it works correctly even when an agent changes directory.
|
|
||||||
|
|
||||||
### `botignore.toml` — per-operation rules
|
|
||||||
|
|
||||||
For cases where `.botignore`'s uniform read+write block isn't granular enough:
|
|
||||||
|
|
||||||
```toml
|
|
||||||
[read]
|
|
||||||
# Block reading secrets outright
|
|
||||||
patterns = [".env*", "secrets/**", "conf/settings.local.yaml"]
|
|
||||||
|
|
||||||
[write]
|
|
||||||
# Allow reading vendor code but block patching it
|
|
||||||
patterns = ["vendor/**", "*.lock"]
|
|
||||||
|
|
||||||
[bash]
|
|
||||||
# Hard-block destructive or exfiltrating commands
|
|
||||||
deny = [
|
|
||||||
"rm -rf /",
|
|
||||||
"curl * | sh",
|
|
||||||
"git push --force*",
|
|
||||||
]
|
|
||||||
# Ask before any removal or move
|
|
||||||
ask = ["rm:*", "mv:*"]
|
|
||||||
# Narrow allowlist for automated commands
|
|
||||||
allow_prefixes = ["make test", "git checkout:*"]
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
|
||||||
|
|
||||||
## How it fits into Claude Code
|
|
||||||
|
|
||||||
Add fermata as a `PreToolUse` hook in `.claude/settings.json`:
|
|
||||||
|
|
||||||
```json
|
```json
|
||||||
{
|
{
|
||||||
@@ -139,10 +91,15 @@ Add fermata as a `PreToolUse` hook in `.claude/settings.json`:
|
|||||||
{
|
{
|
||||||
"matcher": "Bash|Read|Edit|Write",
|
"matcher": "Bash|Read|Edit|Write",
|
||||||
"hooks": [
|
"hooks": [
|
||||||
{
|
{ "type": "command", "command": "fermata hook --harness claude" }
|
||||||
"type": "command",
|
]
|
||||||
"command": "fermata hook --harness claude"
|
|
||||||
}
|
}
|
||||||
|
],
|
||||||
|
"PostToolUse": [
|
||||||
|
{
|
||||||
|
"matcher": "Bash|Read|Edit|Write",
|
||||||
|
"hooks": [
|
||||||
|
{ "type": "command", "command": "fermata hook --harness claude --event post-tool-use" }
|
||||||
]
|
]
|
||||||
}
|
}
|
||||||
]
|
]
|
||||||
@@ -150,50 +107,68 @@ Add fermata as a `PreToolUse` hook in `.claude/settings.json`:
|
|||||||
}
|
}
|
||||||
```
|
```
|
||||||
|
|
||||||
When Claude attempts a `Read(.env)`, `Write(vendor/foo.js)`, or `Bash(rm ./secrets/key.pem)`, fermata intercepts the call, checks policy, and returns a deny with a human-readable reason — before any damage is done.
|
PreToolUse blocks forbidden operations. PostToolUse redacts secret values from tool output before they reach the LLM.
|
||||||
|
|
||||||
|
### Checking a path
|
||||||
|
|
||||||
|
```bash
|
||||||
|
fermata check --op read /path/to/.env
|
||||||
|
# exit 1 -- blocked
|
||||||
|
|
||||||
|
fermata check --op write /path/to/src/main.rs
|
||||||
|
# exit 0 -- allowed
|
||||||
|
```
|
||||||
|
|
||||||
|
### Library API
|
||||||
|
|
||||||
|
```rust
|
||||||
|
use dirigent_fermata::core::secrets::{Manifest, Redactor, Scanner, SecretsConfig};
|
||||||
|
|
||||||
|
// Load .botsecrets config and build the manifest
|
||||||
|
let root = std::path::PathBuf::from("/path/to/project");
let config = SecretsConfig::load(&root)?;
|
||||||
|
let manifest = Manifest::build(&config, &root)?;
|
||||||
|
|
||||||
|
// Known-value redaction (Aho-Corasick, sub-millisecond)
|
||||||
|
let redactor = Redactor::new(&manifest, config.redaction.style);
|
||||||
|
let clean = redactor.redact("DB_PASSWORD=hunter2\nAPI_KEY=sk-abc123");
|
||||||
|
// clean.text -> "DB_PASSWORD=*****\nAPI_KEY=*****" (exact placeholder depends on the configured style)
|
||||||
|
|
||||||
|
// Heuristic scanning (regex patterns)
|
||||||
|
let scanner = Scanner::new(&config.heuristic)?;
|
||||||
|
let findings = scanner.scan("Found key: AKIA1234567890ABCDEF");
|
||||||
|
// -> [Finding { pattern_id: .., description: .., confidence: High, .. }]
|
||||||
|
```
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## Real-world scenario
|
## Configuration
|
||||||
|
|
||||||
A project has `.env`, `conf/settings.local.yaml`, and a `vendor/` tree it doesn't want patched. With `.botignore`:
|
### `.botignore` -- access control
|
||||||
|
|
||||||
|
Gitignore syntax. Blocks both reads and writes.
|
||||||
|
|
||||||
```gitignore
|
```gitignore
|
||||||
.env
|
.env
|
||||||
.env.*
|
.env.*
|
||||||
conf/settings.local.yaml
|
secrets/**
|
||||||
vendor/**
|
|
||||||
```
|
```
|
||||||
|
|
||||||
Claude attempts to read credentials:
|
### `botignore.toml` -- per-operation rules
|
||||||
|
|
||||||
```
|
|
||||||
Tool: Read
|
|
||||||
Path: ./conf/settings.local.yaml
|
|
||||||
Decision: BLOCK — matched rule "conf/settings.local.yaml" (.botignore)
|
|
||||||
```
|
|
||||||
|
|
||||||
Claude attempts to read application code:
|
|
||||||
|
|
||||||
```
|
|
||||||
Tool: Read
|
|
||||||
Path: ./src/app/main.rs
|
|
||||||
Decision: ALLOW
|
|
||||||
```
|
|
||||||
|
|
||||||
Claude attempts to run `cat .env` via bash — which would bypass a path-only check:
|
|
||||||
|
|
||||||
```toml
|
```toml
|
||||||
# botignore.toml
|
[read]
|
||||||
|
patterns = [".env*", "secrets/**"]
|
||||||
|
|
||||||
|
[write]
|
||||||
|
patterns = ["vendor/**", "*.lock"]
|
||||||
|
|
||||||
[bash]
|
[bash]
|
||||||
deny = ["cat .env*", "cat conf/settings.local*"]
|
deny = ["rm -rf /", "curl * | sh"]
|
||||||
```
|
```
|
||||||
|
|
||||||
```
|
### `.botsecrets` -- secret value redaction
|
||||||
Tool: Bash
|
|
||||||
Command: cat .env
|
See the Secret Filtering section above.
|
||||||
Decision: BLOCK — matched bash deny rule "cat .env*"
|
|
||||||
```
|
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
@@ -201,14 +176,16 @@ Decision: BLOCK — matched bash deny rule "cat .env*"
|
|||||||
|
|
||||||
Three concentric layers; nothing inner imports from anything outer:
|
Three concentric layers; nothing inner imports from anything outer:
|
||||||
|
|
||||||
- **`core/`** — harness-unaware, sync. Types, `.botignore` walker, `botignore.toml` parser, `Policy::check` / `check_command`, path extraction.
|
- **`core/`** -- harness-unaware, sync. Policy types, `.botignore` walker, `botignore.toml` parser, `Policy::check`.
|
||||||
- **`harness/`** — `HarnessAdapter` trait over a normalized `ToolCall`. Each adapter lives in its own submodule, feature-gated.
|
- **`core/secrets/`** -- `.botsecrets` config, manifest discovery, multi-format parser, Aho-Corasick redactor, heuristic scanner.
|
||||||
- **`bin/fermata.rs`** — the only place `clap`, stdio, and exit codes appear.
|
- **`harness/`** -- `HarnessAdapter` trait for PreToolUse (policy gate) and PostToolUse (output redaction). Each adapter is feature-gated.
|
||||||
|
- **`bin/fermata.rs`** -- `clap`, stdio, and exit codes.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## See also
|
## See also
|
||||||
|
|
||||||
- `docs/tools/fermata.md` — Dirigent integration plan
|
- `docs/tools/fermata.md` -- Dirigent integration plan
|
||||||
- `docs/workpad/brainstorm/fermata.md` — full product spec and field notes
|
- `docs/architecture/fermata-security-philosophy.md` -- security philosophy and the reveal triangle
|
||||||
- `docs/architecture/crates.md` — crate dependency map
|
- `docs/workpad/brainstorm/fermata.md` -- full product spec and field notes
|
||||||
|
- `docs/architecture/crates.md` -- crate dependency map
|
||||||
|
|||||||
+180
-18
@@ -1,5 +1,6 @@
|
|||||||
use clap::{Parser, Subcommand, ValueEnum};
|
use clap::{Parser, Subcommand, ValueEnum};
|
||||||
use dirigent_fermata::core::{project::find_project_root, Decision, Op, Policy};
|
use dirigent_fermata::core::{project::find_project_root, Decision, Op, Policy};
|
||||||
|
use dirigent_fermata::harness::HookEvent;
|
||||||
use std::io::{Read, Write};
|
use std::io::{Read, Write};
|
||||||
use std::path::PathBuf;
|
use std::path::PathBuf;
|
||||||
use std::process::ExitCode;
|
use std::process::ExitCode;
|
||||||
@@ -23,7 +24,11 @@ enum Cmd {
|
|||||||
},
|
},
|
||||||
/// Read a harness hook payload from stdin and render the decision.
|
/// Read a harness hook payload from stdin and render the decision.
|
||||||
Hook {
|
Hook {
|
||||||
#[arg(long)]
|
/// Hook event type: pre-tool-use or post-tool-use.
|
||||||
|
#[arg(long, default_value = "pre-tool-use")]
|
||||||
|
event: String,
|
||||||
|
/// Harness adapter name.
|
||||||
|
#[arg(long, default_value = "claude")]
|
||||||
harness: String,
|
harness: String,
|
||||||
},
|
},
|
||||||
}
|
}
|
||||||
@@ -49,7 +54,7 @@ fn main() -> ExitCode {
|
|||||||
let cli = Cli::parse();
|
let cli = Cli::parse();
|
||||||
match cli.cmd {
|
match cli.cmd {
|
||||||
Cmd::Check { op, json, paths } => run_check(op.into(), json, &paths),
|
Cmd::Check { op, json, paths } => run_check(op.into(), json, &paths),
|
||||||
Cmd::Hook { harness } => run_hook(&harness),
|
Cmd::Hook { event, harness } => run_hook(&event, &harness),
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -92,7 +97,7 @@ fn run_check(op: Op, json: bool, paths: &[PathBuf]) -> ExitCode {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
fn run_hook(harness: &str) -> ExitCode {
|
fn run_hook(event_str: &str, harness: &str) -> ExitCode {
|
||||||
let adapter = match dirigent_fermata::harness::lookup(harness) {
|
let adapter = match dirigent_fermata::harness::lookup(harness) {
|
||||||
Some(a) => a,
|
Some(a) => a,
|
||||||
None => {
|
None => {
|
||||||
@@ -100,28 +105,51 @@ fn run_hook(harness: &str) -> ExitCode {
|
|||||||
return ExitCode::from(2);
|
return ExitCode::from(2);
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
let event = match HookEvent::parse(event_str) {
|
||||||
|
Some(e) => e,
|
||||||
|
None => {
|
||||||
|
eprintln!("fermata: unknown event '{event_str}'");
|
||||||
|
return ExitCode::from(2);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
let mut buf = Vec::new();
|
let mut buf = Vec::new();
|
||||||
if let Err(e) = std::io::stdin().lock().read_to_end(&mut buf) {
|
if let Err(e) = std::io::stdin().lock().read_to_end(&mut buf) {
|
||||||
eprintln!("fermata: stdin: {e}");
|
eprintln!("fermata: stdin: {e}");
|
||||||
return ExitCode::from(2);
|
return ExitCode::from(2);
|
||||||
}
|
}
|
||||||
let call = match adapter.parse_request(&buf) {
|
|
||||||
|
match event {
|
||||||
|
HookEvent::PreToolUse => run_pre_tool_use(&*adapter, &buf),
|
||||||
|
HookEvent::PostToolUse => run_post_tool_use(&*adapter, &buf),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Handle a PreToolUse hook event (policy gate).
|
||||||
|
fn run_pre_tool_use(
|
||||||
|
adapter: &dyn dirigent_fermata::harness::HarnessAdapter,
|
||||||
|
buf: &[u8],
|
||||||
|
) -> ExitCode {
|
||||||
|
use dirigent_fermata::harness::{PathKind, ToolOp};
|
||||||
|
|
||||||
|
let call = match adapter.parse_request(buf) {
|
||||||
Ok(c) => c,
|
Ok(c) => c,
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("fermata: parse: {e}");
|
eprintln!("fermata: parse: {e}");
|
||||||
return ExitCode::from(2);
|
// Fail-open: output empty JSON and exit 0.
|
||||||
|
let _ = std::io::stdout().lock().write_all(b"{}");
|
||||||
|
return ExitCode::from(0);
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
|
|
||||||
use dirigent_fermata::harness::{PathKind, ToolOp};
|
|
||||||
let decision = match &call.op {
|
let decision = match &call.op {
|
||||||
ToolOp::Path { path, kind } => {
|
ToolOp::Path { path, kind } => {
|
||||||
let root = match find_project_root(path) {
|
let root = match find_project_root(path) {
|
||||||
// No project root → fail-open allow (hook must always exit 0 with a verdict).
|
|
||||||
// run_check silently skips these paths; here we must still emit JSON.
|
|
||||||
Some(r) => r,
|
Some(r) => r,
|
||||||
None => {
|
None => {
|
||||||
let out = adapter.render_decision(&call, &Decision::Allow).unwrap_or_default();
|
let out = adapter
|
||||||
|
.render_decision(&call, &Decision::Allow)
|
||||||
|
.unwrap_or_default();
|
||||||
let _ = std::io::stdout().lock().write_all(&out);
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
return ExitCode::from(0);
|
return ExitCode::from(0);
|
||||||
}
|
}
|
||||||
@@ -130,7 +158,9 @@ fn run_hook(harness: &str) -> ExitCode {
|
|||||||
Ok(p) => p,
|
Ok(p) => p,
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("fermata: load error: {e}");
|
eprintln!("fermata: load error: {e}");
|
||||||
let out = adapter.render_decision(&call, &Decision::Allow).unwrap_or_default();
|
let out = adapter
|
||||||
|
.render_decision(&call, &Decision::Allow)
|
||||||
|
.unwrap_or_default();
|
||||||
let _ = std::io::stdout().lock().write_all(&out);
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
return ExitCode::from(0);
|
return ExitCode::from(0);
|
||||||
}
|
}
|
||||||
@@ -143,32 +173,36 @@ fn run_hook(harness: &str) -> ExitCode {
|
|||||||
Ok(d) => d,
|
Ok(d) => d,
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("fermata: check error: {e}");
|
eprintln!("fermata: check error: {e}");
|
||||||
let out = adapter.render_decision(&call, &Decision::Allow).unwrap_or_default();
|
let out = adapter
|
||||||
|
.render_decision(&call, &Decision::Allow)
|
||||||
|
.unwrap_or_default();
|
||||||
let _ = std::io::stdout().lock().write_all(&out);
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
return ExitCode::from(0);
|
return ExitCode::from(0);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
ToolOp::Command { text } => {
|
ToolOp::Command { text } => {
|
||||||
// For commands, we look up the project from cwd (no path argument).
|
|
||||||
let cwd = match std::env::current_dir() {
|
let cwd = match std::env::current_dir() {
|
||||||
Ok(d) => d,
|
Ok(d) => d,
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("fermata: cwd error: {e}");
|
eprintln!("fermata: cwd error: {e}");
|
||||||
let out = adapter.render_decision(&call, &Decision::Allow).unwrap_or_default();
|
let out = adapter
|
||||||
|
.render_decision(&call, &Decision::Allow)
|
||||||
|
.unwrap_or_default();
|
||||||
let _ = std::io::stdout().lock().write_all(&out);
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
return ExitCode::from(0);
|
return ExitCode::from(0);
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
match find_project_root(&cwd) {
|
match find_project_root(&cwd) {
|
||||||
// No project root → fail-open allow (see Path branch note above).
|
|
||||||
None => Decision::Allow,
|
None => Decision::Allow,
|
||||||
Some(root) => {
|
Some(root) => {
|
||||||
let policy = match Policy::load(&root) {
|
let policy = match Policy::load(&root) {
|
||||||
Ok(p) => p,
|
Ok(p) => p,
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("fermata: load error: {e}");
|
eprintln!("fermata: load error: {e}");
|
||||||
let out = adapter.render_decision(&call, &Decision::Allow).unwrap_or_default();
|
let out = adapter
|
||||||
|
.render_decision(&call, &Decision::Allow)
|
||||||
|
.unwrap_or_default();
|
||||||
let _ = std::io::stdout().lock().write_all(&out);
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
return ExitCode::from(0);
|
return ExitCode::from(0);
|
||||||
}
|
}
|
||||||
@@ -177,7 +211,9 @@ fn run_hook(harness: &str) -> ExitCode {
|
|||||||
Ok(d) => d,
|
Ok(d) => d,
|
||||||
Err(e) => {
|
Err(e) => {
|
||||||
eprintln!("fermata: check error: {e}");
|
eprintln!("fermata: check error: {e}");
|
||||||
let out = adapter.render_decision(&call, &Decision::Allow).unwrap_or_default();
|
let out = adapter
|
||||||
|
.render_decision(&call, &Decision::Allow)
|
||||||
|
.unwrap_or_default();
|
||||||
let _ = std::io::stdout().lock().write_all(&out);
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
return ExitCode::from(0);
|
return ExitCode::from(0);
|
||||||
}
|
}
|
||||||
@@ -186,9 +222,135 @@ fn run_hook(harness: &str) -> ExitCode {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
};
|
};
|
||||||
let out = adapter.render_decision(&call, &decision).unwrap_or_default();
|
let out = adapter
|
||||||
|
.render_decision(&call, &decision)
|
||||||
|
.unwrap_or_default();
|
||||||
let _ = std::io::stdout().lock().write_all(&out);
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
ExitCode::from(0) // hook bins always exit 0; the JSON carries the verdict
|
ExitCode::from(0)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Handle a PostToolUse hook event (output redaction).
|
||||||
|
///
|
||||||
|
/// Fail-open: any error results in `{}` on stdout and exit 0, so the
|
||||||
|
/// harness continues with the original output.
|
||||||
|
fn run_post_tool_use(
|
||||||
|
adapter: &dyn dirigent_fermata::harness::HarnessAdapter,
|
||||||
|
buf: &[u8],
|
||||||
|
) -> ExitCode {
|
||||||
|
use dirigent_fermata::core::secrets::{
|
||||||
|
config::HeuristicMode, Manifest, Redactor, Scanner, SecretsConfig,
|
||||||
|
};
|
||||||
|
|
||||||
|
// Parse payload; fail-open on error.
|
||||||
|
let payload = match adapter.parse_post_tool_use(buf) {
|
||||||
|
Ok(p) => p,
|
||||||
|
Err(e) => {
|
||||||
|
eprintln!("fermata: post-tool-use parse: {e}");
|
||||||
|
let _ = std::io::stdout().lock().write_all(b"{}");
|
||||||
|
return ExitCode::from(0);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Empty tool response — nothing to redact.
|
||||||
|
if payload.tool_response.is_empty() {
|
||||||
|
let _ = std::io::stdout().lock().write_all(b"{}");
|
||||||
|
return ExitCode::from(0);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Find project root from cwd (PostToolUse has no reliable path).
|
||||||
|
let root = match std::env::current_dir().ok().and_then(|d| find_project_root(&d)) {
|
||||||
|
Some(r) => r,
|
||||||
|
None => {
|
||||||
|
// No project root → nothing to redact, pass through.
|
||||||
|
let _ = std::io::stdout().lock().write_all(b"{}");
|
||||||
|
return ExitCode::from(0);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Load secrets config; fail-open if missing or broken.
|
||||||
|
let config = match SecretsConfig::load(&root) {
|
||||||
|
Ok(c) => c,
|
||||||
|
Err(e) => {
|
||||||
|
eprintln!("fermata: secrets config: {e}");
|
||||||
|
let _ = std::io::stdout().lock().write_all(b"{}");
|
||||||
|
return ExitCode::from(0);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Build manifest from config (discovers .env files etc.).
|
||||||
|
let manifest = match Manifest::build(&config, &root) {
|
||||||
|
Ok(m) => m,
|
||||||
|
Err(e) => {
|
||||||
|
eprintln!("fermata: manifest: {e}");
|
||||||
|
let _ = std::io::stdout().lock().write_all(b"{}");
|
||||||
|
return ExitCode::from(0);
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
// Run redactor over tool_response.
|
||||||
|
let redactor = Redactor::new(&manifest, config.redaction.style);
|
||||||
|
let redacted = redactor.redact(&payload.tool_response);
|
||||||
|
|
||||||
|
// Run heuristic scanner if enabled.
|
||||||
|
let mut scanner_warning: Option<String> = None;
|
||||||
|
if config.heuristic.enabled {
|
||||||
|
if let Ok(scanner) = Scanner::new(&config.heuristic) {
|
||||||
|
// Scan the (already redacted) text so we don't re-flag known secrets.
|
||||||
|
let findings = scanner.scan(&redacted.text);
|
||||||
|
if !findings.is_empty() {
|
||||||
|
match config.heuristic.mode {
|
||||||
|
HeuristicMode::Report => {
|
||||||
|
// Log to stderr only; do not modify output.
|
||||||
|
for f in &findings {
|
||||||
|
eprintln!(
|
||||||
|
"fermata: heuristic finding [{:?}] {}: {}",
|
||||||
|
f.confidence, f.pattern_id, f.description
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
HeuristicMode::Enforce => {
|
||||||
|
let descriptions: Vec<String> = findings
|
||||||
|
.iter()
|
||||||
|
.map(|f| format!("{} ({})", f.description, f.pattern_id))
|
||||||
|
.collect();
|
||||||
|
scanner_warning = Some(format!(
|
||||||
|
"\n[fermata] WARNING: heuristic scan found {} potential secret(s): {}",
|
||||||
|
findings.len(),
|
||||||
|
descriptions.join(", ")
|
||||||
|
));
|
||||||
|
}
|
||||||
|
HeuristicMode::Disabled => {}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Determine whether we need to send back modified output.
|
||||||
|
let redaction_count = redacted.redactions.len();
|
||||||
|
let was_redacted = redaction_count > 0;
|
||||||
|
let needs_update = was_redacted || scanner_warning.is_some();
|
||||||
|
|
||||||
|
let output = if needs_update {
|
||||||
|
let mut text = redacted.text;
|
||||||
|
if let Some(warning) = scanner_warning {
|
||||||
|
text.push_str(&warning);
|
||||||
|
}
|
||||||
|
if was_redacted {
|
||||||
|
eprintln!(
|
||||||
|
"fermata: redacted {} secret(s) from {} output",
|
||||||
|
redaction_count, payload.tool_name
|
||||||
|
);
|
||||||
|
}
|
||||||
|
Some(text)
|
||||||
|
} else {
|
||||||
|
None
|
||||||
|
};
|
||||||
|
|
||||||
|
let out = adapter
|
||||||
|
.render_post_tool_use(&payload, output.as_deref())
|
||||||
|
.unwrap_or_else(|_| b"{}".to_vec());
|
||||||
|
let _ = std::io::stdout().lock().write_all(&out);
|
||||||
|
ExitCode::from(0)
|
||||||
}
|
}
|
||||||
|
|
||||||
fn merge_worst(a: Option<Decision>, b: Decision) -> Decision {
|
fn merge_worst(a: Option<Decision>, b: Decision) -> Decision {
|
||||||
|
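Review note: the tail of `run_post_tool_use` boils down to two pure decisions: whether the heuristic mode produces a warning, and whether any modified output is sent back at all. The sketch below isolates that logic using only the standard library. `Mode`, `scanner_warning` and `updated_output` are hypothetical names introduced for illustration (they are not part of the crate), and findings are simplified to `(description, pattern_id)` pairs.

```rust
/// Hypothetical stand-in for `HeuristicMode` from the diff.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum Mode {
    Enforce,
    Report,
    Disabled,
}

/// Build the warning that Enforce mode appends to the tool output.
/// Report and Disabled never modify the output (the real hook logs
/// Report findings to stderr, which this sketch omits).
pub fn scanner_warning(mode: Mode, findings: &[(&str, &str)]) -> Option<String> {
    if findings.is_empty() || mode != Mode::Enforce {
        return None;
    }
    let descriptions: Vec<String> = findings
        .iter()
        .map(|(desc, id)| format!("{desc} ({id})"))
        .collect();
    Some(format!(
        "\n[fermata] WARNING: heuristic scan found {} potential secret(s): {}",
        findings.len(),
        descriptions.join(", ")
    ))
}

/// `None` means "reply with `{}` and let the harness keep the original
/// output"; `Some(text)` is the replacement output.
pub fn updated_output(
    redacted_text: &str,
    redaction_count: usize,
    warning: Option<String>,
) -> Option<String> {
    if redaction_count == 0 && warning.is_none() {
        return None;
    }
    let mut text = redacted_text.to_owned();
    if let Some(w) = warning {
        text.push_str(&w);
    }
    Some(text)
}
```

Keeping these two decisions free of I/O would make the hook's behaviour testable without a harness payload; whether the crate wants that split is a design choice for the author.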
|||||||
@@ -6,6 +6,7 @@ pub mod extract;
 pub mod op;
 pub mod policy;
 pub mod project;
+pub mod secrets;
 pub mod toml_config;
 
 pub use decision::{Decision, Reason, Rule};
|
|||||||
@@ -1,7 +1,7 @@
 use std::path::{Path, PathBuf};
 
 /// Strong markers that definitively identify a project root.
-const STRONG_MARKERS: &[&str] = &["botignore.toml", ".botignore.toml", ".git"];
+const STRONG_MARKERS: &[&str] = &["botignore.toml", ".botignore.toml", ".botsecrets", ".git"];
 
 /// Walk upward from `target` (or its parent if `target` is a file) looking
 /// for the nearest project root. Strong markers (`botignore.toml`,
|
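Review note: with `.botsecrets` added to `STRONG_MARKERS`, a directory that has only a `.botsecrets` file now counts as a project root. A minimal sketch of the upward walk, assuming only the strong-marker check; the real `find_project_root` is only partly visible in this hunk and presumably also handles file targets and weaker markers. `nearest_marked_root` is a hypothetical name.

```rust
use std::path::{Path, PathBuf};

/// Marker list as it reads after this commit.
const STRONG_MARKERS: &[&str] = &["botignore.toml", ".botignore.toml", ".botsecrets", ".git"];

/// Walk upward from `start` (assumed to be a directory) and return the
/// nearest ancestor, including `start` itself, that contains a marker.
pub fn nearest_marked_root(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| STRONG_MARKERS.iter().any(|marker| dir.join(marker).exists()))
        .map(Path::to_path_buf)
}
```

Because `Path::ancestors` yields the starting path first, the nearest marker wins, which matches the "nearest project root" wording in the doc comment.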
|||||||
@@ -0,0 +1,530 @@
|
|||||||
|
//! Parse and merge `.botsecrets` TOML configuration files.
|
||||||
|
//!
|
||||||
|
//! The configuration is layered (most-specific wins):
|
||||||
|
//!
|
||||||
|
//! 1. Built-in defaults
|
||||||
|
//! 2. `~/.config/fermata/.botsecrets` (user-global)
|
||||||
|
//! 3. `<root>/.botsecrets` (project)
|
||||||
|
//! 4. `<root>/.botsecrets.local` (local overrides, git-ignored)
|
||||||
|
//!
|
||||||
|
//! Vec fields like `files.patterns` are *replaced* by more-specific layers.
|
||||||
|
//! `keys.include` and `keys.exclude` *accumulate* across layers.
|
||||||
|
//! Scalar fields (style, mode, enabled) take the most-specific value.
|
||||||
|
|
||||||
|
use globset::{Glob, GlobMatcher};
|
||||||
|
use serde::{Deserialize, Serialize};
|
||||||
|
use std::path::{Path, PathBuf};
|
||||||
|
use thiserror::Error;
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Errors
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[derive(Debug, Error)]
|
||||||
|
pub enum SecretsConfigError {
|
||||||
|
#[error("io error reading {path}: {source}")]
|
||||||
|
Io {
|
||||||
|
path: PathBuf,
|
||||||
|
source: std::io::Error,
|
||||||
|
},
|
||||||
|
#[error("TOML parse error in {path}: {source}")]
|
||||||
|
Parse {
|
||||||
|
path: PathBuf,
|
||||||
|
source: toml::de::Error,
|
||||||
|
},
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Config types
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Top-level `.botsecrets` configuration.
|
||||||
|
#[derive(Debug, Clone, Deserialize, Serialize)]
|
||||||
|
pub struct SecretsConfig {
|
||||||
|
#[serde(default)]
|
||||||
|
pub files: FilesConfig,
|
||||||
|
#[serde(default)]
|
||||||
|
pub keys: KeysConfig,
|
||||||
|
#[serde(default)]
|
||||||
|
pub redaction: RedactionConfig,
|
||||||
|
#[serde(default)]
|
||||||
|
pub heuristic: HeuristicConfig,
|
||||||
|
#[serde(default)]
|
||||||
|
pub enforcement: EnforcementConfig,
|
||||||
|
#[serde(default, rename = "file")]
|
||||||
|
pub file_overrides: Vec<FileOverride>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Deserialize, Serialize)]
|
||||||
|
pub struct FilesConfig {
|
||||||
|
#[serde(default = "default_file_patterns")]
|
||||||
|
pub patterns: Vec<String>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Deserialize, Serialize)]
|
||||||
|
pub struct KeysConfig {
|
||||||
|
#[serde(default)]
|
||||||
|
pub include: Vec<String>,
|
||||||
|
#[serde(default)]
|
||||||
|
pub exclude: Vec<String>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Copy, Deserialize, Serialize, PartialEq, Eq)]
|
||||||
|
#[serde(rename_all = "kebab-case")]
|
||||||
|
pub enum RedactionStyle {
|
||||||
|
Masked,
|
||||||
|
Typed,
|
||||||
|
Named,
|
||||||
|
Absent,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Deserialize, Serialize)]
|
||||||
|
pub struct RedactionConfig {
|
||||||
|
#[serde(default = "default_redaction_style")]
|
||||||
|
pub style: RedactionStyle,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Copy, Deserialize, Serialize, PartialEq, Eq)]
|
||||||
|
#[serde(rename_all = "kebab-case")]
|
||||||
|
pub enum HeuristicMode {
|
||||||
|
Enforce,
|
||||||
|
Report,
|
||||||
|
Disabled,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Deserialize, Serialize)]
|
||||||
|
pub struct HeuristicConfig {
|
||||||
|
#[serde(default = "default_true")]
|
||||||
|
pub enabled: bool,
|
||||||
|
#[serde(default = "default_heuristic_mode")]
|
||||||
|
pub mode: HeuristicMode,
|
||||||
|
#[serde(default)]
|
||||||
|
pub patterns: Vec<String>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Copy, Deserialize, Serialize, PartialEq, Eq)]
|
||||||
|
#[serde(rename_all = "kebab-case")]
|
||||||
|
pub enum EnforcementMode {
|
||||||
|
Strict,
|
||||||
|
Permissive,
|
||||||
|
Audit,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Copy, Deserialize, Serialize, PartialEq, Eq)]
|
||||||
|
#[serde(rename_all = "kebab-case")]
|
||||||
|
pub enum ParseErrorAction {
|
||||||
|
MaskEntireFile,
|
||||||
|
Allow,
|
||||||
|
Deny,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Deserialize, Serialize)]
|
||||||
|
pub struct EnforcementConfig {
|
||||||
|
#[serde(default = "default_enforcement_mode")]
|
||||||
|
pub mode: EnforcementMode,
|
||||||
|
#[serde(default = "default_parse_error_action")]
|
||||||
|
pub on_parse_error: ParseErrorAction,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Deserialize, Serialize)]
|
||||||
|
pub struct FileOverride {
|
||||||
|
pub path: String,
|
||||||
|
#[serde(default)]
|
||||||
|
pub format: Option<String>,
|
||||||
|
#[serde(default)]
|
||||||
|
pub keys: Vec<String>,
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Built-in defaults
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
pub(crate) fn default_file_patterns() -> Vec<String> {
|
||||||
|
vec![
|
||||||
|
".env",
|
||||||
|
".env.*",
|
||||||
|
"*.env",
|
||||||
|
"secrets.*",
|
||||||
|
"credentials.*",
|
||||||
|
"*.key",
|
||||||
|
"*.pem",
|
||||||
|
"*.p12",
|
||||||
|
"*.pfx",
|
||||||
|
"id_rsa",
|
||||||
|
"id_ed25519",
|
||||||
|
"id_ecdsa",
|
||||||
|
"Secrets.toml",
|
||||||
|
"Secrets.*.toml",
|
||||||
|
"terraform.tfvars",
|
||||||
|
"*.auto.tfvars",
|
||||||
|
"terraform.tfstate",
|
||||||
|
"*.tfstate",
|
||||||
|
".docker/config.json",
|
||||||
|
"config/master.key",
|
||||||
|
"config/credentials/*.key",
|
||||||
|
".aws/credentials",
|
||||||
|
".netrc",
|
||||||
|
".htpasswd",
|
||||||
|
"service-account.json",
|
||||||
|
"service-account-key.json",
|
||||||
|
]
|
||||||
|
.into_iter()
|
||||||
|
.map(String::from)
|
||||||
|
.collect()
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Built-in key name patterns that are always treated as sensitive.
|
||||||
|
pub const BUILTIN_KEY_PATTERNS: &[&str] = &[
|
||||||
|
"*PASSWORD*",
|
||||||
|
"*PASSWD*",
|
||||||
|
"*SECRET*",
|
||||||
|
"*API_KEY*",
|
||||||
|
"*APIKEY*",
|
||||||
|
"*TOKEN*",
|
||||||
|
"*ACCESS_KEY*",
|
||||||
|
"*PRIVATE_KEY*",
|
||||||
|
"*AUTH*",
|
||||||
|
"*CREDENTIAL*",
|
||||||
|
"*CONNECTION_STRING*",
|
||||||
|
"*CONN_STR*",
|
||||||
|
"DATABASE_URL",
|
||||||
|
"REDIS_URL",
|
||||||
|
"MONGODB_URI",
|
||||||
|
"AMQP_URL",
|
||||||
|
"AWS_SECRET_ACCESS_KEY",
|
||||||
|
"AWS_ACCESS_KEY_ID",
|
||||||
|
"AWS_SESSION_TOKEN",
|
||||||
|
"GITHUB_TOKEN",
|
||||||
|
"GH_TOKEN",
|
||||||
|
"GITLAB_TOKEN",
|
||||||
|
"NPM_TOKEN",
|
||||||
|
"NODE_AUTH_TOKEN",
|
||||||
|
"STRIPE_SECRET_KEY",
|
||||||
|
"STRIPE_WEBHOOK_SECRET",
|
||||||
|
"OPENAI_API_KEY",
|
||||||
|
"ANTHROPIC_API_KEY",
|
||||||
|
"SENTRY_DSN",
|
||||||
|
"HEROKU_API_KEY",
|
||||||
|
"SENDGRID_API_KEY",
|
||||||
|
"JWT_SECRET",
|
||||||
|
"JWT_SIGNING_KEY",
|
||||||
|
"SESSION_SECRET",
|
||||||
|
"ENCRYPTION_KEY",
|
||||||
|
"ENCRYPT_KEY",
|
||||||
|
"MASTER_KEY",
|
||||||
|
"SIGNING_KEY",
|
||||||
|
"SECRET_KEY",
|
||||||
|
"SECRET_KEY_BASE",
|
||||||
|
"APP_KEY",
|
||||||
|
"NEXTAUTH_SECRET",
|
||||||
|
];
|
||||||
|
|
||||||
|
fn default_redaction_style() -> RedactionStyle {
|
||||||
|
RedactionStyle::Masked
|
||||||
|
}
|
||||||
|
|
||||||
|
fn default_heuristic_mode() -> HeuristicMode {
|
||||||
|
HeuristicMode::Enforce
|
||||||
|
}
|
||||||
|
|
||||||
|
fn default_true() -> bool {
|
||||||
|
true
|
||||||
|
}
|
||||||
|
|
||||||
|
fn default_enforcement_mode() -> EnforcementMode {
|
||||||
|
EnforcementMode::Permissive
|
||||||
|
}
|
||||||
|
|
||||||
|
fn default_parse_error_action() -> ParseErrorAction {
|
||||||
|
ParseErrorAction::MaskEntireFile
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Default impls
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
impl Default for SecretsConfig {
|
||||||
|
fn default() -> Self {
|
||||||
|
Self {
|
||||||
|
files: FilesConfig::default(),
|
||||||
|
keys: KeysConfig::default(),
|
||||||
|
redaction: RedactionConfig::default(),
|
||||||
|
heuristic: HeuristicConfig::default(),
|
||||||
|
enforcement: EnforcementConfig::default(),
|
||||||
|
file_overrides: Vec::new(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Default for FilesConfig {
|
||||||
|
fn default() -> Self {
|
||||||
|
Self {
|
||||||
|
patterns: default_file_patterns(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Default for KeysConfig {
|
||||||
|
fn default() -> Self {
|
||||||
|
Self {
|
||||||
|
include: Vec::new(),
|
||||||
|
exclude: Vec::new(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Default for RedactionConfig {
|
||||||
|
fn default() -> Self {
|
||||||
|
Self {
|
||||||
|
style: default_redaction_style(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Default for HeuristicConfig {
|
||||||
|
fn default() -> Self {
|
||||||
|
Self {
|
||||||
|
enabled: default_true(),
|
||||||
|
mode: default_heuristic_mode(),
|
||||||
|
patterns: Vec::new(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl Default for EnforcementConfig {
|
||||||
|
fn default() -> Self {
|
||||||
|
Self {
|
||||||
|
mode: default_enforcement_mode(),
|
||||||
|
on_parse_error: default_parse_error_action(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Partial layer (for merge)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// A partially-specified config layer parsed from a single `.botsecrets` file.
|
||||||
|
/// `Option`-wrapped fields distinguish "absent" from "explicitly set".
|
||||||
|
#[derive(Debug, Clone, Default, Deserialize)]
|
||||||
|
struct PartialSecretsConfig {
|
||||||
|
#[serde(default)]
|
||||||
|
files: Option<PartialFilesConfig>,
|
||||||
|
#[serde(default)]
|
||||||
|
keys: Option<PartialKeysConfig>,
|
||||||
|
#[serde(default)]
|
||||||
|
redaction: Option<PartialRedactionConfig>,
|
||||||
|
#[serde(default)]
|
||||||
|
heuristic: Option<PartialHeuristicConfig>,
|
||||||
|
#[serde(default)]
|
||||||
|
enforcement: Option<PartialEnforcementConfig>,
|
||||||
|
#[serde(default, rename = "file")]
|
||||||
|
file: Option<Vec<FileOverride>>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Default, Deserialize)]
|
||||||
|
struct PartialFilesConfig {
|
||||||
|
patterns: Option<Vec<String>>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Default, Deserialize)]
|
||||||
|
struct PartialKeysConfig {
|
||||||
|
include: Option<Vec<String>>,
|
||||||
|
exclude: Option<Vec<String>>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Default, Deserialize)]
|
||||||
|
struct PartialRedactionConfig {
|
||||||
|
style: Option<RedactionStyle>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Default, Deserialize)]
|
||||||
|
struct PartialHeuristicConfig {
|
||||||
|
enabled: Option<bool>,
|
||||||
|
mode: Option<HeuristicMode>,
|
||||||
|
patterns: Option<Vec<String>>,
|
||||||
|
}
|
||||||
|
|
||||||
|
#[derive(Debug, Clone, Default, Deserialize)]
|
||||||
|
struct PartialEnforcementConfig {
|
||||||
|
mode: Option<EnforcementMode>,
|
||||||
|
on_parse_error: Option<ParseErrorAction>,
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Merge logic
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
impl SecretsConfig {
|
||||||
|
/// Apply a partial layer on top of `self`.
|
||||||
|
///
|
||||||
|
/// - Vec fields (`files.patterns`, `heuristic.patterns`, `file_overrides`):
|
||||||
|
/// **replaced** by the layer's value when present.
|
||||||
|
/// - `keys.include` / `keys.exclude`: **accumulated** (appended).
|
||||||
|
/// - Scalar fields: overwritten when present in the layer.
|
||||||
|
fn merge_layer(&mut self, layer: PartialSecretsConfig) {
|
||||||
|
// files
|
||||||
|
if let Some(f) = layer.files {
|
||||||
|
if let Some(patterns) = f.patterns {
|
||||||
|
self.files.patterns = patterns;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// keys (accumulate)
|
||||||
|
if let Some(k) = layer.keys {
|
||||||
|
if let Some(inc) = k.include {
|
||||||
|
self.keys.include.extend(inc);
|
||||||
|
}
|
||||||
|
if let Some(exc) = k.exclude {
|
||||||
|
self.keys.exclude.extend(exc);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// redaction
|
||||||
|
if let Some(r) = layer.redaction {
|
||||||
|
if let Some(style) = r.style {
|
||||||
|
self.redaction.style = style;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// heuristic
|
||||||
|
if let Some(h) = layer.heuristic {
|
||||||
|
if let Some(enabled) = h.enabled {
|
||||||
|
self.heuristic.enabled = enabled;
|
||||||
|
}
|
||||||
|
if let Some(mode) = h.mode {
|
||||||
|
self.heuristic.mode = mode;
|
||||||
|
}
|
||||||
|
if let Some(patterns) = h.patterns {
|
||||||
|
self.heuristic.patterns = patterns;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// enforcement
|
||||||
|
if let Some(e) = layer.enforcement {
|
||||||
|
if let Some(mode) = e.mode {
|
||||||
|
self.enforcement.mode = mode;
|
||||||
|
}
|
||||||
|
if let Some(action) = e.on_parse_error {
|
||||||
|
self.enforcement.on_parse_error = action;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// file overrides (replace)
|
||||||
|
if let Some(overrides) = layer.file {
|
||||||
|
self.file_overrides = overrides;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
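Review note: the three merge rules in `merge_layer` (replace, accumulate, overwrite) are easy to confuse, so here is a reduced sketch with one field per rule. `Config`, `Layer` and `strings` are hypothetical simplifications of `SecretsConfig` and `PartialSecretsConfig`, written against the standard library only.

```rust
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Config {
    pub file_patterns: Vec<String>, // replaced by any layer that sets it
    pub key_includes: Vec<String>,  // accumulated across layers
    pub heuristic_enabled: bool,    // scalar: most-specific layer wins
}

/// `None` means "this layer says nothing about the field".
#[derive(Debug, Default, Clone)]
pub struct Layer {
    pub file_patterns: Option<Vec<String>>,
    pub key_includes: Option<Vec<String>>,
    pub heuristic_enabled: Option<bool>,
}

pub fn merge_layer(config: &mut Config, layer: Layer) {
    if let Some(patterns) = layer.file_patterns {
        config.file_patterns = patterns; // replace
    }
    if let Some(includes) = layer.key_includes {
        config.key_includes.extend(includes); // accumulate
    }
    if let Some(enabled) = layer.heuristic_enabled {
        config.heuristic_enabled = enabled; // overwrite
    }
}

/// Small helper to build owned string lists in examples.
pub fn strings(items: &[&str]) -> Vec<String> {
    items.iter().map(|s| (*s).to_owned()).collect()
}
```

One consequence worth knowing: because `files.patterns` is replaced rather than extended, a project layer that sets it drops the built-in file patterns unless it repeats them.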
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Loading & discovery
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Return the user-global fermata config directory.
|
||||||
|
/// `~/.config/fermata` on Unix, `%APPDATA%/fermata` on Windows.
|
||||||
|
fn user_config_dir() -> Option<PathBuf> {
|
||||||
|
#[cfg(unix)]
|
||||||
|
{
|
||||||
|
std::env::var_os("HOME").map(|h| PathBuf::from(h).join(".config").join("fermata"))
|
||||||
|
}
|
||||||
|
#[cfg(windows)]
|
||||||
|
{
|
||||||
|
std::env::var_os("APPDATA").map(|a| PathBuf::from(a).join("fermata"))
|
||||||
|
}
|
||||||
|
// Other targets (e.g. wasm) have no known config directory; without this
|
||||||
|
// branch the function body is empty there and fails to type-check.
|
||||||
|
#[cfg(not(any(unix, windows)))]
|
||||||
|
{
|
||||||
|
None
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
impl SecretsConfig {
|
||||||
|
/// Load `.botsecrets` configuration for a project.
|
||||||
|
///
|
||||||
|
/// Merges layers in order (most-specific wins):
|
||||||
|
/// 1. Built-in defaults
|
||||||
|
/// 2. `~/.config/fermata/.botsecrets`
|
||||||
|
/// 3. `<root>/.botsecrets`
|
||||||
|
/// 4. `<root>/.botsecrets.local`
|
||||||
|
pub fn load(root: &Path) -> Result<Self, SecretsConfigError> {
|
||||||
|
let mut config = Self::default();
|
||||||
|
|
||||||
|
// Layer 2: user-global
|
||||||
|
if let Some(user_dir) = user_config_dir() {
|
||||||
|
let user_file = user_dir.join(".botsecrets");
|
||||||
|
if user_file.is_file() {
|
||||||
|
let layer = Self::read_partial(&user_file)?;
|
||||||
|
config.merge_layer(layer);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Layer 3: project root
|
||||||
|
let project_file = root.join(".botsecrets");
|
||||||
|
if project_file.is_file() {
|
||||||
|
let layer = Self::read_partial(&project_file)?;
|
||||||
|
config.merge_layer(layer);
|
||||||
|
}
|
||||||
|
|
||||||
|
// Layer 4: local overrides
|
||||||
|
let local_file = root.join(".botsecrets.local");
|
||||||
|
if local_file.is_file() {
|
||||||
|
let layer = Self::read_partial(&local_file)?;
|
||||||
|
config.merge_layer(layer);
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(config)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse a single `.botsecrets` file into a partial layer.
|
||||||
|
fn read_partial(path: &Path) -> Result<PartialSecretsConfig, SecretsConfigError> {
|
||||||
|
let text = std::fs::read_to_string(path).map_err(|e| SecretsConfigError::Io {
|
||||||
|
path: path.to_path_buf(),
|
||||||
|
source: e,
|
||||||
|
})?;
|
||||||
|
toml::from_str(&text).map_err(|e| SecretsConfigError::Parse {
|
||||||
|
path: path.to_path_buf(),
|
||||||
|
source: e,
|
||||||
|
})
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Load from a TOML string (useful for testing and embedding).
|
||||||
|
pub fn from_toml(toml_str: &str) -> Result<Self, toml::de::Error> {
|
||||||
|
toml::from_str(toml_str)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Returns the effective key-include patterns: built-in defaults + user
|
||||||
|
/// `keys.include`, minus any pattern that appears in `keys.exclude`.
|
||||||
|
pub fn effective_key_includes(&self) -> Vec<String> {
|
||||||
|
let mut patterns: Vec<String> = BUILTIN_KEY_PATTERNS
|
||||||
|
.iter()
|
||||||
|
.map(|s| (*s).to_owned())
|
||||||
|
.collect();
|
||||||
|
patterns.extend(self.keys.include.iter().cloned());
|
||||||
|
|
||||||
|
// Remove excluded patterns (exact string match).
|
||||||
|
if !self.keys.exclude.is_empty() {
|
||||||
|
let exclude_set: std::collections::HashSet<&str> =
|
||||||
|
self.keys.exclude.iter().map(|s| s.as_str()).collect();
|
||||||
|
patterns.retain(|p| !exclude_set.contains(p.as_str()));
|
||||||
|
}
|
||||||
|
|
||||||
|
patterns
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Check whether `key` matches any of the effective key-include patterns.
|
||||||
|
///
|
||||||
|
/// Matching is case-insensitive and uses glob semantics (`*` wildcards).
|
||||||
|
pub fn key_matches(&self, key: &str) -> bool {
|
||||||
|
let patterns = self.effective_key_includes();
|
||||||
|
let upper = key.to_ascii_uppercase();
|
||||||
|
|
||||||
|
for pat in &patterns {
|
||||||
|
let pat_upper = pat.to_ascii_uppercase();
|
||||||
|
// Build a glob matcher. Patterns without path separators are
|
||||||
|
// matched as plain globs against the key name.
|
||||||
|
if let Ok(glob) = Glob::new(&pat_upper) {
|
||||||
|
let matcher: GlobMatcher = glob.compile_matcher();
|
||||||
|
if matcher.is_match(&upper) {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
false
|
||||||
|
}
|
||||||
|
}
|
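Review note: `keys.exclude` removes patterns by exact string comparison; it does not exclude keys. The sketch below reproduces `effective_key_includes` and `key_matches` with a tiny hand-rolled `*` matcher standing in for `globset` (an assumption made to keep the example std-only) and a four-entry subset of `BUILTIN_KEY_PATTERNS`.

```rust
/// Subset of the built-in key patterns, for illustration only.
const BUILTINS: &[&str] = &["*TOKEN*", "*SECRET*", "GITHUB_TOKEN", "DATABASE_URL"];

/// Minimal wildcard matcher supporting only `*` (any run of bytes).
/// This is a stand-in for `globset`, not its implementation.
fn wildcard_match(pattern: &str, text: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), text.as_bytes());
    let (mut pi, mut ti) = (0usize, 0usize);
    let mut star: Option<usize> = None; // index of the last `*` seen
    let mut mark = 0usize; // text position that `*` currently covers up to
    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            star = Some(pi);
            mark = ti;
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some(s) = star {
            // Backtrack: let the last `*` absorb one more byte.
            pi = s + 1;
            mark += 1;
            ti = mark;
        } else {
            return false;
        }
    }
    while pi < p.len() && p[pi] == b'*' {
        pi += 1;
    }
    pi == p.len()
}

/// Built-ins plus user includes, minus patterns listed verbatim in `exclude`.
pub fn effective_includes(include: &[&str], exclude: &[&str]) -> Vec<String> {
    let mut patterns: Vec<String> = BUILTINS.iter().map(|s| (*s).to_owned()).collect();
    patterns.extend(include.iter().map(|s| (*s).to_owned()));
    patterns.retain(|p| !exclude.contains(&p.as_str()));
    patterns
}

/// ASCII case-insensitive match of `key` against the effective patterns.
pub fn key_matches(key: &str, include: &[&str], exclude: &[&str]) -> bool {
    let upper = key.to_ascii_uppercase();
    effective_includes(include, exclude)
        .iter()
        .any(|pat| wildcard_match(&pat.to_ascii_uppercase(), &upper))
}
```

Under these rules, excluding `*TOKEN*` stops `MY_TOKEN` from matching, but `GITHUB_TOKEN` still matches through its own literal built-in entry. If the intent of `keys.exclude` is "never treat these keys as secrets", it would need to be applied to keys rather than to patterns.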
||||||
@@ -0,0 +1,310 @@
|
|||||||
|
//! Secret manifest loader.
|
||||||
|
//!
|
||||||
|
//! Discovers secret files per the `.botsecrets` configuration, parses them,
|
||||||
|
//! filters by key patterns, and produces the known-secrets set that the
|
||||||
|
//! Redactor will consume.
|
||||||
|
|
||||||
|
use std::path::{Path, PathBuf};
|
||||||
|
|
||||||
|
use globset::{Glob, GlobSetBuilder};
|
||||||
|
use thiserror::Error;
|
||||||
|
use walkdir::WalkDir;
|
||||||
|
|
||||||
|
use super::config::{ParseErrorAction, SecretsConfig};
|
||||||
|
use super::parser::{self, FileFormat, ParseError, SecretEntry};
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Errors
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[derive(Debug, Error)]
|
||||||
|
pub enum ManifestError {
|
||||||
|
#[error(transparent)]
|
||||||
|
Parse(#[from] ParseError),
|
||||||
|
#[error("glob pattern error: {0}")]
|
||||||
|
Glob(String),
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Manifest
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// The complete set of known secrets discovered from a project.
|
||||||
|
///
|
||||||
|
/// Entries are sorted by value length descending (longest first) so the
|
||||||
|
/// redactor replaces the most specific match before shorter substrings.
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct Manifest {
|
||||||
|
entries: Vec<SecretEntry>,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Minimum secret value length to keep. Anything shorter risks false-positive
|
||||||
|
/// redaction (e.g. `"yes"`, `"on"`, `"42"`).
|
||||||
|
const MIN_VALUE_LEN: usize = 4;
|
||||||
|
|
||||||
|
/// Directories that are unconditionally skipped during file discovery.
|
||||||
|
const SKIP_DIRS: &[&str] = &[".git", "node_modules", "target", "__pycache__", ".venv"];
|
||||||
|
|
||||||
|
impl Manifest {
|
||||||
|
/// Build a manifest by discovering and parsing secret files relative to
|
||||||
|
/// `root`.
|
||||||
|
pub fn build(config: &SecretsConfig, root: &Path) -> Result<Self, ManifestError> {
|
||||||
|
let mut entries = Vec::new();
|
||||||
|
|
||||||
|
// 1. Discover files matching `config.files.patterns`.
|
||||||
|
let discovered = discover_files(&config.files.patterns, root)?;
|
||||||
|
|
||||||
|
// 2. Parse each discovered file.
|
||||||
|
for path in &discovered {
|
||||||
|
match parse_discovered_file(path) {
|
||||||
|
Ok(file_entries) => entries.extend(file_entries),
|
||||||
|
Err(e) => match config.enforcement.on_parse_error {
|
||||||
|
ParseErrorAction::Allow => {
|
||||||
|
eprintln!(
|
||||||
|
"fermata: warning: skipping unparseable file {}: {}",
|
||||||
|
path.display(),
|
||||||
|
e
|
||||||
|
);
|
||||||
|
}
|
||||||
|
ParseErrorAction::Deny => {
|
||||||
|
return Err(e.into());
|
||||||
|
}
|
||||||
|
ParseErrorAction::MaskEntireFile => {
|
||||||
|
// Whole-file masking is not implemented yet: individual secrets
|
||||||
|
// cannot be extracted from an unparseable file, so for now this
|
||||||
|
// behaves like `Allow` (log a warning and continue).
|
||||||
|
eprintln!(
|
||||||
|
"fermata: warning: cannot parse {}: {}",
|
||||||
|
path.display(),
|
||||||
|
e
|
||||||
|
);
|
||||||
|
}
|
||||||
|
},
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// 3. Filter discovered entries by the effective key patterns.
|
||||||
|
entries = filter_by_key_patterns(entries, config);
|
||||||
|
|
||||||
|
// 4. Process explicit `[[file]]` overrides — these bypass key filtering
|
||||||
|
// because the user declared them intentionally.
|
||||||
|
for override_cfg in &config.file_overrides {
|
||||||
|
let override_path = root.join(&override_cfg.path);
|
||||||
|
if !override_path.is_file() {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
let format = override_cfg
|
||||||
|
.format
|
||||||
|
.as_deref()
|
||||||
|
.and_then(FileFormat::from_hint);
|
||||||
|
|
||||||
|
let key_filter = if override_cfg.keys.is_empty() {
|
||||||
|
None
|
||||||
|
} else {
|
||||||
|
Some(override_cfg.keys.as_slice())
|
||||||
|
};
|
||||||
|
|
||||||
|
match parser::parse_secret_file(&override_path, format, key_filter) {
|
||||||
|
Ok(file_entries) => entries.extend(file_entries),
|
||||||
|
Err(e) => {
|
||||||
|
eprintln!(
|
||||||
|
"fermata: warning: cannot parse override file {}: {}",
|
||||||
|
override_path.display(),
|
||||||
|
e
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// 5. Deduplicate (same key + value from different discovery paths).
|
||||||
|
entries.sort_by(|a, b| a.key.cmp(&b.key).then_with(|| a.value.cmp(&b.value)));
|
||||||
|
entries.dedup_by(|a, b| a.key == b.key && a.value == b.value);
|
||||||
|
|
||||||
|
// 6. Sort by value length descending (longest first for redaction).
|
||||||
|
entries.sort_by(|a, b| b.value.len().cmp(&a.value.len()));
|
||||||
|
|
||||||
|
// 7. Remove entries with very short values to avoid false replacements.
|
||||||
|
entries.retain(|e| e.value.len() >= MIN_VALUE_LEN);
|
||||||
|
|
||||||
|
Ok(Self { entries })
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Build a manifest from a pre-built list of secret entries.
|
||||||
|
///
|
||||||
|
/// Applies the same post-processing as [`Manifest::build`]:
|
||||||
|
/// - Deduplicates entries with the same key and value.
|
||||||
|
/// - Sorts by value length descending (longest first for redaction).
|
||||||
|
/// - Removes entries with values shorter than `MIN_VALUE_LEN` (4) bytes.
|
||||||
|
///
|
||||||
|
/// Useful for testing and for library consumers that obtain secrets
|
||||||
|
/// from sources other than filesystem discovery.
|
||||||
|
pub fn from_entries(mut entries: Vec<SecretEntry>) -> Self {
|
||||||
|
// Deduplicate (same key + value).
|
||||||
|
entries.sort_by(|a, b| a.key.cmp(&b.key).then_with(|| a.value.cmp(&b.value)));
|
||||||
|
entries.dedup_by(|a, b| a.key == b.key && a.value == b.value);
|
||||||
|
|
||||||
|
// Sort by value length descending (longest first for redaction).
|
||||||
|
entries.sort_by(|a, b| b.value.len().cmp(&a.value.len()));
|
||||||
|
|
||||||
|
// Remove entries with very short values to avoid false replacements.
|
||||||
|
entries.retain(|e| e.value.len() >= MIN_VALUE_LEN);
|
||||||
|
|
||||||
|
Self { entries }
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Build an empty manifest (no secrets known).
|
||||||
|
pub fn empty() -> Self {
|
||||||
|
Self {
|
||||||
|
entries: Vec::new(),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Returns all discovered secret entries.
|
||||||
|
pub fn entries(&self) -> &[SecretEntry] {
|
||||||
|
&self.entries
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Returns `true` if the manifest contains no secrets.
|
||||||
|
pub fn is_empty(&self) -> bool {
|
||||||
|
self.entries.is_empty()
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Number of known secrets.
|
||||||
|
pub fn len(&self) -> usize {
|
||||||
|
self.entries.len()
|
||||||
|
}
|
||||||
|
}
|
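Review note: the post-processing shared by `Manifest::build` and `Manifest::from_entries` (deduplicate, sort longest-first, drop short values) can be shown in isolation, along with why the ordering matters. `Entry`, `entry`, `normalize` and `redact_naive` are hypothetical names; `redact_naive` uses sequential `str::replace` as a deliberately simple stand-in for the Aho-Corasick redactor, which is not shown in this part of the diff.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Entry {
    pub key: String,
    pub value: String,
}

/// Same threshold as the diff; compared against the byte length.
const MIN_VALUE_LEN: usize = 4;

pub fn entry(key: &str, value: &str) -> Entry {
    Entry { key: key.to_owned(), value: value.to_owned() }
}

/// Deduplicate, order longest value first, and drop very short values.
pub fn normalize(mut entries: Vec<Entry>) -> Vec<Entry> {
    // Sorting by (key, value) makes duplicates adjacent so `dedup` removes them.
    entries.sort_by(|a, b| a.key.cmp(&b.key).then_with(|| a.value.cmp(&b.value)));
    entries.dedup();
    // Stable sort: equal-length values keep their (key, value) order.
    entries.sort_by(|a, b| b.value.len().cmp(&a.value.len()));
    entries.retain(|e| e.value.len() >= MIN_VALUE_LEN);
    entries
}

/// Naive redaction: replace each known value in manifest order.
pub fn redact_naive(text: &str, entries: &[Entry]) -> String {
    entries
        .iter()
        .fold(text.to_owned(), |acc, e| acc.replace(e.value.as_str(), "****"))
}
```

If the shorter value were replaced first, a secret that contains it as a prefix would be only partly masked and its remaining suffix would leak; longest-first ordering avoids that in this naive scheme.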
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// File discovery
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Walk the project tree and collect files matching any of the given glob
|
||||||
|
/// patterns. Patterns are matched against paths *relative to* `root`.
|
||||||
|
fn discover_files(patterns: &[String], root: &Path) -> Result<Vec<PathBuf>, ManifestError> {
|
||||||
|
if patterns.is_empty() {
|
||||||
|
return Ok(Vec::new());
|
||||||
|
}
|
||||||
|
|
||||||
|
// Compile all patterns into a single GlobSet for efficient matching.
|
||||||
|
let mut builder = GlobSetBuilder::new();
|
||||||
|
for pat in patterns {
|
||||||
|
// `globset` patterns match against the full relative path including
|
||||||
|
// intermediate directories (e.g. `.docker/config.json`). We add
|
||||||
|
// both the literal pattern and a `**/` prefixed variant so that
|
||||||
|
// `.env` matches at the root and `subdir/.env` matches nested.
|
||||||
|
let glob = Glob::new(pat).map_err(|e| ManifestError::Glob(e.to_string()))?;
|
||||||
|
builder.add(glob);
|
||||||
|
|
||||||
|
// Also match nested occurrences: `**/<pattern>`.
|
||||||
|
if !pat.contains('/') {
|
||||||
|
let nested = format!("**/{pat}");
|
||||||
|
let nested_glob =
|
||||||
|
Glob::new(&nested).map_err(|e| ManifestError::Glob(e.to_string()))?;
|
||||||
|
builder.add(nested_glob);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
let glob_set = builder.build().map_err(|e| ManifestError::Glob(e.to_string()))?;
|
||||||
|
|
||||||
|
let mut result = Vec::new();
|
||||||
|
|
||||||
|
for entry in WalkDir::new(root).follow_links(false) {
|
||||||
|
let entry = match entry {
|
||||||
|
Ok(e) => e,
|
||||||
|
Err(_) => continue,
|
||||||
|
};
|
||||||
|
|
||||||
|
// Skip common large / non-project directories.
|
||||||
|
if entry.file_type().is_dir() {
|
||||||
|
if let Some(name) = entry.file_name().to_str() {
|
||||||
|
if SKIP_DIRS.contains(&name) {
|
||||||
|
// This `continue` skips only the directory entry itself; the walk
|
||||||
|
// still descends into it. Files under skipped directories are
|
||||||
|
// filtered out below by `has_skipped_ancestor`. Pruning the walk
|
||||||
|
// with WalkDir's `filter_entry` would avoid the descent entirely.
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
continue; // Only interested in files.
|
||||||
|
}
|
||||||
|
|
||||||
|
if !entry.file_type().is_file() {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Check that no ancestor directory is in the skip list.
|
||||||
|
let abs_path = entry.path();
|
||||||
|
if has_skipped_ancestor(abs_path, root) {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Match relative path against the glob set.
|
||||||
|
let rel = match abs_path.strip_prefix(root) {
|
||||||
|
Ok(r) => r,
|
||||||
|
Err(_) => continue,
|
||||||
|
};
|
||||||
|
|
||||||
|
if glob_set.is_match(rel) {
|
||||||
|
result.push(abs_path.to_path_buf());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(result)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Returns `true` if any path component between `root` and `path` is in
|
||||||
|
/// [`SKIP_DIRS`].
|
||||||
|
fn has_skipped_ancestor(path: &Path, root: &Path) -> bool {
|
||||||
|
if let Ok(rel) = path.strip_prefix(root) {
|
||||||
|
for component in rel.parent().into_iter().flat_map(|p| p.components()) {
|
||||||
|
if let Some(name) = component.as_os_str().to_str() {
|
||||||
|
if SKIP_DIRS.contains(&name) {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
false
|
||||||
|
}
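// Illustrative sketch, not part of this commit: the ancestor check above can
// be exercised on its own with only `std::path`. `SKIP_DIRS_EXAMPLE` and
// `has_skipped_ancestor_example` are hypothetical names, and the directory
// list is an assumption; the real `SKIP_DIRS` is defined elsewhere in this file.

```rust
use std::path::Path;

// Assumed skip list for this sketch only.
const SKIP_DIRS_EXAMPLE: &[&str] = &["node_modules", "target", ".git"];

/// Returns `true` if any directory component between `root` and `path`
/// (excluding the final file name) is in the skip list.
fn has_skipped_ancestor_example(path: &Path, root: &Path) -> bool {
    // Paths outside `root` have no relative form and are never skipped here.
    let Ok(rel) = path.strip_prefix(root) else {
        return false;
    };
    rel.parent()
        .into_iter()
        .flat_map(|p| p.components())
        .filter_map(|c| c.as_os_str().to_str())
        .any(|name| SKIP_DIRS_EXAMPLE.contains(&name))
}
```

// Only ancestors are inspected, so a file that is itself named `target`
// directly under `root` is not treated as skipped.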
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Single-file parsing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Parse a single discovered file. Auto-detects format from extension.
|
||||||
|
/// Returns an empty `Vec` if the format cannot be determined (e.g. `.key`,
|
||||||
|
/// `.pem` — opaque/binary files).
|
||||||
|
fn parse_discovered_file(path: &Path) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let format = match FileFormat::from_path(path) {
|
||||||
|
Some(fmt) => fmt,
|
||||||
|
None => return Ok(Vec::new()), // opaque file — skip
|
||||||
|
};
|
||||||
|
parser::parse_secret_file(path, Some(format), None)
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Key-pattern filtering
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Keep only entries whose key matches the effective key-include patterns
|
||||||
|
/// from the configuration.
|
||||||
|
fn filter_by_key_patterns(entries: Vec<SecretEntry>, config: &SecretsConfig) -> Vec<SecretEntry> {
|
||||||
|
entries
|
||||||
|
.into_iter()
|
||||||
|
.filter(|e| config.key_matches(&e.key))
|
||||||
|
.collect()
|
||||||
|
}
|
||||||
|
|
||||||
|
#[cfg(test)]
|
||||||
|
mod tests {
|
||||||
|
use super::*;
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn empty_manifest() {
|
||||||
|
let m = Manifest::empty();
|
||||||
|
assert!(m.is_empty());
|
||||||
|
assert_eq!(m.len(), 0);
|
||||||
|
assert!(m.entries().is_empty());
|
||||||
|
}
|
||||||
|
}
|
||||||
@@ -0,0 +1,15 @@
|
|||||||
|
//! Secret-filtering configuration (`.botsecrets` files), multi-format
|
||||||
|
//! secret file parsing, and heuristic scanning.
|
||||||
|
|
||||||
|
pub mod config;
|
||||||
|
pub mod manifest;
|
||||||
|
pub mod parser;
|
||||||
|
pub mod patterns;
|
||||||
|
pub mod redactor;
|
||||||
|
pub mod scanner;
|
||||||
|
|
||||||
|
pub use config::SecretsConfig;
|
||||||
|
pub use manifest::{Manifest, ManifestError};
|
||||||
|
pub use parser::{parse_secret_file, FileFormat, ParseError, SecretEntry};
|
||||||
|
pub use redactor::{RedactedText, Redaction, Redactor};
|
||||||
|
pub use scanner::{Confidence, Finding, Scanner};
|
||||||
@@ -0,0 +1,517 @@
|
|||||||
|
//! Multi-format secret file parser.
|
||||||
|
//!
|
||||||
|
//! Reads secret files (`.env`, TOML, JSON, YAML, Python assignments,
|
||||||
|
//! Java `.properties`) and extracts key-value pairs as [`SecretEntry`] items.
|
||||||
|
//! Nested structures are flattened with dot-separated keys.
|
||||||
|
|
||||||
|
use globset::Glob;
|
||||||
|
use std::path::{Path, PathBuf};
|
||||||
|
use thiserror::Error;
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Errors
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[derive(Debug, Error)]
|
||||||
|
pub enum ParseError {
|
||||||
|
#[error("io error reading {path}: {source}")]
|
||||||
|
Io {
|
||||||
|
path: PathBuf,
|
||||||
|
source: std::io::Error,
|
||||||
|
},
|
||||||
|
#[error("parse error in {path}: {message}")]
|
||||||
|
Format { path: PathBuf, message: String },
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Types
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// A single secret extracted from a file.
|
||||||
|
#[derive(Debug, Clone, PartialEq, Eq)]
|
||||||
|
pub struct SecretEntry {
|
||||||
|
/// The key name (e.g. `"DATABASE_URL"`, `"spring.datasource.password"`).
|
||||||
|
pub key: String,
|
||||||
|
/// The secret value.
|
||||||
|
pub value: String,
|
||||||
|
/// Which file the entry came from.
|
||||||
|
pub source: PathBuf,
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Supported secret-file formats.
|
||||||
|
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||||
|
pub enum FileFormat {
|
||||||
|
/// `.env` / dotenv files.
|
||||||
|
Env,
|
||||||
|
/// TOML files (e.g. `Secrets.toml`).
|
||||||
|
Toml,
|
||||||
|
/// JSON files.
|
||||||
|
Json,
|
||||||
|
/// YAML files.
|
||||||
|
Yaml,
|
||||||
|
/// Python-style assignments: `KEY = "value"` or `KEY = 'value'`.
|
||||||
|
PythonAssignments,
|
||||||
|
/// Java `.properties` files: `key=value` or `key: value`.
|
||||||
|
Properties,
|
||||||
|
}
|
||||||
|
|
||||||
|
impl FileFormat {
|
||||||
|
/// Guess format from file extension/name.
|
||||||
|
pub fn from_path(path: &Path) -> Option<Self> {
|
||||||
|
let name = path.file_name()?.to_str()?;
|
||||||
|
let ext = path.extension().and_then(|e| e.to_str());
|
||||||
|
|
||||||
|
// .env, .env.local, .env.production, etc.
|
||||||
|
if name.starts_with(".env") || name.ends_with(".env") {
|
||||||
|
return Some(Self::Env);
|
||||||
|
}
|
||||||
|
|
||||||
|
match ext {
|
||||||
|
Some("toml") => Some(Self::Toml),
|
||||||
|
Some("json") => Some(Self::Json),
|
||||||
|
Some("yaml" | "yml") => Some(Self::Yaml),
|
||||||
|
Some("py") => Some(Self::PythonAssignments),
|
||||||
|
Some("properties") => Some(Self::Properties),
|
||||||
|
// .key, .pem, etc. are binary/opaque — not parseable as key-value.
|
||||||
|
_ => None,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse from the `format` string used in `.botsecrets` `[[file]]` overrides.
|
||||||
|
pub fn from_hint(hint: &str) -> Option<Self> {
|
||||||
|
match hint {
|
||||||
|
"env" | "dotenv" => Some(Self::Env),
|
||||||
|
"toml" => Some(Self::Toml),
|
||||||
|
"json" => Some(Self::Json),
|
||||||
|
"yaml" | "yml" => Some(Self::Yaml),
|
||||||
|
"python-assignments" | "python" => Some(Self::PythonAssignments),
|
||||||
|
"properties" | "java-properties" => Some(Self::Properties),
|
||||||
|
_ => None,
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Public API
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Parse a secret file and extract key-value entries.
|
||||||
|
///
|
||||||
|
/// If `format` is `None`, auto-detects from path extension.
|
||||||
|
/// If `key_filter` is `Some`, only entries whose keys match at least one
|
||||||
|
/// glob pattern are returned.
|
||||||
|
pub fn parse_secret_file(
|
||||||
|
path: &Path,
|
||||||
|
format: Option<FileFormat>,
|
||||||
|
key_filter: Option<&[String]>,
|
||||||
|
) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let content = std::fs::read_to_string(path).map_err(|e| ParseError::Io {
|
||||||
|
path: path.to_path_buf(),
|
||||||
|
source: e,
|
||||||
|
})?;
|
||||||
|
|
||||||
|
let fmt = format
|
||||||
|
.or_else(|| FileFormat::from_path(path))
|
||||||
|
.ok_or_else(|| ParseError::Format {
|
||||||
|
path: path.to_path_buf(),
|
||||||
|
message: "cannot determine file format".into(),
|
||||||
|
})?;
|
||||||
|
|
||||||
|
let entries = parse_content(&content, fmt, path)?;
|
||||||
|
|
||||||
|
match key_filter {
|
||||||
|
Some(keys) => Ok(filter_entries(entries, keys)),
|
||||||
|
None => Ok(entries),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse content string without reading from disk (useful for testing).
|
||||||
|
pub fn parse_content(
|
||||||
|
content: &str,
|
||||||
|
format: FileFormat,
|
||||||
|
source: &Path,
|
||||||
|
) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
match format {
|
||||||
|
FileFormat::Env => parse_env(content, source),
|
||||||
|
FileFormat::Toml => parse_toml(content, source),
|
||||||
|
FileFormat::Json => parse_json(content, source),
|
||||||
|
FileFormat::Yaml => parse_yaml(content, source),
|
||||||
|
FileFormat::PythonAssignments => parse_python_assignments(content, source),
|
||||||
|
FileFormat::Properties => parse_properties(content, source),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Format parsers
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Parse `.env` / dotenv files.
|
||||||
|
///
|
||||||
|
/// Supports `KEY=VALUE`, `KEY="VALUE"`, `KEY='VALUE'`, and the `export`
|
||||||
|
/// prefix. Comments (`#`) and empty lines are skipped.
|
||||||
|
fn parse_env(content: &str, source: &Path) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let mut entries = Vec::new();
|
||||||
|
|
||||||
|
for line in content.lines() {
|
||||||
|
let trimmed = line.trim();
|
||||||
|
|
||||||
|
// Skip blank lines and comments.
|
||||||
|
if trimmed.is_empty() || trimmed.starts_with('#') {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Strip optional `export ` prefix.
|
||||||
|
let trimmed = trimmed
|
||||||
|
.strip_prefix("export ")
|
||||||
|
.or_else(|| trimmed.strip_prefix("export\t"))
|
||||||
|
.unwrap_or(trimmed);
|
||||||
|
|
||||||
|
// Split on first `=`.
|
||||||
|
let Some((key, raw_value)) = trimmed.split_once('=') else {
|
||||||
|
continue;
|
||||||
|
};
|
||||||
|
|
||||||
|
let key = key.trim().to_string();
|
||||||
|
if key.is_empty() {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
let value = strip_env_value(raw_value);
|
||||||
|
|
||||||
|
entries.push(SecretEntry {
|
||||||
|
key,
|
||||||
|
value,
|
||||||
|
source: source.to_path_buf(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(entries)
|
||||||
|
}
|
||||||
|
|
||||||
|
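/// Normalize a raw env value: trim whitespace, strip surrounding quotes,
/// expand `\n` and `\t` in double-quoted values, and drop an inline ` #`
/// comment from unquoted values.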
|
||||||
|
fn strip_env_value(raw: &str) -> String {
|
||||||
|
let trimmed = raw.trim();
|
||||||
|
|
||||||
|
// Double-quoted value.
|
||||||
|
if trimmed.starts_with('"') && trimmed.ends_with('"') && trimmed.len() >= 2 {
|
||||||
|
let inner = &trimmed[1..trimmed.len() - 1];
|
||||||
|
// Interpret common escape sequences.
|
||||||
|
return inner.replace("\\n", "\n").replace("\\t", "\t");
|
||||||
|
}
|
||||||
|
|
||||||
|
// Single-quoted value (literal, no escapes).
|
||||||
|
if trimmed.starts_with('\'') && trimmed.ends_with('\'') && trimmed.len() >= 2 {
|
||||||
|
return trimmed[1..trimmed.len() - 1].to_string();
|
||||||
|
}
|
||||||
|
|
||||||
|
// Unquoted — trim trailing whitespace (already trimmed above) and strip
|
||||||
|
// inline comments.
|
||||||
|
if let Some(pos) = trimmed.find(" #") {
|
||||||
|
trimmed[..pos].trim_end().to_string()
|
||||||
|
} else {
|
||||||
|
trimmed.to_string()
|
||||||
|
}
|
||||||
|
}
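// Illustrative sketch, not part of this commit: `strip_env_value` expands
// `\n` and `\t` with two chained `replace` calls, so an escaped backslash
// followed by `n` (written `\\n` in the file) would also become a newline.
// Assuming one wanted to honour `\\` and `\"` as well, a single left-to-right
// pass avoids that. `unescape_double_quoted` is a hypothetical helper name.

```rust
/// Single-pass unescape for the inside of a double-quoted dotenv value.
/// Hypothetical helper for illustration only.
fn unescape_double_quoted(inner: &str) -> String {
    let mut out = String::with_capacity(inner.len());
    let mut chars = inner.chars();
    while let Some(c) = chars.next() {
        if c != '\\' {
            out.push(c);
            continue;
        }
        // `c` is a backslash: decide based on the character that follows.
        match chars.next() {
            Some('n') => out.push('\n'),
            Some('t') => out.push('\t'),
            Some('\\') => out.push('\\'),
            Some('"') => out.push('"'),
            // Unknown escape: keep both characters unchanged.
            Some(other) => {
                out.push('\\');
                out.push(other);
            }
            // Trailing lone backslash: keep it.
            None => out.push('\\'),
        }
    }
    out
}
```

// Whether this matters depends on how the consuming application loads its
// dotenv file; the redactor can only match values that are unescaped the
// same way the application sees them.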
|
||||||
|
|
||||||
|
/// Parse TOML files. Nested tables are flattened with dot separators.
|
||||||
|
/// Only string values are extracted.
|
||||||
|
fn parse_toml(content: &str, source: &Path) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let table: toml::Value = content.parse().map_err(|e: toml::de::Error| ParseError::Format {
|
||||||
|
path: source.to_path_buf(),
|
||||||
|
message: e.to_string(),
|
||||||
|
})?;
|
||||||
|
|
||||||
|
let mut entries = Vec::new();
|
||||||
|
flatten_toml_value(&table, "", source, &mut entries);
|
||||||
|
Ok(entries)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn flatten_toml_value(
|
||||||
|
value: &toml::Value,
|
||||||
|
prefix: &str,
|
||||||
|
source: &Path,
|
||||||
|
entries: &mut Vec<SecretEntry>,
|
||||||
|
) {
|
||||||
|
match value {
|
||||||
|
toml::Value::String(s) => {
|
||||||
|
if !prefix.is_empty() {
|
||||||
|
entries.push(SecretEntry {
|
||||||
|
key: prefix.to_string(),
|
||||||
|
value: s.clone(),
|
||||||
|
source: source.to_path_buf(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
toml::Value::Table(map) => {
|
||||||
|
for (k, v) in map {
|
||||||
|
let key = if prefix.is_empty() {
|
||||||
|
k.clone()
|
||||||
|
} else {
|
||||||
|
format!("{prefix}.{k}")
|
||||||
|
};
|
||||||
|
flatten_toml_value(v, &key, source, entries);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
toml::Value::Array(arr) => {
|
||||||
|
for (i, v) in arr.iter().enumerate() {
|
||||||
|
let key = if prefix.is_empty() {
|
||||||
|
i.to_string()
|
||||||
|
} else {
|
||||||
|
format!("{prefix}.{i}")
|
||||||
|
};
|
||||||
|
flatten_toml_value(v, &key, source, entries);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Integer, Float, Boolean, Datetime — skip, not secrets.
|
||||||
|
_ => {}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse JSON files. Nested objects are flattened with dot separators.
|
||||||
|
/// Arrays use numeric indices. Only string values are extracted.
|
||||||
|
fn parse_json(content: &str, source: &Path) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let value: serde_json::Value =
|
||||||
|
serde_json::from_str(content).map_err(|e| ParseError::Format {
|
||||||
|
path: source.to_path_buf(),
|
||||||
|
message: e.to_string(),
|
||||||
|
})?;
|
||||||
|
|
||||||
|
let mut entries = Vec::new();
|
||||||
|
flatten_json_value(&value, "", source, &mut entries);
|
||||||
|
Ok(entries)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn flatten_json_value(
|
||||||
|
value: &serde_json::Value,
|
||||||
|
prefix: &str,
|
||||||
|
source: &Path,
|
||||||
|
entries: &mut Vec<SecretEntry>,
|
||||||
|
) {
|
||||||
|
match value {
|
||||||
|
serde_json::Value::String(s) => {
|
||||||
|
if !prefix.is_empty() {
|
||||||
|
entries.push(SecretEntry {
|
||||||
|
key: prefix.to_string(),
|
||||||
|
value: s.clone(),
|
||||||
|
source: source.to_path_buf(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
serde_json::Value::Object(map) => {
|
||||||
|
for (k, v) in map {
|
||||||
|
let key = if prefix.is_empty() {
|
||||||
|
k.clone()
|
||||||
|
} else {
|
||||||
|
format!("{prefix}.{k}")
|
||||||
|
};
|
||||||
|
flatten_json_value(v, &key, source, entries);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
serde_json::Value::Array(arr) => {
|
||||||
|
for (i, v) in arr.iter().enumerate() {
|
||||||
|
let key = if prefix.is_empty() {
|
||||||
|
i.to_string()
|
||||||
|
} else {
|
||||||
|
format!("{prefix}.{i}")
|
||||||
|
};
|
||||||
|
flatten_json_value(v, &key, source, entries);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Number, Bool, Null — skip.
|
||||||
|
_ => {}
|
||||||
|
}
|
||||||
|
}
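// Illustrative sketch, not part of this commit: the TOML, JSON, and YAML
// flatteners all follow the same recursion. The stand-in `Node` enum and the
// `flatten` function below are hypothetical and use only `std`, so the
// dot-separated key scheme can be seen without the parser crates.

```rust
/// Minimal tree type standing in for a parsed TOML/JSON/YAML document.
enum Node {
    Str(String),
    Map(Vec<(String, Node)>),
    List(Vec<Node>),
    /// Numbers, booleans, null and similar: never treated as secrets.
    Other,
}

/// Collect `(dotted_key, value)` pairs for every string leaf under `node`.
fn flatten(node: &Node, prefix: &str, out: &mut Vec<(String, String)>) {
    // Join the current prefix and the next path segment with a dot.
    let join = |segment: &str| -> String {
        if prefix.is_empty() {
            segment.to_string()
        } else {
            format!("{prefix}.{segment}")
        }
    };
    match node {
        Node::Str(s) => {
            // A bare top-level string has no key, so it is skipped.
            if !prefix.is_empty() {
                out.push((prefix.to_string(), s.clone()));
            }
        }
        Node::Map(entries) => {
            for (k, v) in entries {
                flatten(v, &join(k.as_str()), out);
            }
        }
        Node::List(items) => {
            // List elements use their numeric index as the path segment.
            for (i, v) in items.iter().enumerate() {
                flatten(v, &join(&i.to_string()), out);
            }
        }
        Node::Other => {}
    }
}
```

// With this scheme a nested `db.password` entry and list items such as
// `tokens.0` become flat keys that the key-pattern filter can match.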
|
||||||
|
|
||||||
|
/// Parse YAML files. Nested mappings are flattened with dot separators.
|
||||||
|
/// Only string values are extracted.
|
||||||
|
fn parse_yaml(content: &str, source: &Path) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let value: serde_yaml::Value =
|
||||||
|
serde_yaml::from_str(content).map_err(|e| ParseError::Format {
|
||||||
|
path: source.to_path_buf(),
|
||||||
|
message: e.to_string(),
|
||||||
|
})?;
|
||||||
|
|
||||||
|
let mut entries = Vec::new();
|
||||||
|
flatten_yaml_value(&value, "", source, &mut entries);
|
||||||
|
Ok(entries)
|
||||||
|
}
|
||||||
|
|
||||||
|
fn flatten_yaml_value(
|
||||||
|
value: &serde_yaml::Value,
|
||||||
|
prefix: &str,
|
||||||
|
source: &Path,
|
||||||
|
entries: &mut Vec<SecretEntry>,
|
||||||
|
) {
|
||||||
|
match value {
|
||||||
|
serde_yaml::Value::String(s) => {
|
||||||
|
if !prefix.is_empty() {
|
||||||
|
entries.push(SecretEntry {
|
||||||
|
key: prefix.to_string(),
|
||||||
|
value: s.clone(),
|
||||||
|
source: source.to_path_buf(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
serde_yaml::Value::Mapping(map) => {
|
||||||
|
for (k, v) in map {
|
||||||
|
let k_str = match k {
|
||||||
|
serde_yaml::Value::String(s) => s.clone(),
|
||||||
|
other => format!("{other:?}"),
|
||||||
|
};
|
||||||
|
let key = if prefix.is_empty() {
|
||||||
|
k_str
|
||||||
|
} else {
|
||||||
|
format!("{prefix}.{k_str}")
|
||||||
|
};
|
||||||
|
flatten_yaml_value(v, &key, source, entries);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
serde_yaml::Value::Sequence(arr) => {
|
||||||
|
for (i, v) in arr.iter().enumerate() {
|
||||||
|
let key = if prefix.is_empty() {
|
||||||
|
i.to_string()
|
||||||
|
} else {
|
||||||
|
format!("{prefix}.{i}")
|
||||||
|
};
|
||||||
|
flatten_yaml_value(v, &key, source, entries);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// Number, Bool, Null, Tagged — skip.
|
||||||
|
_ => {}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse Python-style assignment lines: `KEY = "value"` or `KEY = 'value'`.
|
||||||
|
///
|
||||||
|
/// This is heuristic — lines that don't match the pattern are silently skipped.
|
||||||
|
fn parse_python_assignments(
|
||||||
|
content: &str,
|
||||||
|
source: &Path,
|
||||||
|
) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let re = regex::Regex::new(r#"(?i)^([A-Z_][A-Z0-9_]*)\s*=\s*['"](.+?)['"]\s*$"#)
|
||||||
|
.expect("valid regex");
|
||||||
|
|
||||||
|
let mut entries = Vec::new();
|
||||||
|
|
||||||
|
for line in content.lines() {
|
||||||
|
let trimmed = line.trim();
|
||||||
|
if trimmed.is_empty() || trimmed.starts_with('#') {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if let Some(caps) = re.captures(trimmed) {
|
||||||
|
entries.push(SecretEntry {
|
||||||
|
key: caps[1].to_string(),
|
||||||
|
value: caps[2].to_string(),
|
||||||
|
source: source.to_path_buf(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(entries)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Parse Java `.properties` files.
|
||||||
|
///
|
||||||
|
/// Supports `key=value`, `key: value`, `key value` (space separator).
|
||||||
|
/// Lines starting with `#` or `!` are comments. Continuation lines ending
|
||||||
|
/// with `\` are joined.
|
||||||
|
fn parse_properties(content: &str, source: &Path) -> Result<Vec<SecretEntry>, ParseError> {
|
||||||
|
let mut entries = Vec::new();
|
||||||
|
let mut lines = content.lines().peekable();
|
||||||
|
|
||||||
|
while let Some(line) = lines.next() {
|
||||||
|
let trimmed = line.trim();
|
||||||
|
|
||||||
|
// Skip blank lines and comments.
|
||||||
|
if trimmed.is_empty() || trimmed.starts_with('#') || trimmed.starts_with('!') {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Handle continuation lines (trailing `\`).
|
||||||
|
let mut logical_line = String::new();
|
||||||
|
let mut current = trimmed.to_string();
|
||||||
|
while current.ends_with('\\') {
|
||||||
|
// Remove trailing backslash and append next line.
|
||||||
|
logical_line.push_str(&current[..current.len() - 1]);
|
||||||
|
current = lines
|
||||||
|
.next()
|
||||||
|
.map(|l| l.trim_start().to_string())
|
||||||
|
.unwrap_or_default();
|
||||||
|
}
|
||||||
|
logical_line.push_str(&current);
|
||||||
|
|
||||||
|
// Split on first `=`, `:`, or whitespace.
|
||||||
|
let (key, value) = split_property_line(&logical_line);
|
||||||
|
if key.is_empty() {
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
|
||||||
|
entries.push(SecretEntry {
|
||||||
|
key,
|
||||||
|
value,
|
||||||
|
source: source.to_path_buf(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
Ok(entries)
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Split a logical properties line into (key, value).
|
||||||
|
/// Recognises `=`, `:`, or whitespace as the separator.
|
||||||
|
fn split_property_line(line: &str) -> (String, String) {
|
||||||
|
// Find the first unescaped separator.
|
||||||
|
let mut i = 0;
|
||||||
|
let bytes = line.as_bytes();
|
||||||
|
let len = bytes.len();
|
||||||
|
|
||||||
|
while i < len {
|
||||||
|
// Skip escaped characters.
|
||||||
|
if bytes[i] == b'\\' {
|
||||||
|
i += 2;
|
||||||
|
continue;
|
||||||
|
}
|
||||||
|
if bytes[i] == b'=' || bytes[i] == b':' {
|
||||||
|
let key = line[..i].trim().to_string();
|
||||||
|
let value = line[i + 1..].trim().to_string();
|
||||||
|
return (key, value);
|
||||||
|
}
|
||||||
|
if bytes[i] == b' ' || bytes[i] == b'\t' {
|
||||||
|
let key = line[..i].trim().to_string();
|
||||||
|
let value = line[i..].trim().to_string();
|
||||||
|
return (key, value);
|
||||||
|
}
|
||||||
|
i += 1;
|
||||||
|
}
|
||||||
|
|
||||||
|
// No separator found — the entire line is a key with an empty value.
|
||||||
|
(line.trim().to_string(), String::new())
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Key filtering
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
/// Filter entries by glob patterns (case-insensitive). Patterns that fail
/// to compile are ignored; if none compile, no entries are kept.
|
||||||
|
fn filter_entries(entries: Vec<SecretEntry>, patterns: &[String]) -> Vec<SecretEntry> {
|
||||||
|
// Pre-compile matchers.
|
||||||
|
let matchers: Vec<_> = patterns
|
||||||
|
.iter()
|
||||||
|
.filter_map(|p| {
|
||||||
|
Glob::new(&p.to_ascii_uppercase())
|
||||||
|
.ok()
|
||||||
|
.map(|g| g.compile_matcher())
|
||||||
|
})
|
||||||
|
.collect();
|
||||||
|
|
||||||
|
if matchers.is_empty() {
|
||||||
|
return Vec::new();
|
||||||
|
}
|
||||||
|
|
||||||
|
entries
|
||||||
|
.into_iter()
|
||||||
|
.filter(|entry| {
|
||||||
|
let upper = entry.key.to_ascii_uppercase();
|
||||||
|
matchers.iter().any(|m| m.is_match(&upper))
|
||||||
|
})
|
||||||
|
.collect()
|
||||||
|
}
|
||||||
@@ -0,0 +1,258 @@
|
|||||||
|
//! Built-in regex patterns for heuristic secret detection.
|
||||||
|
//!
|
||||||
|
//! Rules are derived from [gitleaks](https://github.com/gitleaks/gitleaks) (MIT license)
|
||||||
|
//! and curated for high-confidence detection in AI agent output streams.
|
||||||
|
|
||||||
|
use std::borrow::Cow;
|
||||||
|
|
||||||
|
/// A single detection rule.
|
||||||
|
#[derive(Debug, Clone)]
|
||||||
|
pub struct DetectionRule {
|
||||||
|
/// Unique identifier (e.g. `"aws-access-key"`, `"github-pat"`).
|
||||||
|
pub id: Cow<'static, str>,
|
||||||
|
/// Human-readable description.
|
||||||
|
pub description: Cow<'static, str>,
|
||||||
|
/// Regex pattern string.
|
||||||
|
pub pattern: Cow<'static, str>,
|
||||||
|
/// Minimum Shannon entropy threshold for matched text.
|
||||||
|
/// `None` means no entropy check — the pattern alone is sufficient.
|
||||||
|
pub entropy_threshold: Option<f64>,
|
||||||
|
}
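// Illustrative sketch, not part of this file: how `scanner.rs` applies
// `entropy_threshold` is not shown here. One plausible reading computes the
// Shannon entropy H = -sum(p_i * log2(p_i)) over the bytes of the matched
// text and keeps the finding only when H meets the threshold. The names
// `shannon_entropy` and `passes_entropy_gate` are hypothetical.

```rust
use std::collections::HashMap;

/// Shannon entropy of `s`, in bits per byte. Returns 0.0 for empty input.
fn shannon_entropy(s: &str) -> f64 {
    if s.is_empty() {
        return 0.0;
    }
    // Count how often each byte value occurs.
    let mut counts: HashMap<u8, usize> = HashMap::new();
    for &b in s.as_bytes() {
        *counts.entry(b).or_insert(0) += 1;
    }
    let len = s.len() as f64;
    counts
        .values()
        .map(|&n| {
            let p = n as f64 / len;
            -p * p.log2()
        })
        .sum()
}

/// `None` means the pattern alone is sufficient; otherwise the matched text
/// must reach the minimum entropy.
fn passes_entropy_gate(matched: &str, threshold: Option<f64>) -> bool {
    match threshold {
        Some(min) => shannon_entropy(matched) >= min,
        None => true,
    }
}
```

// A run of one repeated character has entropy 0 and fails any positive
// threshold, while 16 distinct characters reach 4.0 bits and would pass a
// gate of 3.5.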
|
||||||
|
|
||||||
|
/// Returns the built-in detection rules (gitleaks-derived, MIT licensed).
|
||||||
|
pub fn builtin_rules() -> &'static [DetectionRule] {
|
||||||
|
&RULES
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Convenience macro to define a static rule with `Cow::Borrowed`.
|
||||||
|
macro_rules! rule {
|
||||||
|
($id:expr, $desc:expr, $pat:expr) => {
|
||||||
|
DetectionRule {
|
||||||
|
id: Cow::Borrowed($id),
|
||||||
|
description: Cow::Borrowed($desc),
|
||||||
|
pattern: Cow::Borrowed($pat),
|
||||||
|
entropy_threshold: None,
|
||||||
|
}
|
||||||
|
};
|
||||||
|
($id:expr, $desc:expr, $pat:expr, entropy: $threshold:expr) => {
|
||||||
|
DetectionRule {
|
||||||
|
id: Cow::Borrowed($id),
|
||||||
|
description: Cow::Borrowed($desc),
|
||||||
|
pattern: Cow::Borrowed($pat),
|
||||||
|
entropy_threshold: Some($threshold),
|
||||||
|
}
|
||||||
|
};
|
||||||
|
}
|
||||||
|
|
||||||
|
static RULES: [DetectionRule; 35] = [
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Cloud provider keys
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"aws-access-key",
|
||||||
|
"AWS Access Key ID",
|
||||||
|
r"(?:A3T[A-Z0-9]|AKIA|ASIA|ABIA|ACCA)[A-Z2-7]{16}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"aws-secret-key",
|
||||||
|
"AWS Secret Access Key (near assignment)",
|
||||||
|
r"(?i)aws[_\-\.]?secret[_\-\.]?access[_\-\.]?key[\s]*[=:\s]+[\s]*['\x22]?([A-Za-z0-9/+=]{40})['\x22]?"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"gcp-api-key",
|
||||||
|
"GCP API Key",
|
||||||
|
r"AIza[0-9A-Za-z\-_]{35}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"gcp-service-account",
|
||||||
|
"GCP Service Account JSON",
|
||||||
|
r#"\x22type\x22\s*:\s*\x22service_account\x22"#
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Code hosting tokens
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"github-pat-fine-grained",
|
||||||
|
"GitHub Fine-Grained Personal Access Token",
|
||||||
|
r"github_pat_[A-Za-z0-9_]{82}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"github-pat-classic",
|
||||||
|
"GitHub Classic Personal Access Token",
|
||||||
|
r"ghp_[A-Za-z0-9]{36}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"github-oauth",
|
||||||
|
"GitHub OAuth Access Token",
|
||||||
|
r"gho_[A-Za-z0-9]{36}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"github-app-user-token",
|
||||||
|
"GitHub App User-to-Server Token",
|
||||||
|
r"ghu_[A-Za-z0-9]{36}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"github-app-server-token",
|
||||||
|
"GitHub App Server-to-Server Token",
|
||||||
|
r"ghs_[A-Za-z0-9]{36}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"gitlab-pat",
|
||||||
|
"GitLab Personal Access Token",
|
||||||
|
r"glpat-[A-Za-z0-9\-_]{20,}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"gitlab-pipeline-token",
|
||||||
|
"GitLab Pipeline Trigger Token",
|
||||||
|
r"glptt-[A-Za-z0-9\-_]{20,}"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Payment
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"stripe-secret-key",
|
||||||
|
"Stripe Secret Key",
|
||||||
|
r"sk_live_[A-Za-z0-9]{24,}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"stripe-restricted-key",
|
||||||
|
"Stripe Restricted Key",
|
||||||
|
r"rk_live_[A-Za-z0-9]{24,}"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Communication
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"slack-bot-token",
|
||||||
|
"Slack Bot Token",
|
||||||
|
r"xoxb-[0-9]{10,}-[0-9]{10,}-[A-Za-z0-9]{24,}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"slack-user-token",
|
||||||
|
"Slack User Token",
|
||||||
|
r"xoxp-[0-9]{10,}-[0-9]{10,}-[0-9]{10,}-[a-z0-9]{32}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"slack-webhook",
|
||||||
|
"Slack Incoming Webhook URL",
|
||||||
|
r"https://hooks\.slack\.com/services/T[A-Z0-9]{8,}/B[A-Z0-9]{8,}/[A-Za-z0-9]{24,}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"twilio-api-key",
|
||||||
|
"Twilio API Key",
|
||||||
|
r"SK[a-f0-9]{32}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"sendgrid-api-key",
|
||||||
|
"SendGrid API Key",
|
||||||
|
r"SG\.[A-Za-z0-9_\-]{22}\.[A-Za-z0-9_\-]{43}"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Auth / Identity
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"jwt",
|
||||||
|
"JSON Web Token",
|
||||||
|
r"eyJ[A-Za-z0-9_-]{10,}\.eyJ[A-Za-z0-9_-]{10,}\.[A-Za-z0-9_\-]{10,}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"bearer-token",
|
||||||
|
"Bearer Token in Authorization Header",
|
||||||
|
r"(?i)bearer\s+[A-Za-z0-9\-._~+/]+=*"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Cryptographic material
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"private-key-header",
|
||||||
|
"Private Key (PEM Header)",
|
||||||
|
r"-----BEGIN\s?(?:RSA |DSA |EC |PGP |OPENSSH )?PRIVATE KEY-----"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"pgp-private-key",
|
||||||
|
"PGP Private Key Block",
|
||||||
|
r"-----BEGIN PGP PRIVATE KEY BLOCK-----"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Database
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"database-connection-url",
|
||||||
|
"Database Connection URL with Credentials",
|
||||||
|
r"(?i)(?:postgres|mysql|mongodb|redis|amqp)://[^:\s]+:[^@\s]+@[^\s]+"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Infrastructure
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"heroku-api-key",
|
||||||
|
"Heroku API Key",
|
||||||
|
r"(?i)heroku[_\-\.]?api[_\-\.]?key[\s]*[=:\s]+[\s]*[A-Fa-f0-9]{8}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{4}-[A-Fa-f0-9]{12}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"npm-token",
|
||||||
|
"npm Access Token",
|
||||||
|
r"(?i)npm_[A-Za-z0-9]{36}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"pypi-token",
|
||||||
|
"PyPI API Token",
|
||||||
|
r"pypi-[A-Za-z0-9_\-]{50,}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"docker-hub-token",
|
||||||
|
"Docker Hub Personal Access Token",
|
||||||
|
r"dckr_pat_[A-Za-z0-9_\-]{27,}"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// AI / ML
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"openai-api-key-legacy",
|
||||||
|
"OpenAI API Key (Legacy Format)",
|
||||||
|
r"sk-[A-Za-z0-9]{20}T3BlbkFJ[A-Za-z0-9]{20}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"openai-project-key",
|
||||||
|
"OpenAI Project API Key",
|
||||||
|
r"sk-proj-[A-Za-z0-9\-_]{40,}"
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"anthropic-api-key",
|
||||||
|
"Anthropic API Key",
|
||||||
|
r"sk-ant-[A-Za-z0-9\-_]{40,}"
|
||||||
|
),
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
// Generic patterns (entropy-gated)
|
||||||
|
// -----------------------------------------------------------------------
|
||||||
|
rule!(
|
||||||
|
"generic-api-key",
|
||||||
|
"Generic API Key Assignment",
|
||||||
|
r"(?i)(?:api[_\-]?key|apikey)[\s]*[=:]\s*['\x22]?([A-Za-z0-9_\-]{20,})['\x22]?",
|
||||||
|
entropy: 3.5
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"generic-secret",
|
||||||
|
"Generic Secret/Password/Token Assignment",
|
||||||
|
r"(?i)(?:secret|password|passwd|token)[\s]*[=:]\s*['\x22]?([^\s'\x22]{8,})['\x22]?",
|
||||||
|
entropy: 3.0
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"generic-private-key",
|
||||||
|
"Generic Private Key Assignment",
|
||||||
|
r"(?i)private[_\-]?key[\s]*[=:]\s*['\x22]?([^\s'\x22]{20,})['\x22]?",
|
||||||
|
entropy: 3.5
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"high-entropy-hex",
|
||||||
|
"High-Entropy Hex String (32+ chars)",
|
||||||
|
r"(?i)[=:]\s*['\x22]?([0-9a-f]{32,})['\x22]?",
|
||||||
|
entropy: 3.5
|
||||||
|
),
|
||||||
|
rule!(
|
||||||
|
"high-entropy-base64",
|
||||||
|
"High-Entropy Base64 String (24+ chars)",
|
||||||
|
r"(?i)[=:]\s*['\x22]?([A-Za-z0-9+/]{24,}={0,3})['\x22]?",
|
||||||
|
entropy: 4.0
|
||||||
|
),
|
||||||
|
];
|
||||||
@@ -0,0 +1,172 @@
//! Secret value redactor.
//!
//! Takes the known-secrets [`Manifest`] and efficiently replaces every
//! occurrence of a secret value in arbitrary text using an Aho-Corasick
//! automaton for multi-pattern matching.

use aho_corasick::AhoCorasick;

use super::config::RedactionStyle;
use super::manifest::Manifest;

// ---------------------------------------------------------------------------
// Output types
// ---------------------------------------------------------------------------

/// A redaction event -- records what was replaced and where.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Redaction {
    /// The key name of the redacted secret.
    pub key: String,
    /// Byte offset in the *original* text where the match starts.
    pub offset: usize,
    /// Length (in bytes) of the original secret value that was replaced.
    pub original_len: usize,
}

/// The result of redacting text.
#[derive(Debug, Clone)]
pub struct RedactedText {
    /// The text with secret values replaced.
    pub text: String,
    /// List of redactions that were applied (in order of occurrence).
    pub redactions: Vec<Redaction>,
}

impl RedactedText {
    /// Returns `true` if any redactions were made.
    pub fn was_redacted(&self) -> bool {
        !self.redactions.is_empty()
    }
}

// ---------------------------------------------------------------------------
// Redactor
// ---------------------------------------------------------------------------

/// Replaces known secret values in text with configurable placeholders.
///
/// Construction is cheap when the manifest is empty and O(n) in the total
/// length of secret values otherwise (Aho-Corasick automaton build).
/// Redaction itself is O(n) in the length of the input text.
#[derive(Debug, Clone)]
pub struct Redactor {
    /// The Aho-Corasick automaton for multi-pattern matching.
    /// `None` when the manifest is empty (no-op fast path).
    automaton: Option<AhoCorasick>,
    /// Secret entries parallel to the automaton patterns.
    /// Index `i` in the automaton corresponds to `entries[i]`.
    entries: Vec<RedactorEntry>,
    /// How to format replacements.
    style: RedactionStyle,
}

/// Internal entry -- stores info needed for replacement formatting.
#[derive(Debug, Clone)]
struct RedactorEntry {
    key: String,
    value_len: usize,
}

impl Redactor {
    /// Build a redactor from a manifest and redaction style.
    ///
    /// The manifest entries are already sorted by value length descending,
    /// but Aho-Corasick with `LeftmostLongest` handles overlap correctly
    /// regardless of input order.
    pub fn new(manifest: &Manifest, style: RedactionStyle) -> Self {
        let secrets = manifest.entries();
        if secrets.is_empty() {
            return Self {
                automaton: None,
                entries: Vec::new(),
                style,
            };
        }

        // Build patterns from secret *values* (not keys).
        let patterns: Vec<&str> = secrets.iter().map(|e| e.value.as_str()).collect();
        let entries: Vec<RedactorEntry> = secrets
            .iter()
            .map(|e| RedactorEntry {
                key: e.key.clone(),
                value_len: e.value.len(),
            })
            .collect();

        // LeftmostLongest ensures that when one secret value is a substring
        // of another, the longer match wins.
        let automaton = AhoCorasick::builder()
            .match_kind(aho_corasick::MatchKind::LeftmostLongest)
            .build(&patterns)
            .ok(); // If build fails (shouldn't for valid strings), fall back to no-op.

        Self {
            automaton,
            entries,
            style,
        }
    }

    /// Redact all known secret values in the input text.
    ///
    /// Returns the redacted text together with metadata about each
    /// replacement (key name, byte offset, original length).
    pub fn redact(&self, text: &str) -> RedactedText {
        let automaton = match &self.automaton {
            Some(a) => a,
            None => {
                return RedactedText {
                    text: text.to_string(),
                    redactions: Vec::new(),
                }
            }
        };

        let mut result = String::with_capacity(text.len());
        let mut redactions = Vec::new();
        let mut last_end = 0;

        for mat in automaton.find_iter(text) {
            let entry = &self.entries[mat.pattern().as_usize()];

            // Append text before the match.
            result.push_str(&text[last_end..mat.start()]);

            // Append the replacement placeholder.
            let replacement = self.format_replacement(entry);
            result.push_str(&replacement);

            redactions.push(Redaction {
                key: entry.key.clone(),
                offset: mat.start(),
                original_len: entry.value_len,
            });

            last_end = mat.end();
        }

        // Append remaining text after the last match.
        result.push_str(&text[last_end..]);

        RedactedText {
            text: result,
            redactions,
        }
    }

    /// Format the replacement string according to the configured style.
    fn format_replacement(&self, entry: &RedactorEntry) -> String {
        match self.style {
            RedactionStyle::Masked => "*****".to_string(),
            RedactionStyle::Typed => format!("<REDACTED:string:{}>", entry.value_len),
            RedactionStyle::Named => format!("<REDACTED:{}>", entry.key),
            RedactionStyle::Absent => String::new(),
        }
    }

    /// Returns `true` if this redactor has any secrets loaded.
    pub fn has_secrets(&self) -> bool {
        self.automaton.is_some()
    }
}
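The redactor above relies on Aho-Corasick with `LeftmostLongest` so that, when one secret is a prefix of another, the longer value is the one replaced. The std-only sketch below shows the same leftmost-longest behaviour with a deliberately naive scan instead of an automaton (quadratic in the worst case, so only suitable for illustration). `Style` and `redact_naive` are hypothetical names for this example; the placeholder strings are copied from `format_replacement` in the diff.

```rust
/// Placeholder styles, modelled on the four `RedactionStyle` variants in the diff.
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Style {
    Masked,
    Typed,
    Named,
    Absent,
}

/// Hypothetical helper: naive leftmost-longest redaction.
/// `secrets` holds (key, value) pairs; empty values are ignored.
fn redact_naive(text: &str, secrets: &[(&str, &str)], style: Style) -> String {
    let mut out = String::with_capacity(text.len());
    let mut i = 0;
    while i < text.len() {
        let rest = &text[i..];
        // Among the secrets that match at this position, prefer the longest.
        let best = secrets
            .iter()
            .filter(|(_, v)| !v.is_empty() && rest.starts_with(*v))
            .max_by_key(|(_, v)| v.len());
        match best {
            Some((key, value)) => {
                match style {
                    Style::Masked => out.push_str("*****"),
                    Style::Typed => {
                        out.push_str(&format!("<REDACTED:string:{}>", value.len()))
                    }
                    Style::Named => out.push_str(&format!("<REDACTED:{}>", key)),
                    Style::Absent => {}
                }
                i += value.len();
            }
            None => {
                // Copy one whole character so `i` stays on a UTF-8 boundary.
                let ch = rest.chars().next().expect("rest is non-empty");
                out.push(ch);
                i += ch.len_utf8();
            }
        }
    }
    out
}

fn main() {
    let secrets = [("SHORT", "abc"), ("LONG", "abc123")];
    println!("{}", redact_naive("token=abc123 and abc", &secrets, Style::Named));
    println!("{}", redact_naive("token=abc123 and abc", &secrets, Style::Masked));
}
```

Scanning left to right and taking the longest candidate at each position is what keeps `abc123` from being redacted as `abc` followed by a dangling `123`.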
@@ -0,0 +1,250 @@
//! Heuristic secret scanner using [`RegexSet`] for single-pass multi-pattern
//! matching with optional Shannon entropy filtering.
//!
//! The scanner operates purely on text input -- it has no knowledge of redaction,
//! manifests, or file structure. Callers feed it text and receive [`Finding`]s.

use std::borrow::Cow;
use std::ops::Range;

use regex::{Regex, RegexSet};
use thiserror::Error;

use super::config::HeuristicConfig;
use super::patterns::{self, DetectionRule};

// ---------------------------------------------------------------------------
// Errors
// ---------------------------------------------------------------------------

#[derive(Debug, Error)]
pub enum ScannerError {
    #[error("invalid regex pattern: {0}")]
    Regex(#[from] regex::Error),
}

// ---------------------------------------------------------------------------
// Finding types
// ---------------------------------------------------------------------------

/// Confidence level of a finding.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Confidence {
    /// Specific provider pattern matched (e.g. `ghp_`, `AKIA`).
    High,
    /// Generic pattern matched and passed entropy threshold.
    Medium,
    /// Generic pattern matched but entropy was borderline.
    Low,
}

/// A single potential secret detected in the input text.
#[derive(Debug, Clone)]
pub struct Finding {
    /// The matched text substring.
    pub matched_text: String,
    /// Which detection rule triggered this finding.
    pub pattern_id: String,
    /// Human-readable description of the rule.
    pub description: String,
    /// Confidence level.
    pub confidence: Confidence,
    /// Byte range in the input text.
    pub span: Range<usize>,
}

// ---------------------------------------------------------------------------
// Scanner
// ---------------------------------------------------------------------------

/// Pre-compiled multi-pattern secret scanner.
///
/// Holds a [`RegexSet`] for fast "any match?" bulk filtering and parallel
/// individual [`Regex`] instances for extracting match details and spans.
pub struct Scanner {
    /// Pre-compiled set for fast "any match?" check.
    regex_set: RegexSet,
    /// Individual compiled regexes for extracting match details (parallel to `regex_set`).
    regexes: Vec<Regex>,
    /// Rule metadata (parallel to `regex_set`).
    rules: Vec<DetectionRule>,
}

impl std::fmt::Debug for Scanner {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("Scanner")
            .field("rule_count", &self.rules.len())
            .finish()
    }
}

impl Scanner {
    /// Build a scanner from the given heuristic configuration.
    ///
    /// Includes all built-in rules plus any custom patterns from `config.patterns`.
    pub fn new(config: &HeuristicConfig) -> Result<Self, ScannerError> {
        let mut rules: Vec<DetectionRule> = patterns::builtin_rules().to_vec();

        // Append custom patterns from config.
        for (i, pat) in config.patterns.iter().enumerate() {
            rules.push(DetectionRule {
                id: Cow::Owned(format!("custom-{i}")),
                description: Cow::Owned(format!("Custom pattern #{i}")),
                pattern: Cow::Owned(pat.clone()),
                entropy_threshold: None,
            });
        }

        let pattern_strings: Vec<&str> = rules.iter().map(|r| r.pattern.as_ref()).collect();

        let regex_set = RegexSet::new(&pattern_strings)?;
        let regexes = pattern_strings
            .iter()
            .map(|p| Regex::new(p))
            .collect::<Result<Vec<_>, _>>()?;

        Ok(Self {
            regex_set,
            regexes,
            rules,
        })
    }

    /// Build a scanner with only the built-in rules (no custom patterns).
    pub fn builtin() -> Result<Self, ScannerError> {
        Self::new(&HeuristicConfig::default())
    }

    /// Scan `text` for potential secrets.
    ///
    /// Returns findings sorted by byte position with overlapping matches
    /// deduplicated (first match wins).
    pub fn scan(&self, text: &str) -> Vec<Finding> {
        let matches = self.regex_set.matches(text);
        if !matches.matched_any() {
            return Vec::new();
        }

        let mut findings = Vec::new();

        for idx in matches.iter() {
            let rule = &self.rules[idx];
            let regex = &self.regexes[idx];

            for mat in regex.find_iter(text) {
                let matched_text = mat.as_str();

                // Apply entropy threshold when configured.
                if let Some(threshold) = rule.entropy_threshold {
                    if shannon_entropy(matched_text) < threshold {
                        continue;
                    }
                }

                let confidence = if rule.entropy_threshold.is_some() {
                    Confidence::Medium
                } else {
                    Confidence::High
                };

                findings.push(Finding {
                    matched_text: matched_text.to_string(),
                    pattern_id: rule.id.to_string(),
                    description: rule.description.to_string(),
                    confidence,
                    span: mat.start()..mat.end(),
                });
            }
        }

        // Sort by position, then deduplicate overlapping spans.
        findings.sort_by_key(|f| f.span.start);
        dedup_overlapping(&mut findings);
        findings
    }

    /// Returns the number of active rules (built-in + custom).
    pub fn rule_count(&self) -> usize {
        self.rules.len()
    }
}

// ---------------------------------------------------------------------------
// Shannon entropy
// ---------------------------------------------------------------------------

/// Calculate Shannon entropy of `s` in bits per byte (the same as bits per
/// character for ASCII input, since frequencies are counted over bytes).
///
/// Returns 0.0 for empty strings. Maximum entropy for ASCII printable text
/// is ~6.57 bits/char.
pub fn shannon_entropy(s: &str) -> f64 {
    if s.is_empty() {
        return 0.0;
    }
    let mut freq = [0u32; 256];
    let len = s.len() as f64;
    for &b in s.as_bytes() {
        freq[b as usize] += 1;
    }
    freq.iter()
        .filter(|&&c| c > 0)
        .map(|&c| {
            let p = c as f64 / len;
            -p * p.log2()
        })
        .sum()
}

// ---------------------------------------------------------------------------
// Deduplication
// ---------------------------------------------------------------------------

/// Remove findings whose span overlaps with an earlier (higher-priority) finding.
///
/// Input must be sorted by `span.start`. When two findings overlap, the one
/// appearing first (lower start position) is kept.
fn dedup_overlapping(findings: &mut Vec<Finding>) {
    let mut i = 0;
    while i < findings.len() {
        let end = findings[i].span.end;
        let mut j = i + 1;
        while j < findings.len() {
            if findings[j].span.start < end {
                findings.remove(j);
            } else {
                break;
            }
        }
        i += 1;
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn entropy_all_same_chars() {
        // All same characters → 0 entropy.
        assert!((shannon_entropy("aaaaaaaaaa") - 0.0).abs() < f64::EPSILON);
    }

    #[test]
    fn entropy_two_equal_chars() {
        // "ab" repeated → exactly 1.0 bits/char.
        let e = shannon_entropy("abababababababababab");
        assert!((e - 1.0).abs() < 0.01);
    }

    #[test]
    fn entropy_high_randomness() {
        // A string with many distinct characters should have high entropy.
        let s = "aB3$kL9!mZ7@wQ1#";
        assert!(shannon_entropy(s) > 3.5);
    }

    #[test]
    fn entropy_empty_string() {
        assert!((shannon_entropy("") - 0.0).abs() < f64::EPSILON);
    }
}
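`dedup_overlapping` above keeps the earliest finding and drops any later one that starts before the kept finding ends. A possible single-pass variant, sketched here with `Vec::retain` so that no element is shifted more than once, is shown below. `Hit` is a hypothetical stand-in for the scanner's `Finding`, reduced to the fields deduplication needs; the comparison (`start >= last kept end`) is intended to match the original loop, assuming the input is already sorted by `span.start`.

```rust
use std::ops::Range;

/// Hypothetical stand-in for `Finding`, reduced to what deduplication needs.
#[derive(Debug, Clone, PartialEq)]
struct Hit {
    rule: &'static str,
    span: Range<usize>,
}

/// Keep a hit only if it starts at or after the end of the last kept hit.
/// Assumes `hits` is already sorted by `span.start`.
fn dedup_overlapping(hits: &mut Vec<Hit>) {
    let mut last_end = 0usize;
    hits.retain(|h| {
        if h.span.start >= last_end {
            last_end = h.span.end;
            true
        } else {
            false
        }
    });
}

fn main() {
    let mut hits = vec![
        Hit { rule: "c", span: 10..20 },
        Hit { rule: "a", span: 0..10 },
        Hit { rule: "d", span: 15..18 },
        Hit { rule: "b", span: 5..12 },
    ];
    // Stable sort by start position, then drop overlaps.
    hits.sort_by_key(|h| h.span.start);
    dedup_overlapping(&mut hits);
    for h in &hits {
        println!("{} {:?}", h.rule, h.span);
    }
}
```

Because the sort is stable and keyed only on the start offset, two findings that start at the same byte keep their rule order, so the earlier rule wins in both the original loop and this sketch.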
+54
-5
@@ -1,10 +1,11 @@
-//! Claude Code hook adapter (PreToolUse).
+//! Claude Code hook adapter (PreToolUse + PostToolUse).
 //!
-//! Wire format: stdin is one JSON object with `tool_name` and `tool_input`.
+//! Wire format: stdin is one JSON object with `tool_name` and `tool_input`
+//! (and optionally `tool_response` for PostToolUse).
 //! Stdout is `{"hookSpecificOutput": {...}}` with exit code 0; the JSON
-//! carries the verdict.
+//! carries the verdict / updated output.
 
-use super::{AdapterError, HarnessAdapter, PathKind, ToolCall, ToolOp};
+use super::{AdapterError, HarnessAdapter, PathKind, PostToolUsePayload, ToolCall, ToolOp};
 use crate::core::Decision;
 use serde_json::{json, Value};
 use std::path::PathBuf;
@@ -16,6 +17,8 @@ impl HarnessAdapter for ClaudeAdapter {
         "claude"
     }
 
+    // -- PreToolUse --------------------------------------------------------
+
     fn parse_request(&self, input: &[u8]) -> Result<ToolCall, AdapterError> {
         let v: Value = serde_json::from_slice(input)?;
         let tool_name = v
@@ -39,7 +42,11 @@ impl HarnessAdapter for ClaudeAdapter {
         })
     }
 
-    fn render_decision(&self, _call: &ToolCall, decision: &Decision) -> Result<Vec<u8>, AdapterError> {
+    fn render_decision(
+        &self,
+        _call: &ToolCall,
+        decision: &Decision,
+    ) -> Result<Vec<u8>, AdapterError> {
         let (verdict, reason) = match decision {
             Decision::Allow => ("allow", String::new()),
             Decision::Ask(r) => ("ask", r.message.clone()),
@@ -54,6 +61,48 @@ impl HarnessAdapter for ClaudeAdapter {
         });
         Ok(serde_json::to_vec(&out)?)
     }
+
+    // -- PostToolUse -------------------------------------------------------
+
+    fn parse_post_tool_use(&self, input: &[u8]) -> Result<PostToolUsePayload, AdapterError> {
+        let v: Value = serde_json::from_slice(input)?;
+        let tool_name = v
+            .get("tool_name")
+            .and_then(|x| x.as_str())
+            .ok_or_else(|| AdapterError::Parse("missing tool_name".into()))?
+            .to_string();
+        let tool_input = v.get("tool_input").cloned().unwrap_or(Value::Null);
+        let tool_response = v
+            .get("tool_response")
+            .and_then(|x| x.as_str())
+            .unwrap_or("")
+            .to_string();
+        Ok(PostToolUsePayload {
+            tool_name,
+            tool_input,
+            tool_response,
+            raw: v,
+        })
+    }
+
+    fn render_post_tool_use(
+        &self,
+        _payload: &PostToolUsePayload,
+        redacted_output: Option<&str>,
+    ) -> Result<Vec<u8>, AdapterError> {
+        // When there are no changes, return `{}` -- Claude Code interprets
+        // an empty object as "use original output, no modifications".
+        let out = match redacted_output {
+            Some(text) => json!({
+                "hookSpecificOutput": {
+                    "hookEventName": "PostToolUse",
+                    "updatedToolOutput": text,
+                }
+            }),
+            None => json!({}),
+        };
+        Ok(serde_json::to_vec(&out)?)
+    }
 }
 
 fn path_op(tool_input: &Value, kind: PathKind) -> Result<ToolOp, AdapterError> {
+55
-3
@@ -42,16 +42,68 @@ pub enum PathKind {
     Write,
 }
 
-/// Trait implemented by each harness adapter. Adapters parse the harness's
-/// hook stdin payload into `ToolCall` and render a `Decision` back to the
-/// harness's expected stdout format.
+/// A PostToolUse hook payload -- tool already executed, output available for
+/// inspection/redaction.
+#[derive(Debug, Clone)]
+pub struct PostToolUsePayload {
+    /// Harness's tool name (e.g. "Read", "Bash").
+    pub tool_name: String,
+    /// The tool input that was originally provided.
+    pub tool_input: serde_json::Value,
+    /// The tool's output text that may contain secrets.
+    pub tool_response: String,
+    /// Original raw payload.
+    pub raw: serde_json::Value,
+}
+
+/// Hook event type discriminator.
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+pub enum HookEvent {
+    PreToolUse,
+    PostToolUse,
+}
+
+impl HookEvent {
+    /// Parse a hook event name from CLI or payload strings.
+    ///
+    /// Accepts both kebab-case (`pre-tool-use`) and PascalCase (`PreToolUse`).
+    pub fn parse(s: &str) -> Option<Self> {
+        match s {
+            "pre-tool-use" | "PreToolUse" => Some(Self::PreToolUse),
+            "post-tool-use" | "PostToolUse" => Some(Self::PostToolUse),
+            _ => None,
+        }
+    }
+}
+
+/// Trait implemented by each harness adapter.
+///
+/// Adapters handle both PreToolUse (policy gate) and PostToolUse (output
+/// redaction) hook events.
 pub trait HarnessAdapter {
     /// The CLI name (e.g. "claude", "codex", "gemini").
     fn name(&self) -> &'static str;
 
+    // -- PreToolUse --------------------------------------------------------
+
+    /// Parse a PreToolUse hook payload into a normalized `ToolCall`.
     fn parse_request(&self, input: &[u8]) -> Result<ToolCall, AdapterError>;
 
+    /// Render a policy `Decision` back to the harness's PreToolUse wire format.
     fn render_decision(&self, call: &ToolCall, decision: &Decision) -> Result<Vec<u8>, AdapterError>;
+
+    // -- PostToolUse -------------------------------------------------------
+
+    /// Parse a PostToolUse hook payload (tool name, input, response).
+    fn parse_post_tool_use(&self, input: &[u8]) -> Result<PostToolUsePayload, AdapterError>;
+
+    /// Render a PostToolUse response. `redacted_output` is the (possibly
+    /// modified) tool output to send back to the harness.
+    fn render_post_tool_use(
+        &self,
+        payload: &PostToolUsePayload,
+        redacted_output: Option<&str>,
+    ) -> Result<Vec<u8>, AdapterError>;
 }
 
 #[cfg(feature = "harness-claude")]
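`HookEvent::parse` in the hunk above returns `Option<Self>`, accepting both kebab-case and PascalCase names. As one possible alternative, not something this commit does, the same mapping could be exposed through the standard `FromStr` trait so that callers can write `s.parse::<HookEvent>()`. The sketch below redeclares a local copy of the enum, and `UnknownEvent` is a hypothetical error type invented for the example.

```rust
use std::str::FromStr;

/// Local copy of the enum from the diff, for a self-contained example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HookEvent {
    PreToolUse,
    PostToolUse,
}

/// Hypothetical error type carrying the unrecognised event name.
#[derive(Debug, PartialEq, Eq)]
struct UnknownEvent(String);

impl FromStr for HookEvent {
    type Err = UnknownEvent;

    /// Accepts the same four spellings as `HookEvent::parse` in the diff.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "pre-tool-use" | "PreToolUse" => Ok(Self::PreToolUse),
            "post-tool-use" | "PostToolUse" => Ok(Self::PostToolUse),
            other => Err(UnknownEvent(other.to_string())),
        }
    }
}

fn main() {
    let event: HookEvent = "post-tool-use".parse().unwrap();
    println!("{event:?}");
    println!("{:?}", "nonsense".parse::<HookEvent>());
}
```

Returning the rejected name in the error is a design choice for this sketch; it would let a CLI report which `--event` value it refused.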
@@ -0,0 +1,298 @@
use assert_cmd::Command;
use std::fs;

/// Helper: create a temp project directory with a `.botsecrets` config and
/// a `.env` file containing the given secrets.
fn setup_project(
    env_content: &str,
    botsecrets_content: Option<&str>,
) -> tempfile::TempDir {
    let tmp = tempfile::tempdir().unwrap();

    // .env with test secrets
    fs::write(tmp.path().join(".env"), env_content).unwrap();

    // .botsecrets config (use default if not specified)
    let botsecrets = botsecrets_content.unwrap_or(
        r#"
[files]
patterns = [".env"]
"#,
    );
    fs::write(tmp.path().join(".botsecrets"), botsecrets).unwrap();

    // .botignore (empty -- required for project root detection)
    fs::write(tmp.path().join(".botignore"), "").unwrap();

    tmp
}

#[test]
fn post_tool_use_redacts_known_secret() {
    let tmp = setup_project("DB_PASSWORD=supersecret123\n", None);

    let payload = serde_json::json!({
        "tool_name": "Read",
        "tool_input": { "file_path": "/some/file.txt" },
        "tool_response": "DB_HOST=localhost\nDB_PASSWORD=supersecret123\nDB_PORT=5432"
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--event", "post-tool-use", "--harness", "claude"])
        .current_dir(tmp.path())
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    let updated = v["hookSpecificOutput"]["updatedToolOutput"]
        .as_str()
        .expect("expected updatedToolOutput");

    assert!(
        updated.contains("*****"),
        "expected masked secret, got: {updated}"
    );
    assert!(
        !updated.contains("supersecret123"),
        "secret should be redacted, got: {updated}"
    );
    assert!(
        updated.contains("DB_HOST=localhost"),
        "non-secret lines should be preserved, got: {updated}"
    );
    assert!(
        updated.contains("DB_PORT=5432"),
        "non-secret lines should be preserved, got: {updated}"
    );
}

#[test]
fn post_tool_use_no_secrets_passthrough() {
    let tmp = setup_project("DB_PASSWORD=supersecret123\n", None);

    let payload = serde_json::json!({
        "tool_name": "Read",
        "tool_input": { "file_path": "/some/file.txt" },
        "tool_response": "Hello, world! This text has no secrets."
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--event", "post-tool-use", "--harness", "claude"])
        .current_dir(tmp.path())
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    // Empty JSON object means "no changes".
    assert_eq!(v, serde_json::json!({}), "expected empty JSON for passthrough");
}

#[test]
fn post_tool_use_empty_response_passthrough() {
    let tmp = setup_project("DB_PASSWORD=supersecret123\n", None);

    let payload = serde_json::json!({
        "tool_name": "Read",
        "tool_input": { "file_path": "/some/file.txt" },
        "tool_response": ""
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--event", "post-tool-use", "--harness", "claude"])
        .current_dir(tmp.path())
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    assert_eq!(v, serde_json::json!({}));
}

#[test]
fn post_tool_use_heuristic_enforce_appends_warning() {
    // Use a config with heuristic in enforce mode (the default).
    let botsecrets = r#"
[files]
patterns = [".env"]

[heuristic]
enabled = true
mode = "enforce"
"#;
    let tmp = setup_project("UNRELATED_KEY=foo\n", Some(botsecrets));

    // Include something that looks like a GitHub PAT (classic) in the response.
    // Pattern requires `ghp_` followed by exactly 36 alphanumeric chars.
    let payload = serde_json::json!({
        "tool_name": "Bash",
        "tool_input": { "command": "cat output.log" },
        "tool_response": "deploy log: token ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij used"
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--event", "post-tool-use", "--harness", "claude"])
        .current_dir(tmp.path())
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    let updated = v["hookSpecificOutput"]["updatedToolOutput"]
        .as_str()
        .expect("expected updatedToolOutput with heuristic warning");
    assert!(
        updated.contains("[fermata] WARNING"),
        "expected heuristic warning, got: {updated}"
    );
}

#[test]
fn pre_tool_use_backward_compat_default_event() {
    // `--event` defaults to pre-tool-use; existing `--harness claude` still works.
    let tmp = tempfile::tempdir().unwrap();
    fs::write(tmp.path().join(".botignore"), ".env\n").unwrap();
    let target = tmp.path().join(".env");
    fs::write(&target, "").unwrap();

    let payload = serde_json::json!({
        "tool_name": "Read",
        "tool_input": { "file_path": target.to_str().unwrap() }
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--harness", "claude"])
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    assert_eq!(v["hookSpecificOutput"]["permissionDecision"], "deny");
}

#[test]
fn pre_tool_use_explicit_event_flag() {
    // Explicitly passing `--event pre-tool-use` works identically.
    let tmp = tempfile::tempdir().unwrap();
    fs::write(tmp.path().join(".botignore"), ".env\n").unwrap();
    let target = tmp.path().join("safe.txt");
    fs::write(&target, "").unwrap();

    let payload = serde_json::json!({
        "tool_name": "Read",
        "tool_input": { "file_path": target.to_str().unwrap() }
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--event", "pre-tool-use", "--harness", "claude"])
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    assert_eq!(v["hookSpecificOutput"]["permissionDecision"], "allow");
}

|
#[test]
|
||||||
|
fn unknown_event_exits_2() {
|
||||||
|
Command::cargo_bin("fermata")
|
||||||
|
.unwrap()
|
||||||
|
.args(["hook", "--event", "nonsense", "--harness", "claude"])
|
||||||
|
.write_stdin("{}")
|
||||||
|
.assert()
|
||||||
|
.code(2);
|
||||||
|
}

#[test]
fn post_tool_use_no_project_root_passthrough() {
    // When run in a directory with no .botignore / .botsecrets,
    // PostToolUse should fail-open with `{}`.
    let tmp = tempfile::tempdir().unwrap();

    let payload = serde_json::json!({
        "tool_name": "Read",
        "tool_input": { "file_path": "/some/file.txt" },
        "tool_response": "DB_PASSWORD=supersecret123"
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--event", "post-tool-use", "--harness", "claude"])
        .current_dir(tmp.path())
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    assert_eq!(v, serde_json::json!({}));
}

#[test]
fn post_tool_use_multiple_secrets_redacted() {
    let tmp = setup_project(
        "DB_PASSWORD=supersecret123\nAPI_KEY=my-api-key-abc\n",
        None,
    );

    let payload = serde_json::json!({
        "tool_name": "Read",
        "tool_input": { "file_path": "/some/config" },
        "tool_response": "config: password=supersecret123, key=my-api-key-abc, host=localhost"
    })
    .to_string();

    let out = Command::cargo_bin("fermata")
        .unwrap()
        .args(["hook", "--event", "post-tool-use", "--harness", "claude"])
        .current_dir(tmp.path())
        .write_stdin(payload)
        .assert()
        .success()
        .get_output()
        .stdout
        .clone();

    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    let updated = v["hookSpecificOutput"]["updatedToolOutput"]
        .as_str()
        .expect("expected updatedToolOutput");

    assert!(!updated.contains("supersecret123"), "first secret should be redacted");
    assert!(!updated.contains("my-api-key-abc"), "second secret should be redacted");
    assert!(updated.contains("host=localhost"), "non-secret should be preserved");
}
@@ -0,0 +1,388 @@
use dirigent_fermata::core::secrets::config::{
    EnforcementMode, HeuristicMode, ParseErrorAction, RedactionStyle, SecretsConfig,
    BUILTIN_KEY_PATTERNS,
};

#[test]
fn parse_minimal_files_only() {
    let cfg = SecretsConfig::from_toml(
        r#"
[files]
patterns = [".env", ".env.*"]
"#,
    )
    .unwrap();
    assert_eq!(cfg.files.patterns, vec![".env", ".env.*"]);
    // Other sections use defaults
    assert_eq!(cfg.redaction.style, RedactionStyle::Masked);
    assert_eq!(cfg.enforcement.mode, EnforcementMode::Permissive);
}

#[test]
fn parse_full_config() {
    let cfg = SecretsConfig::from_toml(
        r#"
[files]
patterns = [".env", "secrets.*"]

[keys]
include = ["STRIPE_*", "TWILIO_*"]
exclude = ["PUBLIC_KEY", "SSH_KEY_PATH"]

[redaction]
style = "typed"

[heuristic]
enabled = false
mode = "report"
patterns = ['AKIA[A-Z2-7]{16}']

[enforcement]
mode = "strict"
on_parse_error = "deny"

[[file]]
path = "settings.py"
format = "python-assignments"
keys = ["SECRET_KEY", "DATABASES.*.PASSWORD"]
"#,
    )
    .unwrap();

    assert_eq!(cfg.files.patterns, vec![".env", "secrets.*"]);
    assert_eq!(cfg.keys.include, vec!["STRIPE_*", "TWILIO_*"]);
    assert_eq!(cfg.keys.exclude, vec!["PUBLIC_KEY", "SSH_KEY_PATH"]);
    assert_eq!(cfg.redaction.style, RedactionStyle::Typed);
    assert!(!cfg.heuristic.enabled);
    assert_eq!(cfg.heuristic.mode, HeuristicMode::Report);
    assert_eq!(cfg.heuristic.patterns, vec!["AKIA[A-Z2-7]{16}"]);
    assert_eq!(cfg.enforcement.mode, EnforcementMode::Strict);
    assert_eq!(cfg.enforcement.on_parse_error, ParseErrorAction::Deny);
    assert_eq!(cfg.file_overrides.len(), 1);
    assert_eq!(cfg.file_overrides[0].path, "settings.py");
    assert_eq!(
        cfg.file_overrides[0].format.as_deref(),
        Some("python-assignments")
    );
    assert_eq!(
        cfg.file_overrides[0].keys,
        vec!["SECRET_KEY", "DATABASES.*.PASSWORD"]
    );
}

#[test]
fn empty_toml_returns_defaults() {
    let cfg = SecretsConfig::from_toml("").unwrap();
    assert!(!cfg.files.patterns.is_empty());
    assert!(cfg.files.patterns.contains(&".env".to_string()));
    assert_eq!(cfg.redaction.style, RedactionStyle::Masked);
    assert!(cfg.heuristic.enabled);
    assert_eq!(cfg.heuristic.mode, HeuristicMode::Enforce);
    assert_eq!(cfg.enforcement.mode, EnforcementMode::Permissive);
    assert_eq!(
        cfg.enforcement.on_parse_error,
        ParseErrorAction::MaskEntireFile
    );
    assert!(cfg.file_overrides.is_empty());
}

#[test]
fn invalid_toml_produces_error() {
    let result = SecretsConfig::from_toml("this is not valid {{ toml");
    assert!(result.is_err());
    let err_msg = result.unwrap_err().to_string();
    assert!(
        err_msg.contains("expected"),
        "error should describe parse issue: {err_msg}"
    );
}

#[test]
fn effective_key_includes_has_builtins() {
    let cfg = SecretsConfig::default();
    let effective = cfg.effective_key_includes();
    for builtin in BUILTIN_KEY_PATTERNS {
        assert!(
            effective.contains(&builtin.to_string()),
            "missing builtin: {builtin}"
        );
    }
}

#[test]
fn effective_key_includes_adds_user_patterns() {
    let cfg = SecretsConfig::from_toml(
        r#"
[keys]
include = ["MY_CUSTOM_SECRET_*"]
"#,
    )
    .unwrap();
    let effective = cfg.effective_key_includes();
    assert!(effective.contains(&"MY_CUSTOM_SECRET_*".to_string()));
    // Builtins still present
    assert!(effective.contains(&"*PASSWORD*".to_string()));
}

#[test]
fn effective_key_includes_removes_excluded() {
    let cfg = SecretsConfig::from_toml(
        r#"
[keys]
exclude = ["*TOKEN*", "SENTRY_DSN"]
"#,
    )
    .unwrap();
    let effective = cfg.effective_key_includes();
    assert!(
        !effective.contains(&"*TOKEN*".to_string()),
        "excluded pattern should be removed"
    );
    assert!(
        !effective.contains(&"SENTRY_DSN".to_string()),
        "excluded pattern should be removed"
    );
    // Other builtins still present
    assert!(effective.contains(&"*PASSWORD*".to_string()));
}

#[test]
fn key_matches_glob_case_insensitive() {
    let cfg = SecretsConfig::default();
    assert!(cfg.key_matches("DATABASE_URL"));
    assert!(cfg.key_matches("database_url"));
    assert!(cfg.key_matches("my_password_here"));
    assert!(cfg.key_matches("MY_PASSWORD_HERE"));
    assert!(cfg.key_matches("STRIPE_SECRET_KEY"));
    assert!(cfg.key_matches("AWS_ACCESS_KEY_ID"));
}
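
Review note: a minimal sketch of the case-insensitive glob matching this test exercises. It is an assumption-laden illustration, not the crate's implementation: `glob_match_ci` is a hypothetical name, only the `*` wildcard is handled, and a hand-rolled matcher stands in for whatever `key_matches` uses internally (the package lists `globset` as a dependency).

```rust
/// Hypothetical helper, not part of fermata: returns true when `key` matches
/// `pattern`, where `*` matches any run of characters (including none) and
/// the comparison ignores case.
fn glob_match_ci(pattern: &str, key: &str) -> bool {
    let p: Vec<char> = pattern.to_lowercase().chars().collect();
    let k: Vec<char> = key.to_lowercase().chars().collect();

    let (mut pi, mut ki) = (0usize, 0usize);
    // Position of the most recent `*` in the pattern, and the key position
    // to resume from when backtracking to that star.
    let mut star: Option<usize> = None;
    let mut mark = 0usize;

    while ki < k.len() {
        if pi < p.len() && p[pi] == '*' {
            // Record the star and first try matching it against nothing.
            star = Some(pi);
            mark = ki;
            pi += 1;
        } else if pi < p.len() && p[pi] == k[ki] {
            pi += 1;
            ki += 1;
        } else if let Some(s) = star {
            // Mismatch: let the last star absorb one more character.
            pi = s + 1;
            mark += 1;
            ki = mark;
        } else {
            return false;
        }
    }

    // Any trailing stars in the pattern match the empty string.
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}
```

Lowercasing both sides up front keeps the matcher itself case-agnostic, which is the property the assertions above check with `DATABASE_URL` and `database_url`.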

#[test]
fn key_matches_non_secret_keys() {
    let cfg = SecretsConfig::default();
    assert!(!cfg.key_matches("DEBUG"));
    assert!(!cfg.key_matches("LOG_LEVEL"));
    assert!(!cfg.key_matches("PORT"));
    assert!(!cfg.key_matches("HOST"));
}

#[test]
fn key_matches_respects_user_include() {
    let cfg = SecretsConfig::from_toml(
        r#"
[keys]
include = ["MY_APP_*"]
"#,
    )
    .unwrap();
    assert!(cfg.key_matches("MY_APP_SETTING"));
    assert!(cfg.key_matches("my_app_setting"));
}

#[test]
fn key_matches_respects_user_exclude() {
    let cfg = SecretsConfig::from_toml(
        r#"
[keys]
exclude = ["*TOKEN*"]
"#,
    )
    .unwrap();
    // Excluding `*TOKEN*` removes that glob from the effective patterns. A key
    // such as GITHUB_TOKEN might still match through a literal entry, so assert
    // on a key that only `*TOKEN*` could have matched.
    assert!(!cfg.key_matches("MY_TOKEN"));
    // PASSWORD still matches
    assert!(cfg.key_matches("MY_PASSWORD"));
}

#[test]
fn builtin_file_patterns_present() {
    let cfg = SecretsConfig::default();
    let patterns = &cfg.files.patterns;
    assert!(patterns.contains(&".env".to_string()));
    assert!(patterns.contains(&"*.pem".to_string()));
    assert!(patterns.contains(&".aws/credentials".to_string()));
    assert!(patterns.contains(&"terraform.tfvars".to_string()));
}

#[test]
fn load_missing_files_returns_defaults() {
    let tmp = tempfile::tempdir().unwrap();
    let cfg = SecretsConfig::load(tmp.path()).unwrap();
    assert_eq!(cfg.files.patterns, SecretsConfig::default().files.patterns);
    assert_eq!(cfg.redaction.style, RedactionStyle::Masked);
}

#[test]
fn load_project_botsecrets() {
    let tmp = tempfile::tempdir().unwrap();
    std::fs::write(
        tmp.path().join(".botsecrets"),
        r#"
[redaction]
style = "named"

[keys]
include = ["CUSTOM_*"]
"#,
    )
    .unwrap();

    let cfg = SecretsConfig::load(tmp.path()).unwrap();
    assert_eq!(cfg.redaction.style, RedactionStyle::Named);
    assert!(cfg.effective_key_includes().contains(&"CUSTOM_*".to_string()));
    // File patterns remain at defaults (not overridden)
    assert!(cfg.files.patterns.contains(&".env".to_string()));
}

#[test]
fn load_local_overrides_project() {
    let tmp = tempfile::tempdir().unwrap();
    std::fs::write(
        tmp.path().join(".botsecrets"),
        r#"
[redaction]
style = "named"
[enforcement]
mode = "strict"
"#,
    )
    .unwrap();
    std::fs::write(
        tmp.path().join(".botsecrets.local"),
        r#"
[redaction]
style = "absent"
"#,
    )
    .unwrap();

    let cfg = SecretsConfig::load(tmp.path()).unwrap();
    // .local overrides .botsecrets for redaction style
    assert_eq!(cfg.redaction.style, RedactionStyle::Absent);
    // enforcement from .botsecrets is preserved (not in .local)
    assert_eq!(cfg.enforcement.mode, EnforcementMode::Strict);
}

#[test]
fn load_invalid_botsecrets_returns_error() {
    let tmp = tempfile::tempdir().unwrap();
    std::fs::write(tmp.path().join(".botsecrets"), "invalid {{ toml").unwrap();
    let result = SecretsConfig::load(tmp.path());
    assert!(result.is_err());
    let err = result.unwrap_err().to_string();
    assert!(err.contains(".botsecrets"), "error should mention file: {err}");
}

#[test]
fn merge_keys_accumulate() {
    let tmp = tempfile::tempdir().unwrap();
    std::fs::write(
        tmp.path().join(".botsecrets"),
        r#"
[keys]
include = ["FROM_PROJECT"]
exclude = ["EXCLUDE_PROJECT"]
"#,
    )
    .unwrap();
    std::fs::write(
        tmp.path().join(".botsecrets.local"),
        r#"
[keys]
include = ["FROM_LOCAL"]
exclude = ["EXCLUDE_LOCAL"]
"#,
    )
    .unwrap();

    let cfg = SecretsConfig::load(tmp.path()).unwrap();
    assert!(cfg.keys.include.contains(&"FROM_PROJECT".to_string()));
    assert!(cfg.keys.include.contains(&"FROM_LOCAL".to_string()));
    assert!(cfg.keys.exclude.contains(&"EXCLUDE_PROJECT".to_string()));
    assert!(cfg.keys.exclude.contains(&"EXCLUDE_LOCAL".to_string()));
}

#[test]
fn merge_file_patterns_replaced_not_appended() {
    let tmp = tempfile::tempdir().unwrap();
    std::fs::write(
        tmp.path().join(".botsecrets"),
        r#"
[files]
patterns = ["only-this.env"]
"#,
    )
    .unwrap();

    let cfg = SecretsConfig::load(tmp.path()).unwrap();
    assert_eq!(cfg.files.patterns, vec!["only-this.env"]);
    // Defaults should be gone, replaced by the project's list
    assert!(!cfg.files.patterns.contains(&".env".to_string()));
}

#[test]
fn all_redaction_styles_parse() {
    for (input, expected) in [
        ("masked", RedactionStyle::Masked),
        ("typed", RedactionStyle::Typed),
        ("named", RedactionStyle::Named),
        ("absent", RedactionStyle::Absent),
    ] {
        let toml_str = format!("[redaction]\nstyle = \"{input}\"");
        let cfg = SecretsConfig::from_toml(&toml_str).unwrap();
        assert_eq!(cfg.redaction.style, expected, "failed for: {input}");
    }
}

#[test]
fn all_enforcement_modes_parse() {
    for (input, expected) in [
        ("strict", EnforcementMode::Strict),
        ("permissive", EnforcementMode::Permissive),
        ("audit", EnforcementMode::Audit),
    ] {
        let toml_str = format!("[enforcement]\nmode = \"{input}\"");
        let cfg = SecretsConfig::from_toml(&toml_str).unwrap();
        assert_eq!(cfg.enforcement.mode, expected, "failed for: {input}");
    }
}

#[test]
fn all_heuristic_modes_parse() {
    for (input, expected) in [
        ("enforce", HeuristicMode::Enforce),
        ("report", HeuristicMode::Report),
        ("disabled", HeuristicMode::Disabled),
    ] {
        let toml_str = format!("[heuristic]\nmode = \"{input}\"");
        let cfg = SecretsConfig::from_toml(&toml_str).unwrap();
        assert_eq!(cfg.heuristic.mode, expected, "failed for: {input}");
    }
}

#[test]
fn serialization_roundtrip() {
    let cfg = SecretsConfig::from_toml(
        r#"
[files]
patterns = [".env"]
[redaction]
style = "typed"
[enforcement]
mode = "audit"
on_parse_error = "allow"
"#,
    )
    .unwrap();

    let serialized = toml::to_string(&cfg).unwrap();
    let deserialized: SecretsConfig = toml::from_str(&serialized).unwrap();
    assert_eq!(deserialized.redaction.style, RedactionStyle::Typed);
    assert_eq!(deserialized.enforcement.mode, EnforcementMode::Audit);
    assert_eq!(
        deserialized.enforcement.on_parse_error,
        ParseErrorAction::Allow
    );
}
@@ -0,0 +1,307 @@
//! Integration tests for `secrets::manifest`, the manifest loader that
//! discovers secret files, parses them, and builds the known-secrets set.

use std::fs;

use dirigent_fermata::core::secrets::config::SecretsConfig;
use dirigent_fermata::core::secrets::manifest::Manifest;

// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------

/// Return the default config: the built-in file patterns (such as `.env` and
/// `*.pem`) and the built-in secret key patterns.
fn default_config() -> SecretsConfig {
    SecretsConfig::default()
}

/// Create a config from TOML.
fn config_from_toml(toml: &str) -> SecretsConfig {
    SecretsConfig::from_toml(toml).expect("valid TOML config")
}

// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------

#[test]
fn discovers_env_file_and_extracts_matching_secrets() {
    let dir = tempfile::tempdir().unwrap();
    fs::write(
        dir.path().join(".env"),
        "DATABASE_URL=postgres://localhost/db\nAPP_NAME=myapp\nSECRET_KEY=super-secret-value-1234\n",
    )
    .unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    // DATABASE_URL and SECRET_KEY match the default key patterns; APP_NAME does not.
    assert!(!manifest.is_empty());

    let keys: Vec<&str> = manifest.entries().iter().map(|e| e.key.as_str()).collect();
    assert!(keys.contains(&"DATABASE_URL"), "expected DATABASE_URL, got {keys:?}");
    assert!(keys.contains(&"SECRET_KEY"), "expected SECRET_KEY, got {keys:?}");
    assert!(!keys.contains(&"APP_NAME"), "APP_NAME should be filtered out");
}

#[test]
fn discovers_nested_env_local_file() {
    let dir = tempfile::tempdir().unwrap();
    let nested = dir.path().join("services").join("auth");
    fs::create_dir_all(&nested).unwrap();
    fs::write(
        nested.join(".env.local"),
        "AUTH_TOKEN=tok_abcdefgh12345678\n",
    )
    .unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    assert!(!manifest.is_empty());
    let keys: Vec<&str> = manifest.entries().iter().map(|e| e.key.as_str()).collect();
    assert!(keys.contains(&"AUTH_TOKEN"), "expected AUTH_TOKEN, got {keys:?}");
}

#[test]
fn filters_entries_by_key_patterns() {
    let dir = tempfile::tempdir().unwrap();
    fs::write(
        dir.path().join(".env"),
        "MY_PASSWORD=hunter2hunter2\nNOT_SENSITIVE=hello-world-1234\nAPI_KEY=abcdef1234567890\n",
    )
    .unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    let keys: Vec<&str> = manifest.entries().iter().map(|e| e.key.as_str()).collect();
    assert!(keys.contains(&"MY_PASSWORD"));
    assert!(keys.contains(&"API_KEY"));
    assert!(!keys.contains(&"NOT_SENSITIVE"));
}

#[test]
fn file_override_with_explicit_format_and_key_filter() {
    let dir = tempfile::tempdir().unwrap();
    // Write a file that wouldn't normally be discovered by default patterns.
    fs::write(
        dir.path().join("custom_secrets.conf"),
        "SERVICE_TOKEN=long-token-value-here\nDEBUG=true-ish-thing\n",
    )
    .unwrap();

    let config = config_from_toml(
        r#"
[files]
patterns = []

[[file]]
path = "custom_secrets.conf"
format = "env"
keys = ["SERVICE_TOKEN"]
"#,
    );

    let manifest = Manifest::build(&config, dir.path()).unwrap();

    assert_eq!(manifest.len(), 1);
    assert_eq!(manifest.entries()[0].key, "SERVICE_TOKEN");
    assert_eq!(manifest.entries()[0].value, "long-token-value-here");
}

#[test]
fn empty_project_yields_empty_manifest() {
    let dir = tempfile::tempdir().unwrap();
    // No files at all.
    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();
    assert!(manifest.is_empty());
    assert_eq!(manifest.len(), 0);
}

#[test]
fn entries_sorted_by_value_length_descending() {
    let dir = tempfile::tempdir().unwrap();
    fs::write(
        dir.path().join(".env"),
        // Deliberately out of order by length.
        "TOKEN_A=short1234\nTOKEN_B=a-much-longer-secret-value-here\nTOKEN_C=medium-value1\n",
    )
    .unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    let lengths: Vec<usize> = manifest.entries().iter().map(|e| e.value.len()).collect();
    for window in lengths.windows(2) {
        assert!(
            window[0] >= window[1],
            "entries not sorted by value length descending: {lengths:?}"
        );
    }
}
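
Review note: one plausible reason for the longest-first ordering asserted above is that a short secret can be a prefix of a longer one, and replacing the short one first would leave the tail of the longer secret in the output. The sketch below illustrates that with naive `str::replace`; this is a swapped-in technique, since the commit describes the real redactor as Aho-Corasick based. `redact_longest_first` and the `[REDACTED]` mask are hypothetical names chosen for the example.

```rust
/// Hypothetical helper, not part of fermata: masks every occurrence of each
/// known secret, processing longer secrets before shorter ones.
fn redact_longest_first(text: &str, secrets: &[&str]) -> String {
    // Skip empty values: replacing "" would insert the mask between every char.
    let mut ordered: Vec<&str> = secrets.iter().copied().filter(|s| !s.is_empty()).collect();
    // Longest values first, mirroring the manifest ordering the test checks.
    ordered.sort_by(|a, b| b.len().cmp(&a.len()));

    let mut out = text.to_string();
    for secret in ordered {
        out = out.replace(secret, "[REDACTED]");
    }
    out
}
```

With the secrets `abc123` and `abc123-extended`, processing the shorter value first would turn `token=abc123-extended` into `token=[REDACTED]-extended`, leaking the suffix; longest-first avoids that. The sketch does not guard against a secret that happens to match part of the mask text itself.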

#[test]
fn short_values_filtered_out() {
    let dir = tempfile::tempdir().unwrap();
    fs::write(
        dir.path().join(".env"),
        "PASSWORD_TINY=yes\nPASSWORD_OK=long-enough-password\n",
    )
    .unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    let keys: Vec<&str> = manifest.entries().iter().map(|e| e.key.as_str()).collect();
    // "yes" is 3 chars, below the 4-char minimum.
    assert!(!keys.contains(&"PASSWORD_TINY"), "short value should be filtered");
    assert!(keys.contains(&"PASSWORD_OK"));
}

#[test]
fn deduplication_of_same_key_value() {
    let dir = tempfile::tempdir().unwrap();

    // Same secret appears in two different .env files.
    fs::write(
        dir.path().join(".env"),
        "SECRET_KEY=shared-secret-value-12345\n",
    )
    .unwrap();

    let sub = dir.path().join("sub");
    fs::create_dir(&sub).unwrap();
    fs::write(sub.join(".env"), "SECRET_KEY=shared-secret-value-12345\n").unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    // Should be deduplicated to a single entry.
    let matching: Vec<_> = manifest
        .entries()
        .iter()
        .filter(|e| e.key == "SECRET_KEY")
        .collect();
    assert_eq!(
        matching.len(),
        1,
        "duplicate entries should be collapsed: found {}",
        matching.len()
    );
}

#[test]
fn unparseable_file_with_allow_is_skipped() {
    let dir = tempfile::tempdir().unwrap();
    // Use a `.toml` file with invalid TOML content to trigger a parse error;
    // the .env parser is too lenient to fail on garbage input.
    let secrets_dir = dir.path();
    fs::write(secrets_dir.join("secrets.toml"), "this is not valid toml {{{\n").unwrap();

    // Also write a valid .env so we can confirm it still works.
    fs::write(
        secrets_dir.join(".env"),
        "API_KEY=valid-secret-12345678\n",
    )
    .unwrap();

    let config = config_from_toml(
        r#"
[enforcement]
on_parse_error = "allow"
"#,
    );

    let manifest = Manifest::build(&config, secrets_dir).unwrap();

    // The broken secrets.toml is skipped; .env is still processed.
    let keys: Vec<&str> = manifest.entries().iter().map(|e| e.key.as_str()).collect();
    assert!(keys.contains(&"API_KEY"));
}

#[test]
fn unparseable_file_with_deny_returns_error() {
    let dir = tempfile::tempdir().unwrap();
    fs::write(dir.path().join("secrets.toml"), "not valid toml {{{\n").unwrap();

    let config = config_from_toml(
        r#"
[enforcement]
on_parse_error = "deny"
"#,
    );

    let result = Manifest::build(&config, dir.path());
    assert!(result.is_err(), "deny mode should propagate parse errors");
}

#[test]
fn manifest_empty_and_is_empty() {
    let m = Manifest::empty();
    assert!(m.is_empty());
    assert_eq!(m.len(), 0);
    assert!(m.entries().is_empty());
}

#[test]
fn skips_git_and_node_modules_directories() {
    let dir = tempfile::tempdir().unwrap();

    // .env inside .git should be skipped.
    let git_dir = dir.path().join(".git");
    fs::create_dir(&git_dir).unwrap();
    fs::write(git_dir.join(".env"), "SECRET_KEY=git-secret-12345\n").unwrap();

    // .env inside node_modules should be skipped.
    let nm_dir = dir.path().join("node_modules").join("pkg");
    fs::create_dir_all(&nm_dir).unwrap();
    fs::write(nm_dir.join(".env"), "TOKEN=nm-token-12345678\n").unwrap();

    // .env at root should be found.
    fs::write(
        dir.path().join(".env"),
        "API_KEY=root-api-key-12345\n",
    )
    .unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    let values: Vec<&str> = manifest.entries().iter().map(|e| e.value.as_str()).collect();
    assert!(
        values.contains(&"root-api-key-12345"),
        "root .env should be found"
    );
    assert!(
        !values.contains(&"git-secret-12345"),
        ".git/.env should be skipped"
    );
    assert!(
        !values.contains(&"nm-token-12345678"),
        "node_modules/.env should be skipped"
    );
}

#[test]
fn opaque_file_formats_are_skipped_gracefully() {
    let dir = tempfile::tempdir().unwrap();
    // .pem and .key files match default patterns but have no parseable format.
    fs::write(dir.path().join("server.key"), "binary-ish key data here\n").unwrap();
    fs::write(
        dir.path().join(".env"),
        "PASSWORD=parseable-secret-12345\n",
    )
    .unwrap();

    let config = default_config();
    let manifest = Manifest::build(&config, dir.path()).unwrap();

    // Should not error, should still find the .env entry.
    let keys: Vec<&str> = manifest.entries().iter().map(|e| e.key.as_str()).collect();
    assert!(keys.contains(&"PASSWORD"));
}
@@ -0,0 +1,404 @@
//! Integration tests for the multi-format secret file parser.

use dirigent_fermata::core::secrets::parser::{
    parse_content, parse_secret_file, FileFormat, SecretEntry,
};
use std::path::Path;
use tempfile::NamedTempFile;

fn p(s: &str) -> &Path {
    Path::new(s)
}

// ---------------------------------------------------------------------------
// .env parsing
// ---------------------------------------------------------------------------

#[test]
fn env_basic_key_value() {
    let entries = parse_content("DATABASE_URL=postgres://localhost/db", FileFormat::Env, p(".env")).unwrap();
    assert_eq!(entries.len(), 1);
    assert_eq!(entries[0].key, "DATABASE_URL");
    assert_eq!(entries[0].value, "postgres://localhost/db");
}

#[test]
fn env_double_quoted() {
    let entries = parse_content(r#"SECRET="hello world""#, FileFormat::Env, p(".env")).unwrap();
    assert_eq!(entries[0].value, "hello world");
}

#[test]
fn env_single_quoted() {
    let entries = parse_content("SECRET='hello world'", FileFormat::Env, p(".env")).unwrap();
    assert_eq!(entries[0].value, "hello world");
}

#[test]
fn env_comments_and_empty_lines() {
    let content = "# comment\n\nKEY=value\n # indented comment\n";
    let entries = parse_content(content, FileFormat::Env, p(".env")).unwrap();
    assert_eq!(entries.len(), 1);
    assert_eq!(entries[0].key, "KEY");
}

#[test]
fn env_export_prefix() {
    let content = "export API_KEY=abc123\nexport TOKEN=\"xyz\"";
    let entries = parse_content(content, FileFormat::Env, p(".env")).unwrap();
    assert_eq!(entries.len(), 2);
    assert_eq!(entries[0].key, "API_KEY");
    assert_eq!(entries[0].value, "abc123");
    assert_eq!(entries[1].key, "TOKEN");
    assert_eq!(entries[1].value, "xyz");
}

#[test]
fn env_whitespace_handling() {
    let content = " KEY = value \nKEY2= spaced ";
    let entries = parse_content(content, FileFormat::Env, p(".env")).unwrap();
    // Key is trimmed; unquoted value trimmed.
    assert_eq!(entries[0].key, "KEY");
    assert_eq!(entries[0].value, "value");
    assert_eq!(entries[1].key, "KEY2");
    assert_eq!(entries[1].value, "spaced");
}

#[test]
fn env_escape_sequences_in_double_quotes() {
    let content = r#"MSG="line1\nline2""#;
    let entries = parse_content(content, FileFormat::Env, p(".env")).unwrap();
    assert_eq!(entries[0].value, "line1\nline2");
}
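
Review note: the `.env` tests above pin down a small set of rules (skip comments and blank lines, drop an `export ` prefix, trim keys and unquoted values, strip matching quotes, expand `\n` inside double quotes). A minimal sketch of a line parser satisfying those rules follows. `parse_env_line` is a hypothetical name and this is not the crate's parser; it handles only the `\n` escape and makes no attempt at inline comments or multi-line values.

```rust
/// Hypothetical helper, not part of fermata: parses one `.env` line into a
/// (key, value) pair, or returns None for blanks, comments, and lines
/// without a usable `KEY=` part.
fn parse_env_line(line: &str) -> Option<(String, String)> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // Accept shell-style `export KEY=value` lines.
    let line = line.strip_prefix("export ").unwrap_or(line);

    let (key, raw) = line.split_once('=')?;
    let key = key.trim();
    if key.is_empty() {
        return None;
    }

    let raw = raw.trim();
    let value = if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
        // Double quotes: strip them and expand the `\n` escape only.
        raw[1..raw.len() - 1].replace("\\n", "\n")
    } else if raw.len() >= 2 && raw.starts_with('\'') && raw.ends_with('\'') {
        // Single quotes: strip them and keep the content literal.
        raw[1..raw.len() - 1].to_string()
    } else {
        raw.to_string()
    };
    Some((key.to_string(), value))
}
```

Splitting on the first `=` only keeps values such as `postgres://host/db?sslmode=require` intact, which matters for connection-string style secrets.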

// ---------------------------------------------------------------------------
// TOML parsing
// ---------------------------------------------------------------------------

#[test]
fn toml_flat_table() {
    let content = r#"
API_KEY = "abc"
DB_PASS = "secret"
"#;
    let entries = parse_content(content, FileFormat::Toml, p("Secrets.toml")).unwrap();
    assert_eq!(entries.len(), 2);
    assert!(entries.iter().any(|e| e.key == "API_KEY" && e.value == "abc"));
    assert!(entries.iter().any(|e| e.key == "DB_PASS" && e.value == "secret"));
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn toml_nested_tables() {
|
||||||
|
let content = r#"
|
||||||
|
[database]
|
||||||
|
password = "secret"
|
||||||
|
host = "localhost"
|
||||||
|
port = 5432
|
||||||
|
"#;
|
||||||
|
let entries = parse_content(content, FileFormat::Toml, p("config.toml")).unwrap();
|
||||||
|
// Only string values extracted; port (integer) skipped.
|
||||||
|
assert_eq!(entries.len(), 2);
|
||||||
|
assert!(entries.iter().any(|e| e.key == "database.password" && e.value == "secret"));
|
||||||
|
assert!(entries.iter().any(|e| e.key == "database.host" && e.value == "localhost"));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn toml_mixed_types_only_strings() {
|
||||||
|
let content = r#"
|
||||||
|
name = "app"
|
||||||
|
debug = true
|
||||||
|
count = 42
|
||||||
|
ratio = 3.14
|
||||||
|
"#;
|
||||||
|
let entries = parse_content(content, FileFormat::Toml, p("app.toml")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "name");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// JSON parsing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn json_flat_object() {
|
||||||
|
let content = r#"{"api_key": "abc", "secret": "xyz"}"#;
|
||||||
|
let entries = parse_content(content, FileFormat::Json, p("secrets.json")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 2);
|
||||||
|
assert!(entries.iter().any(|e| e.key == "api_key" && e.value == "abc"));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn json_nested_objects() {
|
||||||
|
let content = r#"{"db": {"password": "foo", "port": 5432}}"#;
|
||||||
|
let entries = parse_content(content, FileFormat::Json, p("secrets.json")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "db.password");
|
||||||
|
assert_eq!(entries[0].value, "foo");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn json_arrays() {
|
||||||
|
let content = r#"{"keys": ["a", "b"]}"#;
|
||||||
|
let entries = parse_content(content, FileFormat::Json, p("secrets.json")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 2);
|
||||||
|
assert!(entries.iter().any(|e| e.key == "keys.0" && e.value == "a"));
|
||||||
|
assert!(entries.iter().any(|e| e.key == "keys.1" && e.value == "b"));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn json_mixed_types() {
|
||||||
|
let content = r#"{"name": "app", "count": 42, "active": true, "data": null}"#;
|
||||||
|
let entries = parse_content(content, FileFormat::Json, p("a.json")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "name");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// YAML parsing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn yaml_flat_map() {
|
||||||
|
let content = "api_key: abc\nsecret: xyz\n";
|
||||||
|
let entries = parse_content(content, FileFormat::Yaml, p("secrets.yaml")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 2);
|
||||||
|
assert!(entries.iter().any(|e| e.key == "api_key" && e.value == "abc"));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn yaml_nested_maps() {
|
||||||
|
let content = "db:\n password: foo\n port: 5432\n";
|
||||||
|
let entries = parse_content(content, FileFormat::Yaml, p("secrets.yml")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "db.password");
|
||||||
|
assert_eq!(entries[0].value, "foo");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn yaml_mixed_types() {
|
||||||
|
let content = "name: app\ncount: 42\nactive: true\n";
|
||||||
|
let entries = parse_content(content, FileFormat::Yaml, p("a.yaml")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "name");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Python assignment parsing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn python_matches_assignments() {
|
||||||
|
let content = r#"
|
||||||
|
API_KEY = "abc123"
|
||||||
|
DB_PASS = 'secret'
|
||||||
|
import os
|
||||||
|
x = 42
|
||||||
|
"#;
|
||||||
|
let entries = parse_content(content, FileFormat::PythonAssignments, p("settings.py")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 2);
|
||||||
|
assert!(entries.iter().any(|e| e.key == "API_KEY" && e.value == "abc123"));
|
||||||
|
assert!(entries.iter().any(|e| e.key == "DB_PASS" && e.value == "secret"));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn python_skips_non_matching() {
|
||||||
|
let content = "result = some_function()\nfor x in range(10):\n pass\n";
|
||||||
|
let entries = parse_content(content, FileFormat::PythonAssignments, p("a.py")).unwrap();
|
||||||
|
assert!(entries.is_empty());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Properties parsing
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn properties_equals_separator() {
|
||||||
|
let content = "db.password=secret\ndb.host=localhost";
|
||||||
|
let entries = parse_content(content, FileFormat::Properties, p("app.properties")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 2);
|
||||||
|
assert!(entries.iter().any(|e| e.key == "db.password" && e.value == "secret"));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn properties_colon_separator() {
|
||||||
|
let content = "db.password: secret";
|
||||||
|
let entries = parse_content(content, FileFormat::Properties, p("app.properties")).unwrap();
|
||||||
|
assert_eq!(entries[0].key, "db.password");
|
||||||
|
assert_eq!(entries[0].value, "secret");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn properties_comments() {
|
||||||
|
let content = "# comment\n! also comment\nkey=value";
|
||||||
|
let entries = parse_content(content, FileFormat::Properties, p("app.properties")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "key");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn properties_continuation_lines() {
|
||||||
|
let content = "long.value=hello \\\n world";
|
||||||
|
let entries = parse_content(content, FileFormat::Properties, p("app.properties")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "long.value");
|
||||||
|
assert_eq!(entries[0].value, "hello world");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Auto-detection from file extension
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn format_from_path_env_variants() {
|
||||||
|
assert_eq!(FileFormat::from_path(p(".env")), Some(FileFormat::Env));
|
||||||
|
assert_eq!(FileFormat::from_path(p(".env.local")), Some(FileFormat::Env));
|
||||||
|
assert_eq!(FileFormat::from_path(p(".env.production")), Some(FileFormat::Env));
|
||||||
|
assert_eq!(FileFormat::from_path(p("staging.env")), Some(FileFormat::Env));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn format_from_path_extensions() {
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.toml")), Some(FileFormat::Toml));
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.json")), Some(FileFormat::Json));
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.yaml")), Some(FileFormat::Yaml));
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.yml")), Some(FileFormat::Yaml));
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.py")), Some(FileFormat::PythonAssignments));
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.properties")), Some(FileFormat::Properties));
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn format_from_path_unknown() {
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.key")), None);
|
||||||
|
assert_eq!(FileFormat::from_path(p("a.pem")), None);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Format hints
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn format_from_hint() {
|
||||||
|
assert_eq!(FileFormat::from_hint("env"), Some(FileFormat::Env));
|
||||||
|
assert_eq!(FileFormat::from_hint("dotenv"), Some(FileFormat::Env));
|
||||||
|
assert_eq!(FileFormat::from_hint("toml"), Some(FileFormat::Toml));
|
||||||
|
assert_eq!(FileFormat::from_hint("json"), Some(FileFormat::Json));
|
||||||
|
assert_eq!(FileFormat::from_hint("yaml"), Some(FileFormat::Yaml));
|
||||||
|
assert_eq!(FileFormat::from_hint("yml"), Some(FileFormat::Yaml));
|
||||||
|
assert_eq!(FileFormat::from_hint("python-assignments"), Some(FileFormat::PythonAssignments));
|
||||||
|
assert_eq!(FileFormat::from_hint("python"), Some(FileFormat::PythonAssignments));
|
||||||
|
assert_eq!(FileFormat::from_hint("properties"), Some(FileFormat::Properties));
|
||||||
|
assert_eq!(FileFormat::from_hint("java-properties"), Some(FileFormat::Properties));
|
||||||
|
assert_eq!(FileFormat::from_hint("unknown"), None);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Key filtering
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn filter_by_glob() {
|
||||||
|
let content = "API_KEY=abc\nDB_HOST=localhost\nDB_PASSWORD=secret\n";
|
||||||
|
let entries = parse_content(content, FileFormat::Env, p(".env")).unwrap();
|
||||||
|
assert_eq!(entries.len(), 3);
|
||||||
|
|
||||||
|
let filter = vec!["*PASSWORD*".to_string(), "*API_KEY*".to_string()];
|
||||||
|
let result = parse_secret_file_with_filter(content, &filter);
|
||||||
|
assert_eq!(result.len(), 2);
|
||||||
|
assert!(result.iter().any(|e| e.key == "API_KEY"));
|
||||||
|
assert!(result.iter().any(|e| e.key == "DB_PASSWORD"));
|
||||||
|
}
|
||||||
|
|
||||||
|
/// Helper that parses env content and applies a key filter in memory (avoids temp files).
fn parse_secret_file_with_filter(content: &str, filter: &[String]) -> Vec<SecretEntry> {
    let all = parse_content(content, FileFormat::Env, p(".env")).unwrap();
    // Apply the filter manually, using the same approach as parse_secret_file but
    // without disk I/O: patterns and keys are both upper-cased before matching.
    let matchers: Vec<_> = filter
        .iter()
        .filter_map(|pattern| {
            globset::Glob::new(&pattern.to_ascii_uppercase())
                .ok()
                .map(|g| g.compile_matcher())
        })
        .collect();
    all.into_iter()
        .filter(|entry| {
            let upper = entry.key.to_ascii_uppercase();
            matchers.iter().any(|m| m.is_match(&upper))
        })
        .collect()
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Error on unrecognised format
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn error_on_unknown_format() {
|
||||||
|
use std::io::Write;
|
||||||
|
let mut tmp = NamedTempFile::with_suffix(".xyz").unwrap();
|
||||||
|
write!(tmp, "KEY=value").unwrap();
|
||||||
|
let result = parse_secret_file(tmp.path(), None, None);
|
||||||
|
assert!(result.is_err());
|
||||||
|
let err = result.unwrap_err().to_string();
|
||||||
|
assert!(err.contains("cannot determine file format"));
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Empty file
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn empty_file_produces_empty_vec() {
|
||||||
|
let entries = parse_content("", FileFormat::Env, p(".env")).unwrap();
|
||||||
|
assert!(entries.is_empty());
|
||||||
|
|
||||||
|
let entries = parse_content("{}", FileFormat::Json, p("a.json")).unwrap();
|
||||||
|
assert!(entries.is_empty());
|
||||||
|
|
||||||
|
let entries = parse_content("", FileFormat::Toml, p("a.toml")).unwrap();
|
||||||
|
assert!(entries.is_empty());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// parse_secret_file end-to-end (disk)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn parse_secret_file_from_disk() {
|
||||||
|
use std::io::Write;
|
||||||
|
let mut tmp = NamedTempFile::with_suffix(".env").unwrap();
|
||||||
|
write!(tmp, "SECRET=hunter2\nPORT=8080").unwrap();
|
||||||
|
|
||||||
|
let entries = parse_secret_file(tmp.path(), None, None).unwrap();
|
||||||
|
assert_eq!(entries.len(), 2);
|
||||||
|
assert!(entries.iter().any(|e| e.key == "SECRET" && e.value == "hunter2"));
|
||||||
|
// Source path should match.
|
||||||
|
assert_eq!(entries[0].source, tmp.path());
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn parse_secret_file_with_key_filter() {
|
||||||
|
use std::io::Write;
|
||||||
|
let mut tmp = NamedTempFile::with_suffix(".env").unwrap();
|
||||||
|
write!(tmp, "API_KEY=abc\nHOST=localhost\nDB_PASSWORD=secret").unwrap();
|
||||||
|
|
||||||
|
let filter = vec!["*PASSWORD*".to_string()];
|
||||||
|
let entries = parse_secret_file(tmp.path(), None, Some(&filter)).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "DB_PASSWORD");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn parse_secret_file_with_format_override() {
|
||||||
|
use std::io::Write;
|
||||||
|
// Write env content to a .txt file — format override should work.
|
||||||
|
let mut tmp = NamedTempFile::with_suffix(".txt").unwrap();
|
||||||
|
write!(tmp, "KEY=value").unwrap();
|
||||||
|
|
||||||
|
let entries = parse_secret_file(tmp.path(), Some(FileFormat::Env), None).unwrap();
|
||||||
|
assert_eq!(entries.len(), 1);
|
||||||
|
assert_eq!(entries[0].key, "KEY");
|
||||||
|
}
|
||||||
@@ -0,0 +1,373 @@
|
|||||||
|
//! Integration tests for the secret value redactor.
|
||||||
|
|
||||||
|
use std::path::PathBuf;
|
||||||
|
|
||||||
|
use dirigent_fermata::core::secrets::config::RedactionStyle;
|
||||||
|
use dirigent_fermata::core::secrets::manifest::Manifest;
|
||||||
|
use dirigent_fermata::core::secrets::parser::SecretEntry;
|
||||||
|
use dirigent_fermata::core::secrets::redactor::Redactor;
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Helpers
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
fn entry(key: &str, value: &str) -> SecretEntry {
|
||||||
|
SecretEntry {
|
||||||
|
key: key.to_string(),
|
||||||
|
value: value.to_string(),
|
||||||
|
source: PathBuf::from("test"),
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
fn make_redactor(entries: Vec<SecretEntry>, style: RedactionStyle) -> Redactor {
|
||||||
|
let manifest = Manifest::from_entries(entries);
|
||||||
|
Redactor::new(&manifest, style)
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Basic redaction
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn basic_single_secret() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("DB_PASSWORD", "super_secret_123")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("connecting with password super_secret_123 ...");
|
||||||
|
assert_eq!(result.text, "connecting with password ***** ...");
|
||||||
|
assert!(result.was_redacted());
|
||||||
|
assert_eq!(result.redactions.len(), 1);
|
||||||
|
assert_eq!(result.redactions[0].key, "DB_PASSWORD");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Multiple secrets
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn multiple_different_secrets() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![
|
||||||
|
entry("DB_PASSWORD", "db_pass_value"),
|
||||||
|
entry("API_KEY", "ak_12345678"),
|
||||||
|
],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("db=db_pass_value key=ak_12345678");
|
||||||
|
assert_eq!(result.text, "db=***** key=*****");
|
||||||
|
assert_eq!(result.redactions.len(), 2);
|
||||||
|
assert_eq!(result.redactions[0].key, "DB_PASSWORD");
|
||||||
|
assert_eq!(result.redactions[1].key, "API_KEY");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Repeated occurrences
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn same_secret_multiple_times() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("TOKEN", "tok_abcdef")],
|
||||||
|
RedactionStyle::Named,
|
||||||
|
);
|
||||||
|
let result = r.redact("first=tok_abcdef second=tok_abcdef");
|
||||||
|
assert_eq!(result.text, "first=<REDACTED:TOKEN> second=<REDACTED:TOKEN>");
|
||||||
|
assert_eq!(result.redactions.len(), 2);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Redaction styles
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn style_masked() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "secret_value")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("val=secret_value");
|
||||||
|
assert_eq!(result.text, "val=*****");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn style_typed() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "secret_value")],
|
||||||
|
RedactionStyle::Typed,
|
||||||
|
);
|
||||||
|
let result = r.redact("val=secret_value");
|
||||||
|
// "secret_value" is 12 chars
|
||||||
|
assert_eq!(result.text, "val=<REDACTED:string:12>");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn style_named() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("MY_API_KEY", "secret_value")],
|
||||||
|
RedactionStyle::Named,
|
||||||
|
);
|
||||||
|
let result = r.redact("val=secret_value");
|
||||||
|
assert_eq!(result.text, "val=<REDACTED:MY_API_KEY>");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn style_absent() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "secret_value")],
|
||||||
|
RedactionStyle::Absent,
|
||||||
|
);
|
||||||
|
let result = r.redact("val=secret_value end");
|
||||||
|
assert_eq!(result.text, "val= end");
|
||||||
|
assert!(result.was_redacted());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Overlapping values (longest match wins)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn overlapping_longest_match_wins() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![
|
||||||
|
entry("SHORT_KEY", "secret"),
|
||||||
|
entry("LONG_KEY", "secret_long_value"),
|
||||||
|
],
|
||||||
|
RedactionStyle::Named,
|
||||||
|
);
|
||||||
|
let result = r.redact("x=secret_long_value");
|
||||||
|
// The longer value should match, not the shorter substring.
|
||||||
|
assert_eq!(result.text, "x=<REDACTED:LONG_KEY>");
|
||||||
|
assert_eq!(result.redactions.len(), 1);
|
||||||
|
assert_eq!(result.redactions[0].key, "LONG_KEY");
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn shorter_match_still_found_when_no_overlap() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![
|
||||||
|
entry("SHORT_KEY", "secret"),
|
||||||
|
entry("LONG_KEY", "secret_long_value"),
|
||||||
|
],
|
||||||
|
RedactionStyle::Named,
|
||||||
|
);
|
||||||
|
// "secret" appears standalone (not as part of "secret_long_value")
|
||||||
|
let result = r.redact("a=secret b=secret_long_value");
|
||||||
|
assert_eq!(result.text, "a=<REDACTED:SHORT_KEY> b=<REDACTED:LONG_KEY>");
|
||||||
|
assert_eq!(result.redactions.len(), 2);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// No match
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn no_match_returns_unchanged() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "not_present_here")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("nothing to see here");
|
||||||
|
assert_eq!(result.text, "nothing to see here");
|
||||||
|
assert!(!result.was_redacted());
|
||||||
|
assert!(result.redactions.is_empty());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Empty text
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn empty_input_returns_empty() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "some_secret")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("");
|
||||||
|
assert_eq!(result.text, "");
|
||||||
|
assert!(!result.was_redacted());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Empty manifest
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn empty_manifest_returns_unchanged() {
|
||||||
|
let manifest = Manifest::empty();
|
||||||
|
let r = Redactor::new(&manifest, RedactionStyle::Masked);
|
||||||
|
assert!(!r.has_secrets());
|
||||||
|
let result = r.redact("some text with no secrets");
|
||||||
|
assert_eq!(result.text, "some text with no secrets");
|
||||||
|
assert!(!result.was_redacted());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Short values filtered out by Manifest::from_entries
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn short_values_are_filtered() {
|
||||||
|
// Values shorter than 4 chars should be dropped by from_entries.
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("TINY", "abc"), entry("LONG_ENOUGH", "abcd")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("abc abcd");
|
||||||
|
// "abc" should NOT be redacted (too short), "abcd" should be.
|
||||||
|
assert_eq!(result.text, "abc *****");
|
||||||
|
assert_eq!(result.redactions.len(), 1);
|
||||||
|
assert_eq!(result.redactions[0].key, "LONG_ENOUGH");
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Zero false negatives — every declared secret must be caught
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn zero_false_negatives() {
|
||||||
|
let secrets = vec![
|
||||||
|
entry("A_SECRET", "alpha_secret_val"),
|
||||||
|
entry("B_TOKEN", "bravo_token_val_"),
|
||||||
|
entry("C_PASSWORD", "charlie_pass_99"),
|
||||||
|
entry("D_API_KEY", "delta_key_00000"),
|
||||||
|
];
|
||||||
|
let r = make_redactor(secrets.clone(), RedactionStyle::Masked);
|
||||||
|
|
||||||
|
// Build text that contains every single secret value.
|
||||||
|
let text = format!(
|
||||||
|
"a={} b={} c={} d={}",
|
||||||
|
"alpha_secret_val", "bravo_token_val_", "charlie_pass_99", "delta_key_00000",
|
||||||
|
);
|
||||||
|
let result = r.redact(&text);
|
||||||
|
|
||||||
|
// Every secret value must be replaced.
|
||||||
|
for s in &secrets {
|
||||||
|
if s.value.len() >= 4 {
|
||||||
|
assert!(
|
||||||
|
!result.text.contains(&s.value),
|
||||||
|
"Secret {} was not redacted: {}",
|
||||||
|
s.key,
|
||||||
|
result.text,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
assert_eq!(result.redactions.len(), 4);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Multi-line text
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn multi_line_redaction() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![
|
||||||
|
entry("DB_PASSWORD", "s3cr3t_p@ss"),
|
||||||
|
entry("API_KEY", "ak-1234567890"),
|
||||||
|
],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let text = "# Config file\n\
|
||||||
|
DATABASE_URL=postgres://user:s3cr3t_p@ss@host/db\n\
|
||||||
|
API_KEY=ak-1234567890\n\
|
||||||
|
OTHER=safe_value\n";
|
||||||
|
let result = r.redact(text);
|
||||||
|
assert!(!result.text.contains("s3cr3t_p@ss"));
|
||||||
|
assert!(!result.text.contains("ak-1234567890"));
|
||||||
|
assert!(result.text.contains("safe_value"));
|
||||||
|
assert_eq!(result.redactions.len(), 2);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Redaction metadata correctness
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn redaction_metadata_offset_and_len() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("SECRET", "ABCDEFGH")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let text = "prefix_ABCDEFGH_suffix";
|
||||||
|
let result = r.redact(text);
|
||||||
|
|
||||||
|
assert_eq!(result.redactions.len(), 1);
|
||||||
|
let red = &result.redactions[0];
|
||||||
|
assert_eq!(red.key, "SECRET");
|
||||||
|
assert_eq!(red.offset, 7); // "prefix_" is 7 bytes
|
||||||
|
assert_eq!(red.original_len, 8); // "ABCDEFGH" is 8 bytes
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn redaction_metadata_multiple_offsets() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("TOK", "xxxx1234")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
// "a=xxxx1234 b=xxxx1234"
|
||||||
|
let text = "a=xxxx1234 b=xxxx1234";
|
||||||
|
let result = r.redact(text);
|
||||||
|
|
||||||
|
assert_eq!(result.redactions.len(), 2);
|
||||||
|
assert_eq!(result.redactions[0].offset, 2); // after "a="
|
||||||
|
assert_eq!(result.redactions[0].original_len, 8);
|
||||||
|
assert_eq!(result.redactions[1].offset, 13); // after " b="
|
||||||
|
assert_eq!(result.redactions[1].original_len, 8);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// has_secrets() helper
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn has_secrets_with_entries() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "long_enough_value")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
assert!(r.has_secrets());
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn has_secrets_empty() {
|
||||||
|
let r = make_redactor(vec![], RedactionStyle::Masked);
|
||||||
|
assert!(!r.has_secrets());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// was_redacted() helper
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn was_redacted_true_when_match() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "findme_value")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("findme_value");
|
||||||
|
assert!(result.was_redacted());
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn was_redacted_false_when_no_match() {
|
||||||
|
let r = make_redactor(
|
||||||
|
vec![entry("KEY", "findme_value")],
|
||||||
|
RedactionStyle::Masked,
|
||||||
|
);
|
||||||
|
let result = r.redact("nothing here");
|
||||||
|
assert!(!result.was_redacted());
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Deduplication in from_entries
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn duplicate_entries_deduplicated() {
|
||||||
|
let manifest = Manifest::from_entries(vec![
|
||||||
|
entry("KEY", "same_value_here"),
|
||||||
|
entry("KEY", "same_value_here"),
|
||||||
|
]);
|
||||||
|
assert_eq!(manifest.len(), 1);
|
||||||
|
}
|
||||||
@@ -0,0 +1,254 @@
|
|||||||
|
use dirigent_fermata::core::secrets::config::HeuristicConfig;
|
||||||
|
use dirigent_fermata::core::secrets::scanner::{shannon_entropy, Confidence, Scanner};
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Helper: build a scanner with default config (built-in rules only)
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
fn default_scanner() -> Scanner {
|
||||||
|
Scanner::builtin().expect("built-in rules must compile")
|
||||||
|
}
|
||||||
|
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
// Specific provider patterns
|
||||||
|
// ---------------------------------------------------------------------------
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn detects_aws_access_key() {
|
||||||
|
let scanner = default_scanner();
|
||||||
|
let findings = scanner.scan("here is my key: AKIAIOSFODNN7EXAMPLE ok");
|
||||||
|
assert!(
|
||||||
|
findings.iter().any(|f| f.pattern_id == "aws-access-key"),
|
||||||
|
"expected aws-access-key finding, got: {findings:?}"
|
||||||
|
);
|
||||||
|
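// Check the confidence of the AWS finding itself rather than relying on result order.
let aws = findings.iter().find(|f| f.pattern_id == "aws-access-key").expect("aws-access-key finding");
assert_eq!(aws.confidence, Confidence::High);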
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn detects_github_pat_classic() {
|
||||||
|
let scanner = default_scanner();
|
||||||
|
let findings = scanner.scan("ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghij");
|
||||||
|
assert!(
|
||||||
|
findings.iter().any(|f| f.pattern_id == "github-pat-classic"),
|
||||||
|
"expected github-pat-classic finding, got: {findings:?}"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
|
||||||
|
fn detects_stripe_secret_key() {
|
||||||
|
let scanner = default_scanner();
|
||||||
|
let findings = scanner.scan("STRIPE_KEY=sk_live_abcdefghijklmnopqrstuvwx");
|
||||||
|
assert!(
|
||||||
|
findings.iter().any(|f| f.pattern_id == "stripe-secret-key"),
|
||||||
|
"expected stripe-secret-key finding, got: {findings:?}"
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
#[test]
fn detects_private_key_header() {
    let scanner = default_scanner();
    let text = "-----BEGIN RSA PRIVATE KEY-----\nMIIEpAIBAAK...\n-----END RSA PRIVATE KEY-----";
    let findings = scanner.scan(text);
    assert!(
        findings
            .iter()
            .any(|f| f.pattern_id == "private-key-header"),
        "expected private-key-header finding, got: {findings:?}"
    );
}

#[test]
fn detects_jwt_token() {
    let scanner = default_scanner();
    // A realistic-looking (but fake) JWT.
    let jwt = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ik\
               pvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c";
    let findings = scanner.scan(jwt);
    assert!(
        findings.iter().any(|f| f.pattern_id == "jwt"),
        "expected jwt finding, got: {findings:?}"
    );
}

#[test]
fn detects_database_connection_url() {
    let scanner = default_scanner();
    let findings = scanner.scan("DATABASE_URL=postgres://admin:s3cretP4ss@db.example.com:5432/mydb");
    assert!(
        findings
            .iter()
            .any(|f| f.pattern_id == "database-connection-url"),
        "expected database-connection-url finding, got: {findings:?}"
    );
}

#[test]
fn detects_slack_webhook() {
    let scanner = default_scanner();
    let findings = scanner
        .scan("https://hooks.slack.com/services/T0ABCDEFG/B0ABCDEFG/abcdefghijklmnopqrstuvwx");
    assert!(
        findings.iter().any(|f| f.pattern_id == "slack-webhook"),
        "expected slack-webhook finding, got: {findings:?}"
    );
}

#[test]
fn detects_anthropic_api_key() {
    let scanner = default_scanner();
    let key = "sk-ant-aBcDeFgHiJkLmNoPqRsTuVwXyZ0123456789abcdefgh";
    let findings = scanner.scan(&format!("my key is {key}"));
    assert!(
        findings
            .iter()
            .any(|f| f.pattern_id == "anthropic-api-key"),
        "expected anthropic-api-key finding, got: {findings:?}"
    );
}

#[test]
fn detects_sendgrid_api_key() {
    let scanner = default_scanner();
    let key = "SG.abcdefghijklmnopqrstuv.ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrst";
    let findings = scanner.scan(key);
    assert!(
        findings.iter().any(|f| f.pattern_id == "sendgrid-api-key"),
        "expected sendgrid-api-key finding, got: {findings:?}"
    );
}

// ---------------------------------------------------------------------------
// Generic patterns: entropy filtering
// ---------------------------------------------------------------------------

#[test]
fn rejects_low_entropy_generic_api_key() {
    let scanner = default_scanner();
    // "test" repeated has very low entropy, so it should NOT trigger.
    let findings = scanner.scan(r#"api_key = "testtesttesttesttest""#);
    let generic_hits: Vec<_> = findings
        .iter()
        .filter(|f| f.pattern_id == "generic-api-key")
        .collect();
    assert!(
        generic_hits.is_empty(),
        "low-entropy api_key should be filtered out, got: {generic_hits:?}"
    );
}

#[test]
fn accepts_high_entropy_generic_secret() {
    let scanner = default_scanner();
    // A high-entropy random-looking value.
    let findings = scanner.scan(r#"secret = "a8Kz3Lm9Xq2Wp7Yn""#);
    let has_generic = findings
        .iter()
        .any(|f| f.pattern_id == "generic-secret");
    assert!(
        has_generic,
        "high-entropy secret should be detected, got: {findings:?}"
    );
}

// ---------------------------------------------------------------------------
// Custom patterns from config
// ---------------------------------------------------------------------------

#[test]
fn custom_pattern_from_config() {
    let config = HeuristicConfig {
        enabled: true,
        patterns: vec![r"MY_CUSTOM_[A-Z]{10}".to_string()],
        ..Default::default()
    };
    let scanner = Scanner::new(&config).expect("should compile custom pattern");
    let findings = scanner.scan("found MY_CUSTOM_ABCDEFGHIJ in output");
    assert!(
        findings.iter().any(|f| f.pattern_id == "custom-0"),
        "expected custom-0 finding, got: {findings:?}"
    );
    assert_eq!(findings[0].confidence, Confidence::High);
}

// ---------------------------------------------------------------------------
// Edge cases
// ---------------------------------------------------------------------------

#[test]
fn empty_text_returns_no_findings() {
    let scanner = default_scanner();
    assert!(scanner.scan("").is_empty());
}

#[test]
fn plain_text_returns_no_findings() {
    let scanner = default_scanner();
    let findings = scanner.scan("This is just a normal paragraph with no secrets.");
    assert!(
        findings.is_empty(),
        "plain text should have no findings, got: {findings:?}"
    );
}

#[test]
fn overlapping_matches_are_deduplicated() {
    // Construct text where the same span could match multiple patterns.
    // The bearer token pattern and a generic pattern could overlap on the same region.
    let scanner = default_scanner();
    let text = "Authorization: Bearer ghp_ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefgh";
    let findings = scanner.scan(text);

    // Verify no two findings have overlapping spans. Comparing each finding only
    // against later ones assumes `scan` returns findings sorted by span start.
    for i in 0..findings.len() {
        for j in (i + 1)..findings.len() {
            assert!(
                findings[j].span.start >= findings[i].span.end,
                "findings {i} and {j} overlap: {:?} vs {:?}",
                findings[i].span,
                findings[j].span,
            );
        }
    }
}

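Reviewer note: the deduplication test above only checks the outcome (no two spans overlap). A minimal sketch of one way to get that property is shown here, assuming a hypothetical `Span` type with an exclusive `end` and a policy of keeping the earliest, longest match. The names `Span` and `dedup_spans` are illustrative, and the scanner's real tie-breaking rule is not visible in this diff.

```rust
// Hypothetical sketch; `Span` and `dedup_spans` are illustrative names,
// not the crate's actual types or functions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize, // exclusive
}

/// Sort by start offset (longest first on ties), then drop every span
/// that begins before the previously kept span ends.
pub fn dedup_spans(mut spans: Vec<Span>) -> Vec<Span> {
    spans.sort_by(|a, b| a.start.cmp(&b.start).then(b.end.cmp(&a.end)));
    let mut kept: Vec<Span> = Vec::new();
    for s in spans {
        match kept.last() {
            // Overlaps the last kept span: skip it.
            Some(prev) if s.start < prev.end => {}
            // First span, or starts at/after the previous end: keep it.
            _ => kept.push(s),
        }
    }
    kept
}
```

Because the output is sorted by `start` and each kept span begins at or after the previous `end`, the pairwise check in the test holds for any input to this sketch.
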
// ---------------------------------------------------------------------------
// Shannon entropy unit tests (supplement the inline mod tests)
// ---------------------------------------------------------------------------

#[test]
fn entropy_known_values() {
    // Single character repeated → 0.
    assert!((shannon_entropy("aaaa") - 0.0).abs() < f64::EPSILON);

    // Perfectly balanced binary → 1.0 bits/char.
    let balanced = "ababababab";
    assert!((shannon_entropy(balanced) - 1.0).abs() < 0.01);

    // High diversity.
    let diverse = "aB3$kL9!mZ7@wQ1#xR5^";
    assert!(shannon_entropy(diverse) > 3.5);
}

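Reviewer note: the expected values in `entropy_known_values` follow from the standard definition H = -Σ pᵢ log₂ pᵢ over character frequencies. The sketch below is a stand-in written for illustration, not the crate's `shannon_entropy`; counting by `char` rather than by byte is an assumption, and `entropy_sketch` is a hypothetical name.

```rust
use std::collections::HashMap;

/// Shannon entropy in bits per character: H = -sum(p_i * log2(p_i)).
/// Hypothetical stand-in for the crate's `shannon_entropy`; counting by
/// `char` (not by byte) is an assumption.
pub fn entropy_sketch(s: &str) -> f64 {
    let mut counts: HashMap<char, usize> = HashMap::new();
    let mut total = 0usize;
    for c in s.chars() {
        *counts.entry(c).or_insert(0) += 1;
        total += 1;
    }
    if total == 0 {
        return 0.0; // define the entropy of empty input as zero
    }
    let n = total as f64;
    counts
        .values()
        .map(|&k| {
            let p = k as f64 / n;
            -p * p.log2()
        })
        .sum()
}
```

Under this definition the low-entropy fixture `testtesttesttesttest` scores 1.5 bits per character and the high-entropy fixture `a8Kz3Lm9Xq2Wp7Yn` (16 distinct characters) scores 4.0, so any threshold between those values would separate the two generic-pattern tests. The threshold the scanner actually uses is not shown in this diff.
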
// ---------------------------------------------------------------------------
// Scanner construction
// ---------------------------------------------------------------------------

#[test]
fn builtin_scanner_has_rules() {
    let scanner = default_scanner();
    assert!(
        scanner.rule_count() >= 30,
        "expected at least 30 built-in rules, got {}",
        scanner.rule_count()
    );
}

#[test]
fn invalid_custom_pattern_returns_error() {
    let config = HeuristicConfig {
        enabled: true,
        patterns: vec![r"[invalid".to_string()],
        ..Default::default()
    };
    assert!(Scanner::new(&config).is_err());
}
+58
-1
@@ -1,5 +1,5 @@
use dirigent_fermata::core::{Decision, Reason};
-use dirigent_fermata::harness::{HarnessAdapter, PathKind, ToolOp};
+use dirigent_fermata::harness::{HarnessAdapter, HookEvent, PathKind, ToolOp};
use dirigent_fermata::harness::claude::ClaudeAdapter;

#[test]
@@ -84,3 +84,60 @@ fn renders_ask_as_ask() {
    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    assert_eq!(v["hookSpecificOutput"]["permissionDecision"], "ask");
}

// ---------------------------------------------------------------------------
// PostToolUse
// ---------------------------------------------------------------------------

#[test]
fn parses_post_tool_use_payload() {
    let payload = br#"{"tool_name":"Read","tool_input":{"file_path":"/proj/.env"},"tool_response":"SECRET=abc"}"#;
    let p = ClaudeAdapter.parse_post_tool_use(payload).unwrap();
    assert_eq!(p.tool_name, "Read");
    assert_eq!(p.tool_response, "SECRET=abc");
}

#[test]
fn parses_post_tool_use_missing_response() {
    // tool_response absent → defaults to empty string.
    let payload = br#"{"tool_name":"Bash","tool_input":{"command":"ls"}}"#;
    let p = ClaudeAdapter.parse_post_tool_use(payload).unwrap();
    assert_eq!(p.tool_response, "");
}

#[test]
fn renders_post_tool_use_with_redacted_output() {
    let payload = br#"{"tool_name":"Read","tool_input":{},"tool_response":"x"}"#;
    let p = ClaudeAdapter.parse_post_tool_use(payload).unwrap();
    let out = ClaudeAdapter
        .render_post_tool_use(&p, Some("redacted text"))
        .unwrap();
    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    assert_eq!(v["hookSpecificOutput"]["hookEventName"], "PostToolUse");
    assert_eq!(
        v["hookSpecificOutput"]["updatedToolOutput"],
        "redacted text"
    );
}

#[test]
fn renders_post_tool_use_passthrough() {
    let payload = br#"{"tool_name":"Read","tool_input":{},"tool_response":"clean"}"#;
    let p = ClaudeAdapter.parse_post_tool_use(payload).unwrap();
    let out = ClaudeAdapter.render_post_tool_use(&p, None).unwrap();
    let v: serde_json::Value = serde_json::from_slice(&out).unwrap();
    assert_eq!(v, serde_json::json!({}));
}

// ---------------------------------------------------------------------------
// HookEvent parsing
// ---------------------------------------------------------------------------

#[test]
fn hook_event_parse_variants() {
    assert_eq!(HookEvent::parse("pre-tool-use"), Some(HookEvent::PreToolUse));
    assert_eq!(HookEvent::parse("PreToolUse"), Some(HookEvent::PreToolUse));
    assert_eq!(HookEvent::parse("post-tool-use"), Some(HookEvent::PostToolUse));
    assert_eq!(HookEvent::parse("PostToolUse"), Some(HookEvent::PostToolUse));
    assert_eq!(HookEvent::parse("unknown"), None);
}

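Reviewer note: the five assertions in `hook_event_parse_variants` pin down a small contract: both the kebab-case spelling used by `--event` and the PascalCase spelling used in `hookEventName` map to the same variant, and anything else yields `None`. A minimal sketch of a parser with that behaviour follows. `HookEventSketch` is a hypothetical re-creation for illustration; the real `HookEvent` in `dirigent_fermata::harness` may be implemented differently.

```rust
/// Hypothetical re-creation of the enum exercised by the test; the real
/// `HookEvent` lives in `dirigent_fermata::harness` and may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum HookEventSketch {
    PreToolUse,
    PostToolUse,
}

impl HookEventSketch {
    /// Accept the kebab-case and PascalCase spellings of each event;
    /// any other input is rejected with `None`.
    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "pre-tool-use" | "PreToolUse" => Some(Self::PreToolUse),
            "post-tool-use" | "PostToolUse" => Some(Self::PostToolUse),
            _ => None,
        }
    }
}
```

Returning `Option` rather than defaulting to a variant leaves the caller free to decide how an unrecognised event name is handled.
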