dirigent_anth/CLAUDE.md
g4borg ed8bc3e5fd rename dirigent_ant to dirigent_anth, binaries to anth_bear/anth_usage
Rename the Claude Code session parser crate from dirigent_ant to
dirigent_anth. Binary targets renamed: ant → anth_bear, ant_usage →
anth_usage. Module claude_usage renamed to anth_usage throughout.
Also normalizes CRLF → LF line endings across touched files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-05-07 21:59:24 +02:00

Package: dirigent_anth

Claude Code JSONL session parser and toolkit.

Quick Facts

  • Type: Library
  • Main Entry: src/lib.rs
  • Dependencies: serde, serde_json, chrono, uuid, camino, thiserror, tracing, dirs
  • Status: Core parsing complete — ready for downstream consumers

Purpose

Reads Claude Code's local JSONL session storage (~/.claude/projects/) and produces typed, deduplicated, correlated Rust data structures. The types are the product — downstream consumers (archivist import, shell usage analyzers, session browsers) depend on these structs.

Key Features

  • Session Discovery: Scan ~/.claude/projects/ for all Claude Code projects and sessions
  • JSONL Parsing: Lenient line-by-line parser that handles unknown fields and message types
  • Streaming Dedup: Collapse streamed assistant messages to their final version
  • Tool Correlation: ID-based pairing of tool_use → tool_result across parallel calls
  • Conversation Tree: Reconstruct uuid/parentUuid threading with branch detection
  • Noise Classification: Identify meta messages, warmup, interruptions, API errors
  • Sub-Agent Loading: Recursive parsing of sub-agent JSONL with metadata
  • Timestamp Parsing: Handle ISO 8601, Unix seconds, and Unix milliseconds
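The timestamp feature has to tell Unix seconds apart from Unix milliseconds. A minimal sketch of one way to do that is a magnitude check, shown below. The names `RawTimestamp`, `classify_timestamp`, and `SECONDS_CUTOFF` are hypothetical, the cutoff value is an assumption, and the crate's own `parse_timestamp` may work differently (for instance, it would hand ISO 8601 strings to chrono rather than return them unparsed).

```rust
/// Hypothetical classification of a raw timestamp value; not the crate's actual type.
#[derive(Debug, PartialEq)]
enum RawTimestamp<'a> {
    /// A non-numeric string, assumed to be ISO 8601 and left for a date library to parse.
    Iso(&'a str),
    /// A numeric timestamp normalized to Unix milliseconds.
    UnixMillis(i64),
}

/// Assumed cutoff: numeric values below 10^11 are treated as seconds.
/// 10^11 seconds lies thousands of years in the future, while 10^11 ms falls in 1973,
/// so present-day timestamps land unambiguously on one side or the other.
const SECONDS_CUTOFF: u64 = 100_000_000_000;

/// Classifies a raw timestamp string and normalizes numeric values to milliseconds.
fn classify_timestamp(input: &str) -> Option<RawTimestamp<'_>> {
    let trimmed = input.trim();
    if trimmed.is_empty() {
        return None;
    }
    match trimmed.parse::<i64>() {
        // Small magnitude: interpret as seconds and scale up.
        Ok(n) if n.unsigned_abs() < SECONDS_CUTOFF => Some(RawTimestamp::UnixMillis(n * 1000)),
        // Large magnitude: already milliseconds.
        Ok(n) => Some(RawTimestamp::UnixMillis(n)),
        // Not an integer: assume an ISO 8601 string.
        Err(_) => Some(RawTimestamp::Iso(trimmed)),
    }
}
```

In this sketch `classify_timestamp("1700000000")` and `classify_timestamp("1700000000000")` both normalize to the same millisecond value, which is the property a downstream consumer would rely on.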

Architecture

Design Principles

  1. Types are the product — Well-typed Rust structs that downstream consumers import
  2. Lenient parsing — Unknown fields ignored, unknown message types logged and skipped
  3. Stream-oriented — Line-by-line BufReader parsing, never loads entire files
  4. Sync-first — File parsing is CPU-bound; no async overhead
  5. Cross-platform — camino::Utf8PathBuf throughout for Windows/Unix compatibility
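Principles 2 and 3 can be illustrated together with a small std-only sketch: read one line at a time from a `BufRead`, skip lines that fail to parse, and propagate only I/O errors. `LenientResult` and `parse_lenient` are hypothetical names, and the per-line parser is passed in as a closure; the real parser presumably calls serde_json and logs skipped lines through tracing.

```rust
use std::io::BufRead;

/// Outcome of a lenient pass: parsed items plus the 1-based numbers of skipped lines.
struct LenientResult<T> {
    items: Vec<T>,
    skipped_lines: Vec<usize>,
}

/// Reads `reader` line by line and applies `parse_line` to every non-blank line.
/// Lines that fail to parse are recorded and skipped; I/O errors are propagated.
fn parse_lenient<R, T, E, F>(reader: R, mut parse_line: F) -> std::io::Result<LenientResult<T>>
where
    R: BufRead,
    F: FnMut(&str) -> Result<T, E>,
{
    let mut items = Vec::new();
    let mut skipped_lines = Vec::new();
    for (index, line) in reader.lines().enumerate() {
        // Only one line is held in memory at a time.
        let line = line?;
        if line.trim().is_empty() {
            continue;
        }
        match parse_line(&line) {
            Ok(item) => items.push(item),
            // A real implementation would log the error here before moving on.
            Err(_) => skipped_lines.push(index + 1),
        }
    }
    Ok(LenientResult { items, skipped_lines })
}
```

Taking the line parser as a closure keeps the skip-and-continue policy separate from the JSON schema, so the same loop can be exercised in tests with a trivial parser such as `|line: &str| line.parse::<i32>()`.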

Module Organization

  • types.rs — All public data types (Content, ContentBlock, RawMessage variants, ToolCall, etc.)
  • error.rs — AntError enum with I/O, JSON parse, home-not-found, invalid-path variants
  • parser.rs — JSONL line parser and file parser with lenient error handling
  • dedup.rs — Streaming deduplication of assistant messages by uuid
  • correlation.rs — Tool call ↔ result pairing by tool_use_id
  • tree.rs — Conversation tree from uuid/parentUuid relationships
  • noise.rs — Noise pattern classification (meta, warmup, interruptions, etc.)
  • discovery.rs — Filesystem scanning for Claude projects and sessions
  • subagent.rs — Sub-agent JSONL and metadata loading
  • util.rs — Timestamp parsing utilities
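As an illustration of what dedup.rs is described as doing, here is a minimal sketch of uuid-based deduplication. It assumes that a later line with the same uuid supersedes an earlier one and that the position of the first occurrence is kept; `Msg` and `dedup_keep_last` are hypothetical names, and the crate's `dedup_messages` may apply different rules.

```rust
use std::collections::HashMap;

/// Hypothetical, stripped-down message: just an id and the text streamed so far.
#[derive(Debug, Clone, PartialEq)]
struct Msg {
    uuid: String,
    text: String,
}

/// Keeps one message per uuid: the last version seen, at the position of the first.
fn dedup_keep_last(messages: Vec<Msg>) -> Vec<Msg> {
    let mut slot_by_uuid: HashMap<String, usize> = HashMap::new();
    let mut out: Vec<Msg> = Vec::new();
    for msg in messages {
        match slot_by_uuid.get(&msg.uuid).copied() {
            // Seen before: overwrite the earlier, partial version in place.
            Some(slot) => out[slot] = msg,
            // First occurrence: remember where it lives in the output.
            None => {
                slot_by_uuid.insert(msg.uuid.clone(), out.len());
                out.push(msg);
            }
        }
    }
    out
}
```

Overwriting in place keeps the conversation order stable, which matters if tree building or tool correlation runs on the deduplicated list afterwards.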

Public API

Quick Start

use dirigent_anth::{discover_claude_home, discover_projects, load_session};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Locate ~/.claude/ and list every project found under it
    let home = discover_claude_home()?;
    let projects = discover_projects(&home)?;

    // Load each session with full parsing (dedup + correlate + tree + subagents)
    for project in &projects {
        for session_ref in &project.sessions {
            let session = load_session(session_ref)?;
            println!(
                "Messages: {}, Tools: {}, Subagents: {}",
                session.messages.len(),
                session.tool_exchanges.len(),
                session.subagents.len()
            );
        }
    }
    Ok(())
}

Key Functions

Function                       Purpose
discover_claude_home()         Find ~/.claude/ directory
discover_projects(home)        Scan for all project directories
parse_session(path)            Parse a JSONL file into messages
parse_session_deduped(path)    Parse with streaming dedup applied
dedup_messages(msgs)           Deduplicate streamed assistant messages
correlate_tools(msgs)          Pair tool calls with results by ID
ConversationTree::build(msgs)  Build conversation tree
classify_noise(msg)            Classify a message as noise
load_subagents(dir)            Load sub-agent sessions from artifacts
load_session(ref)              Full parse: dedup + correlate + tree + subagents
parse_timestamp(value)         Parse ISO/Unix timestamps
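To make the tree-building step more concrete, the sketch below indexes messages by uuid/parentUuid and reports branch points. It is an illustration under stated assumptions, not the crate's `ConversationTree`: `Node` and `Tree` are hypothetical types, a message whose parent is absent from the input is treated as a root, and a branch point is taken to be any message with more than one child.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical minimal node carrying only the two threading fields.
struct Node {
    uuid: String,
    parent_uuid: Option<String>,
}

/// Hypothetical tree index: root uuids plus a parent-to-children map.
struct Tree {
    roots: Vec<String>,
    children: HashMap<String, Vec<String>>,
}

impl Tree {
    /// Builds the index in one pass, preserving input order among siblings.
    fn build(nodes: &[Node]) -> Tree {
        let known: HashSet<&str> = nodes.iter().map(|n| n.uuid.as_str()).collect();
        let mut roots = Vec::new();
        let mut children: HashMap<String, Vec<String>> = HashMap::new();
        for node in nodes {
            match node.parent_uuid.as_deref() {
                Some(parent) if known.contains(parent) => {
                    children
                        .entry(parent.to_string())
                        .or_default()
                        .push(node.uuid.clone());
                }
                // No parent, or a parent missing from the input: treat as a root.
                _ => roots.push(node.uuid.clone()),
            }
        }
        Tree { roots, children }
    }

    /// Returns the uuids that have more than one child, sorted for stable output.
    fn branch_points(&self) -> Vec<&str> {
        let mut points: Vec<&str> = self
            .children
            .iter()
            .filter(|(_, kids)| kids.len() > 1)
            .map(|(uuid, _)| uuid.as_str())
            .collect();
        points.sort_unstable();
        points
    }
}
```

Treating orphaned messages as roots keeps the structure usable when a session file is truncated or a parent line was skipped during lenient parsing.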

Data Model

Claude Code JSONL Format

Each line in ~/.claude/projects/<encoded-path>/<session-uuid>.jsonl is a JSON object whose type field acts as the discriminator. There are five types: user, assistant, progress, system, and queue-operation.

  • Outer wrapper: camelCase fields (sessionId, parentUuid, isSidechain, gitBranch)
  • Inner message body: snake_case fields (stop_reason, tool_use_id, is_error)
  • Content: Either a plain string or array of typed content blocks
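The type discriminator described above can be sketched as a plain enum with a fallible lookup, so that unknown values surface as `None` and the caller can skip the line. `MessageKind` and `from_type_field` are hypothetical names; the crate's RawMessage variants also carry payloads, and the real implementation presumably derives this mapping with serde rather than matching strings by hand.

```rust
/// The five outer message types named in the format description.
/// Hypothetical enum: it models only the discriminator, not the message bodies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MessageKind {
    User,
    Assistant,
    Progress,
    System,
    QueueOperation,
}

impl MessageKind {
    /// Maps the value of the `type` field to a kind.
    /// Unknown values yield `None` so the caller can skip the line instead of failing.
    fn from_type_field(value: &str) -> Option<MessageKind> {
        match value {
            "user" => Some(MessageKind::User),
            "assistant" => Some(MessageKind::Assistant),
            "progress" => Some(MessageKind::Progress),
            "system" => Some(MessageKind::System),
            "queue-operation" => Some(MessageKind::QueueOperation),
            _ => None,
        }
    }
}
```

Matching is case-sensitive here, and the kebab-case spelling of queue-operation is taken directly from the list of types above.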

Content Blocks

Type         Fields
text         text
tool_use     id, name, input
tool_result  tool_use_id, content, is_error
thinking     thinking
image        source

Unknown content block types are silently dropped (lenient deserialization).
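The id and tool_use_id fields listed above are what make ID-based tool correlation possible, including for parallel calls whose results arrive out of order. The following is a minimal sketch of that pairing, with hypothetical types (`Block`, `Exchange`) and a hypothetical function `correlate`; it assumes a result always follows its call and simply ignores results with no matching call, which may differ from what `correlate_tools` does.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced content block: only the fields needed for pairing.
#[derive(Debug, Clone, PartialEq)]
enum Block {
    ToolUse { id: String, name: String },
    ToolResult { tool_use_id: String, is_error: bool },
    /// Stands in for text, thinking, image, and any other block type.
    Other,
}

/// One tool call and, if seen, the error flag of its result.
#[derive(Debug, PartialEq)]
struct Exchange {
    id: String,
    name: String,
    /// `None` means no result was found for this call.
    is_error: Option<bool>,
}

/// Pairs each tool_use with its tool_result by id, in call order.
fn correlate(blocks: &[Block]) -> Vec<Exchange> {
    let mut exchanges: Vec<Exchange> = Vec::new();
    let mut index_by_id: HashMap<&str, usize> = HashMap::new();
    for block in blocks {
        match block {
            Block::ToolUse { id, name } => {
                index_by_id.insert(id.as_str(), exchanges.len());
                exchanges.push(Exchange {
                    id: id.clone(),
                    name: name.clone(),
                    is_error: None,
                });
            }
            Block::ToolResult { tool_use_id, is_error } => {
                // Results for unknown ids are ignored in this sketch.
                if let Some(&slot) = index_by_id.get(tool_use_id.as_str()) {
                    exchanges[slot].is_error = Some(*is_error);
                }
            }
            Block::Other => {}
        }
    }
    exchanges
}
```

Looking results up by id rather than by position is what keeps the pairing correct when two calls are issued together and the second one finishes first.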

Testing

cargo test --package dirigent_anth

Tests use synthetic JSONL fixtures in tests/fixtures/:

  • minimal_session.jsonl — Basic session with all message types
  • streaming_dedup.jsonl — Streaming dedup scenario
  • tool_correlation.jsonl — Parallel and sequential tool calls
  • branching_tree.jsonl — Conversation with branches
  • noise_patterns.jsonl — All noise pattern types
  • subagent/ — Sub-agent session with parent and metadata

Error Handling

  • Individual unparseable JSONL lines are logged and skipped (lenient)
  • I/O errors and missing directories are propagated as AntError
  • Unknown message types are skipped via serde
  • Unknown content blocks are silently filtered

Related Packages

  • dirigent_archivist: Future consumer for session import
  • No current dependencies on other dirigent packages (standalone)

Future Enhancements

  • Bash command analysis module (shell usage analytics)
  • Archivist event transform/import
  • CLI tool with scan/analyze/import subcommands
  • SQLite caching layer
  • Watch mode for new session monitoring

Documentation

  • Package README: ./README.md - User-facing overview
  • API Docs: Run cargo doc --package dirigent_anth --open
  • Design Plan: docs/superpowers/plans/2026-03-23-dirigent-ant-design.md