dirigence/dirigent

g4borg b03dc15371 sync from monorepo @ 2452e92e

2026-05-08 01:59:04 +02:00


Dirigent Protocol Streaming Model

Overview

The Dirigent Protocol uses an ACP-style streaming model built around SessionUpdate events. Each event carries one incremental update (a chunk of message text, a chunk of agent reasoning, or the state of a tool call), so UIs can render an agent interaction as it happens while the content stays structured rather than flattened to plain text.

Version: 0.2.0

Core Concepts

SessionUpdate Events

All streaming content is delivered through SessionUpdate variants wrapped in Event::SessionUpdate:

pub enum Event {
    // ... other events
    SessionUpdate {
        session_id: String,
        update: SessionUpdate,
    },
}

The SessionUpdate enum contains five variants for different types of streaming updates:

pub enum SessionUpdate {
    UserMessageChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
    AgentMessageChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
    AgentThoughtChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
    ToolCall { message_id: String, tool_call: ToolCall, _meta: Option<Meta> },
    ToolCallUpdate { message_id: String, tool_call_id: String, tool_call: ToolCall, _meta: Option<Meta> },
}
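
Because every variant carries a message_id, a consumer can classify and route updates with a single match. The following is a minimal sketch, not the crate's API: the types are reduced stand-ins (no _meta, a two-field ToolCall), and kind and message_id are hypothetical helper names.

```rust
// Reduced stand-ins for the protocol types above: `_meta` is omitted and
// `ToolCall` keeps only an id and a status string (both are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text { text: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub status: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum SessionUpdate {
    UserMessageChunk { message_id: String, content: ContentBlock },
    AgentMessageChunk { message_id: String, content: ContentBlock },
    AgentThoughtChunk { message_id: String, content: ContentBlock },
    ToolCall { message_id: String, tool_call: ToolCall },
    ToolCallUpdate { message_id: String, tool_call_id: String, tool_call: ToolCall },
}

/// Returns a short label for the kind of update (hypothetical helper).
pub fn kind(update: &SessionUpdate) -> &'static str {
    match update {
        SessionUpdate::UserMessageChunk { .. } => "user",
        SessionUpdate::AgentMessageChunk { .. } => "agent",
        SessionUpdate::AgentThoughtChunk { .. } => "thought",
        SessionUpdate::ToolCall { .. } => "tool_call",
        SessionUpdate::ToolCallUpdate { .. } => "tool_call_update",
    }
}

/// Extracts the message_id, which all five variants share.
pub fn message_id(update: &SessionUpdate) -> &str {
    match update {
        SessionUpdate::UserMessageChunk { message_id, .. }
        | SessionUpdate::AgentMessageChunk { message_id, .. }
        | SessionUpdate::AgentThoughtChunk { message_id, .. }
        | SessionUpdate::ToolCall { message_id, .. }
        | SessionUpdate::ToolCallUpdate { message_id, .. } => message_id.as_str(),
    }
}
```

The exhaustive match means the compiler flags every routing site if a variant is added later.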

ContentBlock Types

Content is represented using structured ContentBlock variants:

pub enum ContentBlock {
    Text { text: String },
    ResourceLink {
        uri: String,
        name: Option<String>,
        mime_type: Option<String>,
    },
    // Future: Resource, Image, Audio (marked as out-of-scope for phase 1)
}

Key Points:

Text is the primary content type for all textual output
ResourceLink represents file references without embedding full content
Future expansions will include embedded resources, images, and audio
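
To illustrate how a consumer might turn the two current variants into display text, here is a small sketch. The render function and its link-style output format are assumptions made for illustration; the protocol does not prescribe a rendering.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text { text: String },
    ResourceLink {
        uri: String,
        name: Option<String>,
        mime_type: Option<String>,
    },
}

/// Renders a block as one line of display text (hypothetical helper).
pub fn render(block: &ContentBlock) -> String {
    match block {
        ContentBlock::Text { text } => text.clone(),
        ContentBlock::ResourceLink { uri, name, mime_type } => {
            // Fall back to the URI when no display name was supplied.
            let label = name.as_deref().unwrap_or(uri.as_str());
            match mime_type {
                Some(mime) => format!("[{}]({}) ({})", label, uri, mime),
                None => format!("[{}]({})", label, uri),
            }
        }
    }
}
```

A ResourceLink is rendered from its metadata alone, which matches the point above that the link refers to a file without embedding its content.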

Provider Metadata (_meta)

All SessionUpdate variants support optional _meta fields for provider-specific information:

pub struct Meta {
    pub provider: Option<ProviderMeta>,
    pub extra: HashMap<String, Value>,  // Arbitrary additional fields
}

pub struct ProviderMeta {
    pub name: String,                                       // e.g., "opencode", "anthropic"
    pub original_ids: Option<HashMap<String, String>>,      // Original provider IDs
    pub raw_excerpt: Option<Value>,                         // Minimal raw payload for debugging
}

Usage:

Adapters populate _meta to preserve provider-specific information
Consumers can use this for debugging, telemetry, or provider-specific features
The extra map allows arbitrary fields for forward compatibility
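
Since every layer of _meta is optional, reading a value out of it means walking several Option fields. The sketch below shows one way to do that for original_ids. It assumes reduced versions of Meta and ProviderMeta (extra and raw_excerpt are left out to avoid depending on a JSON Value type), and original_id is a hypothetical helper name.

```rust
use std::collections::HashMap;

// Reduced stand-ins for the structures above; `extra` and `raw_excerpt`
// are omitted so the sketch needs no JSON `Value` type.
pub struct ProviderMeta {
    pub name: String,
    pub original_ids: Option<HashMap<String, String>>,
}

pub struct Meta {
    pub provider: Option<ProviderMeta>,
}

/// Looks up one of the provider's original IDs by key, e.g. "message_id".
/// Returns `None` if any optional layer of the metadata is absent.
pub fn original_id<'a>(meta: Option<&'a Meta>, key: &str) -> Option<&'a str> {
    meta?
        .provider
        .as_ref()?
        .original_ids
        .as_ref()?
        .get(key)
        .map(String::as_str)
}
```

Returning Option rather than panicking keeps consumers working with adapters that send no _meta at all.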

SessionUpdate Variants

1. UserMessageChunk

Represents streaming chunks of user message content.

SessionUpdate::UserMessageChunk {
    message_id: "msg_abc123".to_string(),
    content: ContentBlock::Text {
        text: "What's the capital of France?".to_string(),
    },
    _meta: None,
}

When to use:

Streaming user input being typed
Echo of user input from server
Multi-part user messages being assembled

2. AgentMessageChunk

Represents streaming chunks of agent response content.

SessionUpdate::AgentMessageChunk {
    message_id: "msg_def456".to_string(),
    content: ContentBlock::Text {
        text: "The capital of France is ".to_string(),
    },
    _meta: Some(Meta {
        provider: Some(ProviderMeta {
            name: "opencode".to_string(),
            original_ids: Some(HashMap::from([
                ("message_id".to_string(), "original_123".to_string()),
            ])),
            raw_excerpt: None,
        }),
        extra: HashMap::new(),
    }),
}

When to use:

Agent's response text being generated
Final answer content
Any visible agent output

Key distinction from AgentThoughtChunk:

AgentMessageChunk: Visible output intended for the user
AgentThoughtChunk: Internal reasoning, typically hidden or collapsible

3. AgentThoughtChunk

Represents streaming chunks of agent internal reasoning (thinking, planning).

SessionUpdate::AgentThoughtChunk {
    message_id: "msg_ghi789".to_string(),
    content: ContentBlock::Text {
        text: "I need to look up Paris in my knowledge base...".to_string(),
    },
    _meta: None,
}

When to use:

Agent's internal reasoning process
"Chain of thought" content
Planning or decision-making process
Content typically displayed in collapsible sections

UI Conventions:

Often hidden by default or shown in a separate "Thinking" section
May be styled differently (e.g., italics, muted colors)
Can be collapsed to save screen space

4. ToolCall

Represents the initiation or current state of a tool call.

SessionUpdate::ToolCall {
    message_id: "msg_jkl012".to_string(),
    tool_call: ToolCall {
        id: "call_xyz789".to_string(),
        tool_name: "read_file".to_string(),
        status: ToolCallStatus::Pending,
        content: vec![],
        raw_input: Some(json!({
            "file_path": "/path/to/file.txt"
        })),
        raw_output: None,
        title: Some("Read file.txt".to_string()),
        error: None,
        metadata: None,
    },
    _meta: None,
}

When to use:

Tool call is first initiated
Sending a snapshot of current tool state
Re-sending full tool state after reconnection

ToolCallStatus Lifecycle:

Pending: tool call created but not yet executing
Running: tool call actively executing
Completed: tool call finished successfully
Error: tool call failed
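
The lifecycle can be expressed as a small transition check. This is a sketch under assumptions: the enum mirrors the four statuses listed above, but is_terminal and can_transition_to are hypothetical helpers, and allowing Pending to move directly to Error (for a call that fails before it starts) is a guess rather than documented behavior.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCallStatus {
    Pending,
    Running,
    Completed,
    Error,
}

impl ToolCallStatus {
    /// Completed and Error end the lifecycle.
    pub fn is_terminal(self) -> bool {
        matches!(self, ToolCallStatus::Completed | ToolCallStatus::Error)
    }

    /// Whether moving from `self` to `next` follows the documented order.
    pub fn can_transition_to(self, next: ToolCallStatus) -> bool {
        match (self, next) {
            // An update may add content without changing the status.
            (Self::Pending, Self::Pending) | (Self::Running, Self::Running) => true,
            // Assumption: a call may fail before it ever starts running.
            (Self::Pending, Self::Running) | (Self::Pending, Self::Error) => true,
            (Self::Running, Self::Completed) | (Self::Running, Self::Error) => true,
            // Terminal states never change; nothing moves backwards.
            _ => false,
        }
    }
}
```

A consumer could use such a check to log unexpected sequences while still applying the update, since the latest full state is authoritative.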

5. ToolCallUpdate

Represents an update to an existing tool call (status change, new content).

SessionUpdate::ToolCallUpdate {
    message_id: "msg_jkl012".to_string(),
    tool_call_id: "call_xyz789".to_string(),
    tool_call: ToolCall {
        id: "call_xyz789".to_string(),
        tool_name: "read_file".to_string(),
        status: ToolCallStatus::Running,
        content: vec![
            ContentBlock::Text {
                text: "Reading file...".to_string(),
            },
        ],
        raw_input: Some(json!({
            "file_path": "/path/to/file.txt"
        })),
        raw_output: None,
        title: Some("Read file.txt".to_string()),
        error: None,
        metadata: Some(json!({
            "bytes_read": 1024
        })),
    },
    _meta: None,
}

When to use:

Tool status changes (Pending → Running → Completed/Error)
New output content available
Progress updates
Error state reached

Note: The full ToolCall is sent each time, not a delta. Consumers should replace the previous tool call state with the new one.

Tool Call Lifecycle

Understanding the tool call lifecycle is essential for proper UI implementation:

// Abbreviated for readability: message_id, _meta, and the fields marked
// `// ...` are omitted from each update.

// 1. Tool call initiated
SessionUpdate::ToolCall {
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Pending,
        content: vec![],
        // ...
    },
    // ...
}

// 2. Tool starts executing
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Running,
        content: vec![],
        // ...
    },
    // ...
}

// 3. Tool produces output
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Running,
        content: vec![
            ContentBlock::Text { text: "Output line 1".to_string() },
        ],
        // ...
    },
    // ...
}

// 4a. Tool completes successfully
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Completed,
        content: vec![
            ContentBlock::Text { text: "Output line 1".to_string() },
            ContentBlock::Text { text: "Done!".to_string() },
        ],
        raw_output: Some(json!({"success": true})),
        // ...
    },
    // ...
}

// 4b. Or tool fails with error
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Error,
        content: vec![
            ContentBlock::Text { text: "Error output".to_string() },
        ],
        error: Some("File not found".to_string()),
        // ...
    },
    // ...
}

UI Implementation Guidelines:

Track tool calls by id in a HashMap
On ToolCall: create new entry
On ToolCallUpdate: replace existing entry (not delta)
Display status with appropriate visual indicators
Show content blocks as streaming output
Display error when status is Error
Use title for tool call heading if available
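
The first three guidelines can be sketched as a tracker keyed by tool call id. The types here are reduced stand-ins (content is a Vec<String> in place of Vec<ContentBlock>), and ToolCallTracker with its apply and get methods is a hypothetical name, not part of the protocol crate.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum ToolCallStatus {
    Pending,
    Running,
    Completed,
    Error,
}

// Reduced ToolCall holding only the fields this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub status: ToolCallStatus,
    pub content: Vec<String>, // stand-in for Vec<ContentBlock>
    pub error: Option<String>,
}

#[derive(Default)]
pub struct ToolCallTracker {
    calls: HashMap<String, ToolCall>,
}

impl ToolCallTracker {
    /// Handles the payload of both ToolCall and ToolCallUpdate. Each carries
    /// the full state, so inserting under the id creates the entry the first
    /// time and overwrites it afterwards. Nothing is merged.
    pub fn apply(&mut self, tool_call: ToolCall) {
        self.calls.insert(tool_call.id.clone(), tool_call);
    }

    /// Returns the latest known state for a tool call, if any.
    pub fn get(&self, id: &str) -> Option<&ToolCall> {
        self.calls.get(id)
    }
}
```

Using one insert path for both event kinds also covers a full snapshot that arrives after reconnection for a call the UI has not seen before.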

Typical Message Flow

Here's a complete example showing a typical agent interaction:

// Abbreviated for readability: fields marked `// ...` are omitted.

// 1. User sends message
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::UserMessageChunk {
        message_id: "msg_user_1".to_string(),
        content: ContentBlock::Text {
            text: "Read and summarize config.toml".to_string(),
        },
        _meta: None,
    },
}

// 2. Agent starts thinking
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::AgentThoughtChunk {
        message_id: "msg_agent_1".to_string(),
        content: ContentBlock::Text {
            text: "I need to read the file first...".to_string(),
        },
        _meta: None,
    },
}

// 3. Agent initiates tool call
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::ToolCall {
        message_id: "msg_agent_1".to_string(),
        tool_call: ToolCall {
            id: "call_read_1".to_string(),
            tool_name: "read_file".to_string(),
            status: ToolCallStatus::Pending,
            content: vec![],
            raw_input: Some(json!({"path": "config.toml"})),
            title: Some("Read config.toml".to_string()),
            // ...
        },
        _meta: None,
    },
}

// 4. Tool starts executing
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::ToolCallUpdate {
        message_id: "msg_agent_1".to_string(),
        tool_call_id: "call_read_1".to_string(),
        tool_call: ToolCall {
            id: "call_read_1".to_string(),
            status: ToolCallStatus::Running,
            // ...
        },
        _meta: None,
    },
}

// 5. Tool completes
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::ToolCallUpdate {
        message_id: "msg_agent_1".to_string(),
        tool_call_id: "call_read_1".to_string(),
        tool_call: ToolCall {
            id: "call_read_1".to_string(),
            status: ToolCallStatus::Completed,
            content: vec![
                ContentBlock::Text {
                    text: "[port = 3000\n...]".to_string(),
                },
            ],
            raw_output: Some(json!({"bytes_read": 1024})),
            // ...
        },
        _meta: None,
    },
}

// 6. Agent responds with summary
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::AgentMessageChunk {
        message_id: "msg_agent_1".to_string(),
        content: ContentBlock::Text {
            text: "The config file sets the server port to 3000".to_string(),
        },
        _meta: None,
    },
}

// 7. More response chunks...
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::AgentMessageChunk {
        message_id: "msg_agent_1".to_string(),
        content: ContentBlock::Text {
            text: " and enables debug mode.".to_string(),
        },
        _meta: None,
    },
}
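
On the consumer side, the text chunks in a flow like this one are appended per message_id, with thoughts kept apart from visible output. The following is a minimal sketch of that accumulation. Chunk is a reduced, hypothetical stand-in for the three text-carrying SessionUpdate variants (tool calls and _meta are left out), and MessageView and assemble are illustrative names.

```rust
use std::collections::HashMap;

// Reduced update type: text chunks only, with `_meta` and tool calls omitted.
pub enum Chunk {
    User { message_id: String, text: String },
    Agent { message_id: String, text: String },
    Thought { message_id: String, text: String },
}

#[derive(Debug, Default, PartialEq)]
pub struct MessageView {
    pub visible: String,  // user or agent text shown in the conversation
    pub thoughts: String, // reasoning, kept apart so the UI can collapse it
}

/// Appends each chunk to the view for its message_id, in arrival order.
pub fn assemble(chunks: &[Chunk]) -> HashMap<String, MessageView> {
    let mut views: HashMap<String, MessageView> = HashMap::new();
    for chunk in chunks {
        match chunk {
            Chunk::User { message_id, text } | Chunk::Agent { message_id, text } => {
                views.entry(message_id.clone()).or_default().visible.push_str(text);
            }
            Chunk::Thought { message_id, text } => {
                views.entry(message_id.clone()).or_default().thoughts.push_str(text);
            }
        }
    }
    views
}
```

Fed the text chunks from steps 1, 2, 6, and 7, the view for msg_agent_1 ends up with the thought in thoughts and the two response chunks joined into one sentence in visible.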

Content vs MessagePart

Important Distinction:

ContentBlock: Streaming content representation (used in SessionUpdate)
- Designed for real-time rendering
- Granular updates
- MCP-compatible structure
MessagePart: Completed message content (legacy, still supported)
- Used in stored/completed messages
- May include additional fields for history
- Compatibility with existing code

Migration Path: The protocol supports both models. New code should prefer SessionUpdate with ContentBlock for streaming, while MessagePart remains available for compatibility and completed message storage.

Best Practices

For Consumers

Track by message_id: Group all chunks/updates for the same message
Handle tool calls separately: Maintain a HashMap of tool calls by tool_call_id
Replace, don't merge: ToolCallUpdate sends complete state, not deltas
Use _meta for debugging: Provider metadata helps with troubleshooting
Distinguish thoughts from messages: Render AgentThoughtChunk differently

For Adapters

Always include message_id: Every update must reference its message
Preserve provider info in _meta: Store original IDs for debugging
Send complete tool state: Include all tool call fields in updates
Use appropriate chunk types: User/Agent/Thought for correct semantics
Keep _meta minimal: Avoid large raw payloads in production
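
As an illustration of these adapter rules, the sketch below maps a provider event to the appropriate chunk type and records the provider's own ID in _meta. Everything on the provider side is invented for the example: ProviderEvent and its fields are hypothetical, the protocol types are reduced (no extra or raw_excerpt), and adapt is not a function of the real crate.

```rust
use std::collections::HashMap;

// Hypothetical provider-side streaming event; real provider payloads differ.
pub struct ProviderEvent {
    pub id: String,         // the provider's own message ID
    pub is_reasoning: bool, // true for thinking output
    pub delta: String,      // the new text fragment
}

// Reduced protocol types: `extra` and `raw_excerpt` are left out.
#[derive(Debug, PartialEq)]
pub struct ProviderMeta {
    pub name: String,
    pub original_ids: Option<HashMap<String, String>>,
}

#[derive(Debug, PartialEq)]
pub struct Meta {
    pub provider: Option<ProviderMeta>,
}

#[derive(Debug, PartialEq)]
pub enum ContentBlock {
    Text { text: String },
}

#[derive(Debug, PartialEq)]
pub enum SessionUpdate {
    AgentMessageChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
    AgentThoughtChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
}

/// Maps one provider event to a SessionUpdate under the given protocol
/// message_id, preserving the provider's ID and no raw payload.
pub fn adapt(provider_name: &str, message_id: &str, event: &ProviderEvent) -> SessionUpdate {
    let meta = Meta {
        provider: Some(ProviderMeta {
            name: provider_name.to_string(),
            original_ids: Some(HashMap::from([(
                "message_id".to_string(),
                event.id.clone(),
            )])),
        }),
    };
    let content = ContentBlock::Text { text: event.delta.clone() };
    let message_id = message_id.to_string();
    // Choose the chunk type by semantics: reasoning becomes a thought chunk.
    if event.is_reasoning {
        SessionUpdate::AgentThoughtChunk { message_id, content, _meta: Some(meta) }
    } else {
        SessionUpdate::AgentMessageChunk { message_id, content, _meta: Some(meta) }
    }
}
```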

For UI Developers

Stream incrementally: Append chunks as they arrive
Show tool status visually: Use icons/colors for Pending/Running/Completed/Error
Make thoughts collapsible: Don't clutter the main conversation
Handle reconnection: Be prepared to receive full state snapshots
Display errors prominently: Show tool errors clearly to users

Future Extensions

The following features are planned but not yet implemented:

ResourceBlock: Embedded resource content (text/blob)
Image/Audio blocks: Rich media content
Plan updates: Agent planning and mode switching
Permissions: Request/reply for user permissions
Stop reasons: Detailed completion reasons

See the protocol roadmap for timeline and details.
