# Dirigent Protocol Streaming Model
## Overview
The Dirigent Protocol uses an **ACP-style streaming model** built around `SessionUpdate` events. This model provides granular, real-time updates during agent interactions, enabling responsive UIs and structured content representation.
Version: 0.2.0
## Core Concepts
### SessionUpdate Events
All streaming content is delivered through `SessionUpdate` variants wrapped in `Event::SessionUpdate`:
```rust
pub enum Event {
    // ... other events
    SessionUpdate {
        session_id: String,
        update: SessionUpdate,
    },
}
```
The `SessionUpdate` enum contains five variants for different types of streaming updates:
```rust
pub enum SessionUpdate {
    UserMessageChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
    AgentMessageChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
    AgentThoughtChunk { message_id: String, content: ContentBlock, _meta: Option<Meta> },
    ToolCall { message_id: String, tool_call: ToolCall, _meta: Option<Meta> },
    ToolCallUpdate { message_id: String, tool_call_id: String, tool_call: ToolCall, _meta: Option<Meta> },
}
```
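Because every variant carries a `message_id`, a consumer can group updates without inspecting the payload. The following is a minimal sketch of such an accessor; the enum is a trimmed mirror of the one above (payload fields omitted), and the `message_id()` method is a hypothetical helper, not something the protocol crate is known to provide.

```rust
/// Trimmed mirror of `SessionUpdate`: payload and `_meta` fields are omitted.
pub enum SessionUpdate {
    UserMessageChunk { message_id: String },
    AgentMessageChunk { message_id: String },
    AgentThoughtChunk { message_id: String },
    ToolCall { message_id: String },
    ToolCallUpdate { message_id: String, tool_call_id: String },
}

impl SessionUpdate {
    /// Hypothetical accessor: the message this update belongs to.
    pub fn message_id(&self) -> &str {
        match self {
            SessionUpdate::UserMessageChunk { message_id }
            | SessionUpdate::AgentMessageChunk { message_id }
            | SessionUpdate::AgentThoughtChunk { message_id }
            | SessionUpdate::ToolCall { message_id }
            | SessionUpdate::ToolCallUpdate { message_id, .. } => message_id.as_str(),
        }
    }
}
```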
### ContentBlock Types
Content is represented using structured `ContentBlock` variants:
```rust
pub enum ContentBlock {
    Text { text: String },
    ResourceLink {
        uri: String,
        name: Option<String>,
        mime_type: Option<String>,
    },
    // Future: Resource, Image, Audio (marked as out-of-scope for phase 1)
}
```
**Key Points:**
- `Text` is the primary content type for all textual output
- `ResourceLink` represents file references without embedding full content
- Future expansions will include embedded resources, images, and audio
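As an illustration of how the two phase 1 variants might be consumed, here is a small sketch that renders a block as a single line of text. `render_plain` and its Markdown-style link output are assumptions made for this example; the protocol does not prescribe a rendering.

```rust
/// Mirror of the `ContentBlock` enum shown above (phase 1 variants only).
#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text { text: String },
    ResourceLink {
        uri: String,
        name: Option<String>,
        mime_type: Option<String>,
    },
}

/// Hypothetical helper: produce a plain-text rendering of one block.
pub fn render_plain(block: &ContentBlock) -> String {
    match block {
        ContentBlock::Text { text } => text.clone(),
        // Prefer the human-readable name and fall back to the URI itself.
        ContentBlock::ResourceLink { uri, name, .. } => {
            let label = name.as_deref().unwrap_or(uri.as_str());
            format!("[{}]({})", label, uri)
        }
    }
}
```

A `ResourceLink` without a `name` renders with the URI as its label, so the reference stays visible even when the adapter supplies no display name.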
### Provider Metadata (_meta)
All `SessionUpdate` variants support optional `_meta` fields for provider-specific information:
```rust
pub struct Meta {
    pub provider: Option<ProviderMeta>,
    pub extra: HashMap<String, Value>, // Arbitrary additional fields
}

pub struct ProviderMeta {
    pub name: String,                                  // e.g., "opencode", "anthropic"
    pub original_ids: Option<HashMap<String, String>>, // Original provider IDs
    pub raw_excerpt: Option<Value>,                    // Minimal raw payload for debugging
}
```
**Usage:**
- Adapters populate `_meta` to preserve provider-specific information
- Consumers can use this for debugging, telemetry, or provider-specific features
- The `extra` map allows arbitrary fields for forward compatibility
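On the consumer side, reading `_meta` mostly means walking a chain of optional fields. The sketch below shows one way to do that for telemetry or debugging. It assumes the struct layout shown above, replaces the JSON `Value` type with `String` so it needs only the standard library, and introduces `original_id` as a hypothetical helper.

```rust
use std::collections::HashMap;

// Stand-in for a JSON value type, so the sketch has no external dependencies.
type Value = String;

#[derive(Debug, Clone, Default)]
pub struct ProviderMeta {
    pub name: String,
    pub original_ids: Option<HashMap<String, String>>,
    pub raw_excerpt: Option<Value>,
}

#[derive(Debug, Clone, Default)]
pub struct Meta {
    pub provider: Option<ProviderMeta>,
    pub extra: HashMap<String, Value>,
}

/// Hypothetical helper: the provider's original ID stored under `key`, if any.
/// Every level is optional, so each step short-circuits to `None`.
pub fn original_id<'a>(meta: Option<&'a Meta>, key: &str) -> Option<&'a str> {
    meta?
        .provider
        .as_ref()?
        .original_ids
        .as_ref()?
        .get(key)
        .map(String::as_str)
}
```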
## SessionUpdate Variants
### 1. UserMessageChunk
Represents streaming chunks of user message content.
```rust
SessionUpdate::UserMessageChunk {
    message_id: "msg_abc123".to_string(),
    content: ContentBlock::Text {
        text: "What's the capital of France?".to_string(),
    },
    _meta: None,
}
```
**When to use:**
- Streaming user input being typed
- Echo of user input from server
- Multi-part user messages being assembled
### 2. AgentMessageChunk
Represents streaming chunks of agent response content.
```rust
SessionUpdate::AgentMessageChunk {
    message_id: "msg_def456".to_string(),
    content: ContentBlock::Text {
        text: "The capital of France is ".to_string(),
    },
    _meta: Some(Meta {
        provider: Some(ProviderMeta {
            name: "opencode".to_string(),
            original_ids: Some(HashMap::from([
                ("message_id".to_string(), "original_123".to_string()),
            ])),
            raw_excerpt: None,
        }),
        extra: HashMap::new(),
    }),
}
```
**When to use:**
- Agent's response text being generated
- Final answer content
- Any visible agent output
**Key distinction from AgentThoughtChunk:**
- `AgentMessageChunk`: Visible output intended for the user
- `AgentThoughtChunk`: Internal reasoning, typically hidden or collapsible
### 3. AgentThoughtChunk
Represents streaming chunks of agent internal reasoning (thinking, planning).
```rust
SessionUpdate::AgentThoughtChunk {
    message_id: "msg_ghi789".to_string(),
    content: ContentBlock::Text {
        text: "I need to look up Paris in my knowledge base...".to_string(),
    },
    _meta: None,
}
```
**When to use:**
- Agent's internal reasoning process
- "Chain of thought" content
- Planning or decision-making process
- Content typically displayed in collapsible sections
**UI Conventions:**
- Often hidden by default or shown in a separate "Thinking" section
- May be styled differently (e.g., italics, muted colors)
- Can be collapsed to save screen space
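One way to honor the message/thought distinction in a UI is to route the two chunk types into separate buffers. The sketch below assumes a trimmed `SessionUpdate` (content reduced to plain text, `_meta` dropped) and a hypothetical `MessageView` view model; neither is part of the protocol.

```rust
/// Trimmed mirror of the two agent chunk variants.
pub enum SessionUpdate {
    AgentMessageChunk { message_id: String, text: String },
    AgentThoughtChunk { message_id: String, text: String },
}

/// Hypothetical view model: visible answer text plus a collapsible thoughts pane.
#[derive(Debug)]
pub struct MessageView {
    pub visible: String,
    pub thoughts: String,
    pub thoughts_collapsed: bool,
}

impl MessageView {
    pub fn new() -> Self {
        MessageView {
            visible: String::new(),
            thoughts: String::new(),
            // Thoughts start collapsed, following the "hidden by default" convention.
            thoughts_collapsed: true,
        }
    }

    /// Append a chunk to the buffer that matches its variant.
    pub fn apply(&mut self, update: &SessionUpdate) {
        match update {
            SessionUpdate::AgentMessageChunk { text, .. } => self.visible.push_str(text),
            SessionUpdate::AgentThoughtChunk { text, .. } => self.thoughts.push_str(text),
        }
    }
}
```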
### 4. ToolCall
Represents the initiation or current state of a tool call.
```rust
SessionUpdate::ToolCall {
    message_id: "msg_jkl012".to_string(),
    tool_call: ToolCall {
        id: "call_xyz789".to_string(),
        tool_name: "read_file".to_string(),
        status: ToolCallStatus::Pending,
        content: vec![],
        raw_input: Some(json!({
            "file_path": "/path/to/file.txt"
        })),
        raw_output: None,
        title: Some("Read file.txt".to_string()),
        error: None,
        metadata: None,
    },
    _meta: None,
}
```
**When to use:**
- Tool call is first initiated
- Sending a snapshot of current tool state
- Re-sending full tool state after reconnection
**ToolCallStatus Lifecycle:**
- `Pending` → Tool call created but not yet executing
- `Running` → Tool call actively executing
- `Completed` → Tool call finished successfully
- `Error` → Tool call failed
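The lifecycle above can be read as a small state machine. The following sketch encodes only the arrows listed here, plus same-state repeats for progress updates. Whether consumers should enforce these transitions is an assumption of this example, not something the protocol states, and `is_terminal` and `can_transition_to` are hypothetical names.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCallStatus {
    Pending,
    Running,
    Completed,
    Error,
}

impl ToolCallStatus {
    /// `Completed` and `Error` end the lifecycle.
    pub fn is_terminal(self) -> bool {
        matches!(self, ToolCallStatus::Completed | ToolCallStatus::Error)
    }

    /// Assumed rule set: Pending -> Running -> Completed/Error, and a
    /// non-terminal status may repeat (e.g. Running -> Running with new output).
    pub fn can_transition_to(self, next: ToolCallStatus) -> bool {
        use ToolCallStatus as S;
        match (self, next) {
            (S::Pending, S::Pending) | (S::Running, S::Running) => true,
            (S::Pending, S::Running) => true,
            (S::Running, S::Completed) | (S::Running, S::Error) => true,
            _ => false,
        }
    }
}
```

A stricter or looser policy (for example, allowing `Pending` to jump straight to `Error`) is equally plausible; the point is only that terminal states accept no further transitions.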
### 5. ToolCallUpdate
Represents an update to an existing tool call (status change, new content).
```rust
SessionUpdate::ToolCallUpdate {
    message_id: "msg_jkl012".to_string(),
    tool_call_id: "call_xyz789".to_string(),
    tool_call: ToolCall {
        id: "call_xyz789".to_string(),
        tool_name: "read_file".to_string(),
        status: ToolCallStatus::Running,
        content: vec![
            ContentBlock::Text {
                text: "Reading file...".to_string(),
            },
        ],
        raw_input: Some(json!({
            "file_path": "/path/to/file.txt"
        })),
        raw_output: None,
        title: Some("Read file.txt".to_string()),
        error: None,
        metadata: Some(json!({
            "bytes_read": 1024
        })),
    },
    _meta: None,
}
```
**When to use:**
- Tool status changes (Pending → Running → Completed/Error)
- New output content available
- Progress updates
- Error state reached
**Note:** The full `ToolCall` is sent each time, not a delta. Consumers should replace the previous tool call state with the new one.
## Tool Call Lifecycle
Understanding the tool call lifecycle is essential for proper UI implementation:
```rust
// 1. Tool call initiated
SessionUpdate::ToolCall {
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Pending,
        content: vec![],
        // ...
    }
}

// 2. Tool starts executing
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Running,
        content: vec![],
        // ...
    }
}

// 3. Tool produces output
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Running,
        content: vec![
            ContentBlock::Text { text: "Output line 1".to_string() },
        ],
        // ...
    }
}

// 4a. Tool completes successfully
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Completed,
        content: vec![
            ContentBlock::Text { text: "Output line 1".to_string() },
            ContentBlock::Text { text: "Done!".to_string() },
        ],
        raw_output: Some(json!({"success": true})),
        // ...
    }
}

// 4b. Or tool fails with error
SessionUpdate::ToolCallUpdate {
    tool_call_id: "call_123".to_string(),
    tool_call: ToolCall {
        id: "call_123".to_string(),
        status: ToolCallStatus::Error,
        content: vec![
            ContentBlock::Text { text: "Error output".to_string() },
        ],
        error: Some("File not found".to_string()),
        // ...
    }
}
```
**UI Implementation Guidelines:**
- Track tool calls by `id` in a HashMap
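- On `ToolCall`: create a new entry, or replace the existing one if the `id` is already known (for example, a snapshot re-sent after reconnection)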
- On `ToolCallUpdate`: replace existing entry (not delta)
- Display status with appropriate visual indicators
- Show `content` blocks as streaming output
- Display `error` when status is `Error`
- Use `title` for tool call heading if available
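The guidelines above boil down to "keep the latest full state per tool call". A minimal sketch of such a tracker follows. It uses a trimmed `ToolCall` (content as plain strings, raw fields omitted), and `ToolCallTracker`, `upsert`, and `heading` are hypothetical names chosen for this example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCallStatus {
    Pending,
    Running,
    Completed,
    Error,
}

/// Trimmed `ToolCall`: only the fields the UI guidelines mention.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub id: String,
    pub status: ToolCallStatus,
    pub content: Vec<String>,
    pub title: Option<String>,
    pub error: Option<String>,
}

/// Hypothetical consumer-side store of tool calls, keyed by `id`.
#[derive(Default)]
pub struct ToolCallTracker {
    calls: HashMap<String, ToolCall>,
}

impl ToolCallTracker {
    /// Handles both `ToolCall` and `ToolCallUpdate`: the incoming value is the
    /// complete state, so it replaces whatever was stored before (no merging).
    pub fn upsert(&mut self, tool_call: ToolCall) {
        self.calls.insert(tool_call.id.clone(), tool_call);
    }

    pub fn get(&self, id: &str) -> Option<&ToolCall> {
        self.calls.get(id)
    }

    /// Heading for the UI: the `title` if present, otherwise the call `id`.
    pub fn heading(&self, id: &str) -> Option<&str> {
        self.calls
            .get(id)
            .map(|call| call.title.as_deref().unwrap_or(call.id.as_str()))
    }
}
```

Using a single `upsert` for both event types also covers the reconnection case, where a `ToolCall` snapshot arrives for an `id` that is already tracked.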
## Typical Message Flow
Here's a complete example showing a typical agent interaction:
```rust
// 1. User sends message
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::UserMessageChunk {
        message_id: "msg_user_1".to_string(),
        content: ContentBlock::Text {
            text: "Read and summarize config.toml".to_string(),
        },
        _meta: None,
    },
}

// 2. Agent starts thinking
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::AgentThoughtChunk {
        message_id: "msg_agent_1".to_string(),
        content: ContentBlock::Text {
            text: "I need to read the file first...".to_string(),
        },
        _meta: None,
    },
}

// 3. Agent initiates tool call
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::ToolCall {
        message_id: "msg_agent_1".to_string(),
        tool_call: ToolCall {
            id: "call_read_1".to_string(),
            tool_name: "read_file".to_string(),
            status: ToolCallStatus::Pending,
            content: vec![],
            raw_input: Some(json!({"path": "config.toml"})),
            title: Some("Read config.toml".to_string()),
            // ...
        },
        _meta: None,
    },
}

// 4. Tool starts executing
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::ToolCallUpdate {
        message_id: "msg_agent_1".to_string(),
        tool_call_id: "call_read_1".to_string(),
        tool_call: ToolCall {
            id: "call_read_1".to_string(),
            status: ToolCallStatus::Running,
            // ...
        },
        _meta: None,
    },
}

// 5. Tool completes
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::ToolCallUpdate {
        message_id: "msg_agent_1".to_string(),
        tool_call_id: "call_read_1".to_string(),
        tool_call: ToolCall {
            id: "call_read_1".to_string(),
            status: ToolCallStatus::Completed,
            content: vec![
                ContentBlock::Text {
                    text: "[port = 3000\n...]".to_string(),
                },
            ],
            raw_output: Some(json!({"bytes_read": 1024})),
            // ...
        },
        _meta: None,
    },
}

// 6. Agent responds with summary
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::AgentMessageChunk {
        message_id: "msg_agent_1".to_string(),
        content: ContentBlock::Text {
            text: "The config file sets the server port to 3000".to_string(),
        },
        _meta: None,
    },
}

// 7. More response chunks...
Event::SessionUpdate {
    session_id: "session_123".to_string(),
    update: SessionUpdate::AgentMessageChunk {
        message_id: "msg_agent_1".to_string(),
        content: ContentBlock::Text {
            text: " and enables debug mode.".to_string(),
        },
        _meta: None,
    },
}
```
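To show how a consumer might fold this flow into displayable messages, here is a sketch that concatenates text chunks per `message_id` while preserving first-seen order. The `Update` enum is a trimmed stand-in for `SessionUpdate` (text only, no tool calls or `_meta`), and `Transcript` is a hypothetical type; skipping thought chunks is a choice made for this example only.

```rust
use std::collections::HashMap;

/// Trimmed stand-in for `SessionUpdate`: just the text-bearing variants.
pub enum Update {
    UserMessageChunk { message_id: String, text: String },
    AgentMessageChunk { message_id: String, text: String },
    AgentThoughtChunk { message_id: String, text: String },
}

/// Hypothetical accumulator of visible text, grouped by message.
#[derive(Default)]
pub struct Transcript {
    order: Vec<String>,               // message ids in first-seen order
    visible: HashMap<String, String>, // message id -> accumulated text
}

impl Transcript {
    pub fn apply(&mut self, update: Update) {
        let (id, text) = match update {
            Update::UserMessageChunk { message_id, text }
            | Update::AgentMessageChunk { message_id, text } => (message_id, text),
            // Thoughts are left out of the visible transcript in this sketch.
            Update::AgentThoughtChunk { .. } => return,
        };
        if !self.visible.contains_key(&id) {
            self.order.push(id.clone());
        }
        // Chunks are appended verbatim; the sender controls spacing.
        self.visible.entry(id).or_default().push_str(&text);
    }

    /// `(message_id, text)` pairs in the order the messages first appeared.
    pub fn messages(&self) -> Vec<(&str, &str)> {
        self.order
            .iter()
            .map(|id| (id.as_str(), self.visible[id].as_str()))
            .collect()
    }
}
```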
## Content vs MessagePart
**Important Distinction:**
- **ContentBlock**: Streaming content representation (used in `SessionUpdate`)
  - Designed for real-time rendering
  - Granular updates
  - MCP-compatible structure
- **MessagePart**: Completed message content (legacy, still supported)
  - Used in stored/completed messages
  - May include additional fields for history
  - Compatibility with existing code
**Migration Path:** The protocol supports both models. New code should prefer `SessionUpdate` with `ContentBlock` for streaming, while `MessagePart` remains available for compatibility and completed message storage.
## Best Practices
### For Consumers
1. **Track by message_id**: Group all chunks/updates for the same message
2. **Handle tool calls separately**: Maintain a HashMap of tool calls by `tool_call_id`
3. **Replace, don't merge**: `ToolCallUpdate` sends complete state, not deltas
4. **Use _meta for debugging**: Provider metadata helps with troubleshooting
5. **Distinguish thoughts from messages**: Render `AgentThoughtChunk` differently
### For Adapters
1. **Always include message_id**: Every update must reference its message
2. **Preserve provider info in _meta**: Store original IDs for debugging
3. **Send complete tool state**: Include all tool call fields in updates
4. **Use appropriate chunk types**: User/Agent/Thought for correct semantics
5. **Keep _meta minimal**: Avoid large raw payloads in production
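The adapter guidelines can be sketched as a single mapping function. Everything provider-facing here is hypothetical: `ProviderDelta` stands in for whatever the upstream API emits, `adapt` is an illustrative name, and the 64-character cap on `raw_excerpt` is an arbitrary assumption. The protocol types are trimmed, with `String` replacing the JSON value type.

```rust
use std::collections::HashMap;

// Trimmed mirrors of the protocol types (standard library only).
#[derive(Debug, Clone)]
pub struct ProviderMeta {
    pub name: String,
    pub original_ids: Option<HashMap<String, String>>,
    pub raw_excerpt: Option<String>,
}

#[derive(Debug, Clone)]
pub struct Meta {
    pub provider: Option<ProviderMeta>,
    pub extra: HashMap<String, String>,
}

#[derive(Debug, Clone)]
pub enum SessionUpdate {
    AgentMessageChunk { message_id: String, text: String, _meta: Option<Meta> },
}

/// Hypothetical upstream event; not part of the Dirigent Protocol.
pub struct ProviderDelta {
    pub provider_message_id: String,
    pub delta: String,
    pub raw: String,
}

/// Assumed cap on the debugging excerpt, in characters.
const MAX_EXCERPT_CHARS: usize = 64;

pub fn adapt(message_id: &str, event: &ProviderDelta) -> SessionUpdate {
    // Truncate by characters so multi-byte text is never split mid-codepoint.
    let excerpt: String = event.raw.chars().take(MAX_EXCERPT_CHARS).collect();
    let original_ids = HashMap::from([(
        "message_id".to_string(),
        event.provider_message_id.clone(),
    )]);
    SessionUpdate::AgentMessageChunk {
        message_id: message_id.to_string(),
        text: event.delta.clone(),
        _meta: Some(Meta {
            provider: Some(ProviderMeta {
                name: "opencode".to_string(),
                original_ids: Some(original_ids),
                raw_excerpt: Some(excerpt),
            }),
            extra: HashMap::new(),
        }),
    }
}
```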
### For UI Developers
1. **Stream incrementally**: Append chunks as they arrive
2. **Show tool status visually**: Use icons/colors for Pending/Running/Completed/Error
3. **Make thoughts collapsible**: Don't clutter the main conversation
4. **Handle reconnection**: Be prepared to receive full state snapshots
5. **Display errors prominently**: Show tool errors clearly to users
## Future Extensions
The following features are planned but not yet implemented:
- **ResourceBlock**: Embedded resource content (text/blob)
- **Image/Audio blocks**: Rich media content
- **Plan updates**: Agent planning and mode switching
- **Permissions**: Request/reply for user permissions
- **Stop reasons**: Detailed completion reasons
See the protocol roadmap for timeline and details.
## See Also
- [Migration from 0.1.x](migration_from_0.1.md) - Upgrading from older versions
- [CHANGELOG.md](../CHANGELOG.md) - Version history and breaking changes
- [examples/](../examples/) - Code examples demonstrating usage