Response

Response Modes

`stream: "delta"`

Content-Type: text/event-stream. The server streams SSE events as they are produced. Text is sent as incremental text_delta events; thinking is sent as incremental thinking_delta events.

`stream: "message"`

Content-Type: text/event-stream. The server streams SSE events, but text is sent as a single complete text/thinking event per message rather than incremental deltas. Tool call events still arrive as they happen.

`stream: "none"` (default)

Content-Type: application/json. The server returns a single JSON response after the agent finishes. If the agent needs a client-side tool result, it returns a tool_use stop reason in the JSON body and the client re-submits with results — same flow as SSE, just without streaming.

SSE Events (`stream: "delta"` and `stream: "message"`)

Each event is a JSON object on the data: field.

`session_start`

The first event in a PUT /session stream, always preceding turn_start. Contains the sessionId the client must store for subsequent turns.

event: session_start
data: {"sessionId": "sess_abc123"}

`turn_start`

Marks the beginning of the agent's response. For PUT /session, emitted immediately after session_start. For POST /session/:id, this is the first event in the stream.

event: turn_start
data: {}

`text_delta`

(delta mode only) An incremental delta of the agent's text response. Only emitted between turn_start and turn_stop.

event: text_delta
data: {"delta": "The weather in Tokyo is..."}

`thinking_delta`

(delta mode only) An incremental delta of the agent's thinking/reasoning. Only emitted between turn_start and turn_stop.

event: thinking_delta
data: {"delta": "The user is asking about Tokyo weather, I should..."}

`text`

(message mode only) The complete agent text response. Only emitted between turn_start and turn_stop.

event: text
data: {"text": "The weather in Tokyo is 18°C, partly cloudy."}

`thinking`

(message mode only) The complete agent thinking/reasoning. Only emitted between turn_start and turn_stop.

event: thinking
data: {"thinking": "The user is asking about Tokyo weather, I should use the weather tool..."}

`tool_call`

Only emitted between turn_start and turn_stop. The agent wants to invoke a tool. Multiple tool_call events may be emitted before turn_stop — the client should collect all of them and handle in parallel.

For application-side tools, the client executes the tool and submits the results in a subsequent POST /session/:id request.

For server-side tools where trust: true, the server invokes the tool inline and emits a tool_result event with the result — no client round-trip needed. The agent continues streaming without stopping.

For server-side tools where trust: false, the server stops and the client submits a permission decision in a subsequent POST /session/:id request. The agent continues regardless — if denied, the LLM is informed the tool was not permitted.

The agent only emits turn_stop with stopReason: "tool_use" if there is at least one application-side tool call or one untrusted server-side tool call that requires client action. If all tool calls are trusted server-side tools, the agent handles them inline and continues without stopping.

The client must collect all application-side tool results and untrusted server-side tool permissions and submit them together in a single subsequent POST /session/:id request.

event: tool_call
data: {"toolCallId": "call_001", "name": "get_weather", "input": {"location": "Tokyo"}}

Tool names must be unique across application tools and agent tools in a single request. The client identifies whether a tool call is application-side or server-side by matching the name against its request.

`tool_result`

(server-side trusted tools only) Only emitted between turn_start and turn_stop. Emitted after the server executes a trusted tool inline. The agent continues streaming after this event.

event: tool_result
data: {"toolCallId": "call_001", "content": "Tokyo: 18°C, partly cloudy"}

`turn_stop`

Always the final event in the stream. Marks the end of the agent's response.

event: turn_stop
data: {"stopReason": "end_turn"}

Stop reasons:

`stopReason`	Meaning
`end_turn`	Agent finished normally
`tool_use`	Agent emitted one or more `tool_call` events requiring client action (application-side tool or untrusted server-side tool)
`max_tokens`	Hit token limit — agents should implement their own history compaction strategy to avoid this; if no compaction is in place and the context window overflows, this stop reason is returned
`refusal`	LLM refused to respond (e.g. safety policy)
`error`	Server encountered an error mid-stream

JSON Response (`stream: "none"`)

Normal response:

json

{
  "stopReason": "end_turn",
  "messages": [
    {
      "role": "assistant",
      "content": "The weather in Tokyo is 18°C, partly cloudy."
    }
  ]
}

PUT /session additionally includes sessionId:

json

{
  "sessionId": "sess_abc123",
  "stopReason": "end_turn",
  "messages": [
    {
      "role": "assistant",
      "content": "The weather in Tokyo is 18°C, partly cloudy."
    }
  ]
}

With thinking:

json

{
  "stopReason": "end_turn",
  "messages": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "thinking",
          "thinking": "The user wants Tokyo weather. I should use the get_weather tool."
        },
        {
          "type": "text",
          "text": "The weather in Tokyo is 18°C, partly cloudy."
        }
      ]
    }
  ]
}

When an application-side tool is needed, or an untrusted server-side tool requires permission:

json

{
  "stopReason": "tool_use",
  "messages": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "tool_use",
          "toolCallId": "call_001",
          "name": "get_weather",
          "input": { "location": "Tokyo" }
        }
      ]
    }
  ]
}

When a trusted server-side tool was called inline, the full exchange is included in the returned messages:

json

{
  "stopReason": "end_turn",
  "messages": [
    {
      "role": "assistant",
      "content": [
        {
          "type": "tool_use",
          "toolCallId": "call_002",
          "name": "web_search",
          "input": { "query": "Tokyo weather today" }
        }
      ]
    },
    {
      "role": "tool",
      "toolCallId": "call_002",
      "content": "Tokyo: 18°C, partly cloudy"
    },
    {
      "role": "assistant",
      "content": "The weather in Tokyo is 18°C, partly cloudy."
    }
  ]
}

Message Format

Messages follow OpenAI-compatible roles.

System message

json

{
  "role": "system",
  "content": "You are a helpful assistant that responds concisely."
}

User message

json

{ "role": "user", "content": "What's the weather in Tokyo?" }

content may be a string or an array of content blocks.

Assistant message

json

{
  "role": "assistant",
  "content": [
    {
      "type": "thinking",
      "thinking": "The user wants the weather in Tokyo. I should use the get_weather tool."
    },
    { "type": "text", "text": "Let me check that for you." },
    {
      "type": "tool_use",
      "toolCallId": "call_001",
      "name": "get_weather",
      "input": { "location": "Tokyo" }
    }
  ]
}

Tool result message

Used in two cases:

Trusted server-side tool: the agent executes the tool inline, stores the result in history, and includes it in the returned messages.
Application-side tool: the client executes the tool and submits the result via POST /session/:id.

json

{
  "role": "tool",
  "toolCallId": "call_001",
  "content": "Tokyo: 18°C, partly cloudy"
}

content may be a string or an array of content blocks.

Tool permission message

Used to submit a permission decision for an untrusted server-side tool call via POST /session/:id. The agent continues and informs the LLM of the decision. These messages are never stored in session history.

When granted: true, the agent executes the tool and stores the tool result in history. When granted: false, the agent stores a message in history indicating the tool was denied by the user. The client may include an optional reason string that the agent will relay to the LLM.

json

{ "role": "tool_permission", "toolCallId": "call_002", "granted": true }

json

{
  "role": "tool_permission",
  "toolCallId": "call_002",
  "granted": false,
  "reason": "User declined"
}

Response ​

Response Modes ​

stream: "delta" ​

stream: "message" ​

stream: "none" (default) ​

SSE Events (stream: "delta" and stream: "message") ​

session_start ​

turn_start ​

text_delta ​

thinking_delta ​

text ​

thinking ​

tool_call ​

tool_result ​

turn_stop ​

JSON Response (stream: "none") ​

Message Format ​

System message ​

User message ​

Assistant message ​

Tool result message ​

Tool permission message ​

Sequence Diagram ​

Response

Response Modes

`stream: "delta"`

`stream: "message"`

`stream: "none"` (default)

SSE Events (`stream: "delta"` and `stream: "message"`)

`session_start`

`turn_start`

`text_delta`

`thinking_delta`

`text`

`thinking`

`tool_call`

`tool_result`

`turn_stop`

JSON Response (`stream: "none"`)

Message Format

System message

User message

Assistant message

Tool result message

Tool permission message

Sequence Diagram