LatentKit

Streaming

Stream chat and completion responses over Server-Sent Events (SSE).

To enable streaming, set "stream": true in the request body of a supported endpoint, such as POST /v1/chat.

REST example

curl https://ai.latentkit.com/v1/chat \
  -H "Authorization: Bearer $LATENTKIT_API_KEY" \
  -H "Content-Type: application/json" \
  -N \
  -d '{
    "messages": [{ "role": "user", "content": "Count from one to five." }],
    "stream": true,
    "max_tokens": 100
  }'

The response is sent with Content-Type text/event-stream; each data: line contains a JSON chunk.
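For clients that read the raw stream rather than using an SDK, the framing can be sketched roughly as follows. This is a minimal parser based on the general SSE framing rules (fields separated by a colon, events terminated by a blank line), not LatentKit's own implementation. The function name iter_sse_events and the {"text": ...} chunk shape in the sample are hypothetical; the source does not document the chunk schema.

```python
import json
from typing import Iterable, Iterator, List, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[Optional[str], str]]:
    """Yield (event_name, data) pairs from text/event-stream lines.

    event_name is None when the frame carries no "event:" field.
    """
    event_name: Optional[str] = None
    data_parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield event_name, "\n".join(data_parts)
            event_name, data_parts = None, []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event_name = value
        elif field == "data":
            data_parts.append(value)
    # An event that is not terminated by a blank line is dropped,
    # which matches how SSE clients treat a truncated stream.


# Hypothetical sample stream: the chunk shape {"text": ...} is an assumption.
sample = [
    'data: {"text": "one"}\n',
    "\n",
    "event: error\n",
    'data: {"error":"budget_exceeded","message":"...","code":"budget_exceeded"}\n',
    "\n",
]
events = list(iter_sse_events(sample))
first_chunk = json.loads(events[0][1])
```

The parser takes any iterable of decoded lines, so it can be fed from whichever HTTP client you use to read the response body line by line.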

Terminal errors

If a terminal error occurs after the stream has started, the HTTP status line has already been sent, so LatentKit reports the error in-band by emitting an error event:

event: error
data: {"error":"budget_exceeded","message":"...","code":"budget_exceeded"}

Your client should handle both regular data: chunks and event: error events.
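One way to handle both kinds of event is sketched below. It assumes the stream has already been split into (event_name, data) pairs, where event_name is None for plain data: chunks. StreamError and collect_chunks are hypothetical names, and the {"text": ...} chunk shape is an assumption; only the error payload fields come from the example above. Any end-of-stream sentinel is left out because its wire format is not described here.

```python
import json
from typing import Iterable, List, Optional, Tuple


class StreamError(RuntimeError):
    """Raised when the stream delivers an `event: error` frame."""

    def __init__(self, payload: dict):
        super().__init__(payload.get("message", "stream error"))
        self.code = payload.get("code")
        self.payload = payload


def collect_chunks(events: Iterable[Tuple[Optional[str], str]]) -> List[dict]:
    """Return the decoded chunks, raising StreamError on an error event."""
    chunks: List[dict] = []
    for name, data in events:
        payload = json.loads(data)
        if name == "error":
            # Terminal error: stop consuming and surface it to the caller.
            raise StreamError(payload)
        chunks.append(payload)
    return chunks


# Hypothetical frames: one normal chunk followed by a terminal error.
frames = [
    (None, '{"text": "one"}'),
    ("error", '{"error":"budget_exceeded","message":"...","code":"budget_exceeded"}'),
]
try:
    collect_chunks(frames)
    caught = None
except StreamError as exc:
    caught = exc
```

Raising a dedicated exception type keeps the error code available to callers, so they can tell a budget_exceeded failure apart from other terminal errors.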

JavaScript SDK

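// Assumes `client` is an already-configured LatentKit client.
for await (const event of client.chat.stream({
  messages: [{ role: 'user', content: 'Count from one to five.' }],
})) {
  // A terminal error arrives in-band as an error event.
  if (event.event === 'error') throw new Error(JSON.stringify(event.data));
  // Stop once the stream signals completion.
  if (event.isDone) break;
  // Otherwise, handle the chunk here.
}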

Python SDK

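# Assumes `client` is an already-configured LatentKit client.
for event in client.chat.stream(
    messages=[{"role": "user", "content": "Count from one to five."}],
):
    # A terminal error arrives in-band as an error event.
    if event.event == "error":
        raise RuntimeError(event.data)
    # Stop once the stream signals completion.
    if event.is_done:
        break
    # Otherwise, handle the chunk here.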

Notes

  • Streaming requests bypass the response cache.
  • Guardrails run before the stream starts; a blocked request returns HTTP 400 with guardrail_blocked.
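Because guardrail failures arrive as an ordinary HTTP error rather than as a stream event, a raw-HTTP client may want to inspect the response before it starts reading events. The sketch below shows one possible check. It assumes that a successful stream uses status 200 and that the string guardrail_blocked appears somewhere in the 400 response body; neither detail is confirmed by this page, and the names GuardrailBlocked and check_before_streaming are hypothetical.

```python
class GuardrailBlocked(Exception):
    """Raised when the request was rejected before any streaming began."""


def check_before_streaming(status: int, content_type: str, body: str = "") -> None:
    """Validate the response head before reading it as an event stream."""
    # Assumption: the 400 body mentions "guardrail_blocked" somewhere.
    if status == 400 and "guardrail_blocked" in body:
        raise GuardrailBlocked(body)
    # Assumption: a successful stream is served with status 200.
    if status != 200:
        raise RuntimeError(f"unexpected HTTP status {status}")
    if not content_type.lower().startswith("text/event-stream"):
        raise RuntimeError(f"unexpected Content-Type {content_type!r}")


# Hypothetical 400 response; the body shape is an assumption.
try:
    check_before_streaming(400, "application/json", '{"error":"guardrail_blocked"}')
    blocked = False
except GuardrailBlocked:
    blocked = True
```

Only after this check passes does it make sense to hand the body to an SSE parser; in-band event: error handling still applies once the stream is open.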
