Chat

POST /v1/chat is the primary text endpoint for multi-turn conversations: send messages and receive assistant text through the assigned route.

Request

curl https://ai.latentkit.com/v1/chat \
  -H "Authorization: Bearer $LATENTKIT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a concise assistant." },
      { "role": "user", "content": "Explain LatentKit in one sentence." }
    ],
    "response_profile": "balanced",
    "max_tokens": 200,
    "temperature": 0.2
  }'
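For callers that are not using curl, the same request can be sketched in TypeScript. This is a minimal illustration, not the official client: the helper names `buildChatInit` and `sendChat` are hypothetical, the fetch function is passed in so the sketch works with any fetch-compatible implementation, and the response is returned untyped because its shape is not described in this section.

```typescript
// Minimal fetch-style types so the sketch does not depend on DOM or Node typings.
type ChatInit = { method: string; headers: Record<string, string>; body: string };
type ChatResponse = { ok: boolean; status: number; json(): Promise<unknown> };
type FetchLike = (url: string, init: ChatInit) => Promise<ChatResponse>;

const CHAT_URL = "https://ai.latentkit.com/v1/chat";

// Build the same method, headers, and JSON body as the curl example.
function buildChatInit(apiKey: string, payload: object): ChatInit {
  return {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}

// Send the request. Non-2xx responses are surfaced as a thrown Error
// (a choice made for this sketch, not documented API behavior).
async function sendChat(fetchImpl: FetchLike, apiKey: string, payload: object): Promise<unknown> {
  const res = await fetchImpl(CHAT_URL, buildChatInit(apiKey, payload));
  if (!res.ok) {
    throw new Error(`POST /v1/chat failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

In a runtime with a global `fetch`, a call would look like `await sendChat(fetch, apiKey, { messages: [{ role: "user", content: "Hello" }] })`.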

Common fields

Field	Description
`messages`	OpenAI-style role/content array (required)
`max_tokens`	Output token limit
`temperature`	Sampling temperature when supported
`response_profile`	`fast`, `balanced`, or `deep`, when the route allows overrides
`stream`	Set to `true` for SSE streaming; see Streaming
`tools` / `tool_choice`	Tool calling, when the route's models support it
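The fields in the table can be written down as a TypeScript type with a small pre-flight check. This is a sketch under assumptions: the value types for `max_tokens`, `temperature`, `tools`, and `tool_choice` are guesses from the OpenAI-style shape, and `checkChatRequest` is a hypothetical helper, not part of the SDK.

```typescript
type ResponseProfile = "fast" | "balanced" | "deep";

type ChatMessage = { role: string; content: string };

type ChatRequest = {
  messages: ChatMessage[];            // required
  max_tokens?: number;                // assumed to be a number
  temperature?: number;               // assumed to be a number
  response_profile?: ResponseProfile;
  stream?: boolean;
  tools?: unknown[];                  // shape not specified in this section
  tool_choice?: unknown;              // shape not specified in this section
};

const PROFILES: readonly string[] = ["fast", "balanced", "deep"];

// Return a list of problems; an empty list means the two checks passed.
function checkChatRequest(req: Record<string, unknown>): string[] {
  const problems: string[] = [];
  if (!Array.isArray(req.messages)) {
    problems.push("messages is required and must be an array");
  }
  const profile = req.response_profile;
  if (profile !== undefined && PROFILES.indexOf(String(profile)) === -1) {
    problems.push("response_profile must be fast, balanced, or deep");
  }
  return problems;
}

const example: ChatRequest = {
  messages: [{ role: "user", content: "Hello" }],
  response_profile: "balanced",
  max_tokens: 200,
};
const exampleProblems = checkChatRequest(example); // empty for this payload
```

The check only covers what the table states outright (`messages` is required, `response_profile` has three allowed values); whether a given route accepts the optional fields is decided server-side.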

Route rules

Do not send `model` or `provider` in the request body. Tool-heavy, JSON-mode, vision, audio, and video workloads may require credits or BYOK, even when plain chat is within a Free managed allowance.
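When migrating payloads written for an OpenAI-style API, one way to follow the first rule is to drop the two fields before sending. The helper below is a hypothetical sketch; `stripRoutingFields` is not an SDK function, and how the server reacts when these fields are present is not stated in this section.

```typescript
// Field names that the route rules say must not be sent.
const ROUTING_FIELDS: readonly string[] = ["model", "provider"];

// Return a shallow copy of the payload without `model` or `provider`.
// The input object is left unchanged.
function stripRoutingFields(payload: Record<string, unknown>): Record<string, unknown> {
  const cleaned: Record<string, unknown> = {};
  for (const key of Object.keys(payload)) {
    if (ROUTING_FIELDS.indexOf(key) === -1) {
      cleaned[key] = payload[key];
    }
  }
  return cleaned;
}
```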

SDK

// Assumes `client` is an already-initialized LatentKit SDK client.
const response = await client.chat.create({
  messages: [{ role: 'user', content: 'Hello' }],
  response_profile: 'balanced',
});

Errors

Routing exhaustion returns HTTP 503 with `all_providers_exhausted`. Budget blocks return typed JSON errors; see Error handling.
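A caller can tell routing exhaustion apart from other failures with a small check. This is a sketch under assumptions: the section does not say where `all_providers_exhausted` appears in the response body, so the hypothetical helper `isRoutingExhausted` searches the serialized body instead of relying on a particular field name.

```typescript
type ChatFailure = { status: number; body: unknown };

// True only for HTTP 503 responses whose body mentions all_providers_exhausted.
function isRoutingExhausted(failure: ChatFailure): boolean {
  if (failure.status !== 503) {
    return false;
  }
  // Body layout is an assumption-free search: accept either raw text or parsed JSON.
  const text =
    typeof failure.body === "string" ? failure.body : JSON.stringify(failure.body);
  return (text || "").indexOf("all_providers_exhausted") !== -1;
}
```

Budget blocks and other typed errors fall through as `false` here and would be handled according to the Error handling page.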
