
Chat completions

OpenAI-compatible /v1/chat/completions endpoint.

Provara's chat completions endpoint is a drop-in replacement for any SDK that speaks the OpenAI format.

Basic

curl -X POST https://gateway.provara.xyz/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

Passing "model": "" lets the adaptive router pick. Pass a specific model (e.g. "claude-sonnet-4-6") to pin.
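The same request in code is just a standard chat-completions payload. A minimal sketch in Python (the token is a placeholder, and build_request is an illustrative helper, not part of any SDK):

```python
import json

GATEWAY_URL = "https://gateway.provara.xyz/v1/chat/completions"

def build_request(content: str, model: str = "") -> tuple[dict, str]:
    """Build headers and a JSON body; model="" asks the adaptive router to pick."""
    headers = {
        "Authorization": "Bearer <token>",  # replace with a real token
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,  # "" = adaptive routing; e.g. "claude-sonnet-4-6" = pinned
        "messages": [{"role": "user", "content": content}],
    })
    return headers, body

headers, body = build_request("Hello")
```

POST the body with any HTTP client, or point an OpenAI-compatible SDK's base URL at the gateway.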

Streaming

curl -X POST https://gateway.provara.xyz/v1/chat/completions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "",
    "stream": true,
    "messages": [{"role": "user", "content": "Tell me a story"}]
  }'

The stream is fully SSE-compatible. The final data: frame before [DONE] includes a _provara meta object with latency, cost, and routing info.
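One way to consume the stream and pick up the trailing _provara frame, sketched against illustrative frames (real deltas and meta fields will differ):

```python
import json

def parse_sse(raw: str):
    """Yield parsed JSON payloads from an SSE body, stopping at the [DONE] sentinel."""
    for line in raw.splitlines():
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)

# Illustrative stream: two content deltas, then the final frame carrying _provara.
raw = (
    'data: {"choices":[{"delta":{"content":"Once"}}]}\n'
    'data: {"choices":[{"delta":{"content":" upon"}}]}\n'
    'data: {"choices":[{"delta":{}}],"_provara":{"latencyMs":400,"provider":"anthropic"}}\n'
    "data: [DONE]\n"
)

meta = None
text = []
for frame in parse_sse(raw):
    for choice in frame.get("choices", []):
        text.append(choice.get("delta", {}).get("content", ""))
    if "_provara" in frame:
        meta = frame["_provara"]
```

Because _provara rides on a regular data: frame, strict SSE consumers need no special handling; clients that want the routing info just keep the last frame seen before [DONE].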

Hints

Provara extends the standard request body with optional hints the router uses when model="":

| Field | Values | Effect |
|---|---|---|
| routing_hint | coding, creative, summarization, qa, general | Overrides the task-type classifier |
| complexity_hint | simple, medium, complex | Overrides the complexity classifier |
| requires_structured_output | boolean | Narrows the candidate pool to models known to reliably follow JSON schemas. Auto-detected from response_format: { type: "json_schema" or "json_object" } or a non-empty tools array; set it explicitly only to override the auto-detection. |

Ignored when model is pinned.
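A request that forces the coding path on a complex prompt might look like this sketch (field names from the table above; the prompt is illustrative):

```python
import json

body = json.dumps({
    "model": "",  # hints only apply when the router is choosing
    "routing_hint": "coding",      # bypass the task-type classifier
    "complexity_hint": "complex",  # bypass the complexity classifier
    "messages": [{"role": "user", "content": "Refactor this module"}],
})
```

With a pinned model the same fields are accepted but have no effect.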

Structured-output routing

When a request carries a JSON schema (a response_format.type of "json_schema" or "json_object") or a non-empty tools array, the router narrows its candidate pool to models listed in STRUCTURED_OUTPUT_RELIABLE. If no registered provider offers a capable model, the request returns HTTP 502 no_capable_provider rather than silently routing to a model that will emit a plausible but wrong-shaped response.

The current capable list: gpt-4o, gpt-4.1, gpt-4.1-mini, o3, o4-mini, claude-opus-4-6, claude-sonnet-4-6, gemini-2.5-pro. Unknown models default to "not capable" — the safe choice.
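The narrowing logic can be pictured as follows. The model list and detection rules mirror the description above, but the function names and the LookupError are illustrative, not Provara's internals:

```python
STRUCTURED_OUTPUT_RELIABLE = {
    "gpt-4o", "gpt-4.1", "gpt-4.1-mini", "o3", "o4-mini",
    "claude-opus-4-6", "claude-sonnet-4-6", "gemini-2.5-pro",
}

def needs_structured_output(req: dict) -> bool:
    """Auto-detect: a JSON response_format or a non-empty tools array."""
    fmt = (req.get("response_format") or {}).get("type")
    return fmt in ("json_schema", "json_object") or bool(req.get("tools"))

def narrow_candidates(candidates: list[str], req: dict) -> list[str]:
    """Keep only schema-reliable models; unknown models are excluded (the safe default)."""
    if not needs_structured_output(req):
        return candidates
    capable = [m for m in candidates if m in STRUCTURED_OUTPUT_RELIABLE]
    if not capable:
        raise LookupError("no_capable_provider")  # surfaced as HTTP 502
    return capable
```

Defaulting unknown models to "not capable" trades availability for shape-correctness: a 502 is loud and retryable, while malformed JSON can slip through downstream parsing.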

Response envelope

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1776534905,
  "model": "claude-haiku-4-5-20251001",
  "choices": [...],
  "usage": { "prompt_tokens": 10, "completion_tokens": 4, "total_tokens": 14 },
  "_provara": {
    "provider": "anthropic",
    "latencyMs": 400,
    "cached": false,
    "routing": {
      "taskType": "general",
      "complexity": "medium",
      "routedBy": "user-override",
      "usedFallback": false,
      "usedLlmFallback": false
    }
  }
}

_provara is unique to Provara; SDKs that tolerate unknown fields will skip over it, and strict parsers can strip it before decoding.

Response headers

| Header | Purpose |
|---|---|
| X-Provara-Request-Id | The request's id (use for replay / debugging) |
| X-Provara-Model | Model actually served |
| X-Provara-Provider | Provider actually served |
| X-Provara-Cost | Cost in USD |
| X-Provara-Latency | Latency in ms |
| X-Provara-Cache | exact, semantic, or absent |
| X-Provara-Guardrail | Fired rule name, if any |
| X-Provara-Errors | JSON of provider errors hit during fallback (debug only) |
| X-RateLimit-Limit / X-RateLimit-Remaining | Token-scoped rate-limit state |
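Reading the observability headers is plain string parsing; a sketch against a dict standing in for your HTTP client's response headers (values are illustrative):

```python
headers = {  # illustrative values; a real client exposes these on the response object
    "X-Provara-Request-Id": "req_123",
    "X-Provara-Model": "claude-haiku-4-5-20251001",
    "X-Provara-Cost": "0.000042",
    "X-Provara-Latency": "400",
}

cost_usd = float(headers["X-Provara-Cost"])
latency_ms = int(headers["X-Provara-Latency"])
cache_hit = headers.get("X-Provara-Cache")  # None when absent, i.e. a cache miss
```

Aggregating X-Provara-Cost across requests is a cheap way to cross-check gateway billing from the client side.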

Error contract

| HTTP | error.type | When |
|---|---|---|
| 401 | auth_error | Missing / invalid bearer or session |
| 402 | insufficient_tier | Feature gated by subscription |
| 402 | budget_exceeded | Tenant's budget hard-stop fired |
| 402 | spend_limit_error | Per-token spend limit exceeded |
| 429 | rate_limit_error | IP or token rate limit exhausted |
| 429 | quota_exhausted | Monthly request quota hit (Free tier) |
| 400 | guardrail_error | Input blocked by a guardrail rule |
| 502 | provider_error | All fallback providers failed |
| 502 | no_capable_provider | No registered provider can serve a structured-output request |
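A client-side retry policy built on this contract might look like the following sketch; the retryable/terminal split is an assumption about sensible client behavior, not part of the API:

```python
# Transient failures worth retrying with backoff, per the contract above.
RETRYABLE = {"rate_limit_error", "provider_error"}
# Billing, auth, quota, and guardrail errors won't clear on retry.
TERMINAL = {"auth_error", "insufficient_tier", "budget_exceeded",
            "spend_limit_error", "quota_exhausted", "guardrail_error",
            "no_capable_provider"}

def should_retry(status: int, error_type: str) -> bool:
    """Retry only transient 429/502 failures; treat everything else as terminal."""
    return status in (429, 502) and error_type in RETRYABLE
```

Note that 429 quota_exhausted is deliberately terminal here: a monthly quota will not reset on a backoff timescale, unlike a rate-limit window.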