OpenAI Provider¶
The OpenAI provider routes calls to the OpenAI API: chat.completions and the newer responses API. Used by every Joch agent that selects an openai:* model.
Authentication¶
apiVersion: ops.joch.dev/v1alpha1
kind: Secret
metadata: { name: openai-api-key }
spec:
source:
type: env
name: OPENAI_API_KEY
The router resolves the key at the call boundary; values never enter the resource store, the trace, or the AgBOM.
For Azure OpenAI deployments, use the Microsoft Foundry / Azure OpenAI provider instead.
Capability vector¶
capabilities:
text: true
vision: true
audio: true # for gpt-realtime / multimodal models
toolCalling: true
structuredOutput: true
jsonSchema: true
streaming: true
reasoning: true # for `o*` and `gpt-5-thinking`
computerUse: true # for computer-use-preview models
embeddings: true
Tool calling¶
OpenAI's tool calls map 1:1 to Joch's ToolCall. The Joch adapter:
- forwards
tools=[...]andtool_choice=...to the provider, - applies AOS
steps/toolCallRequestto the inputs before submission, - intercepts tool-use messages, executes them through the Tool Gateway, and emits
steps/toolCallResult.
Structured outputs¶
response_format={"type": "json_schema", "json_schema": {...}} is preserved end-to-end. Schema validation runs after the provider response and before the agent receives it.
Streaming¶
Chunked Server-Sent Events. The Joch adapter forwards chunks unchanged, with joch.modelroute.fallback_index carried as a span attribute on the parent span (not per chunk).
Region / residency¶
OpenAI does not publish regional endpoints with strong residency guarantees. To enforce EU residency, use Azure OpenAI through the Microsoft Foundry provider, or use a local model.
Fallback behavior¶
Joch falls forward off OpenAI when:
- the API returns 5xx, 429 (rate-limit), or 408,
- the model rejects the request as unsupported (capability mismatch),
- a
Budgetrule fires.
Failover behavior is governed by the ModelRoute failover block.
Cost reporting¶
The provider returns token counts. Joch computes costUsd from the configured Model.spec.pricing block:
Update the pricing block when OpenAI changes prices; the router uses the values active at the time of the call.
Known limits¶
- Realtime / Voice models require the
realtimeadapter (separate from the standard chat adapter). - The Responses API and Chat Completions API have slightly different semantics; the adapter normalizes both into the canonical state.