Observability

Trace every model call, tool call, memory write, RAG retrieval, approval, cost line item, and artifact — and surface the operator views that vendor SDKs do not.

Observability is what powers governance, release management, cost control, and incident response. Generic tracing is not enough — Joch ships agent operations views in the language of agent fleets.

Trace events

Every Joch execution emits a structured event stream that extends OpenTelemetry and OCSF, in alignment with the OWASP AOS Trace specification:

ResourceApplied             a record was applied
AgentCompiled               an agent record was compiled into a manifest
ExecutionCreated            an execution was created
ExecutionScheduled          an execution was assigned to a worker
ExecutionStarted            a worker began execution
ModelCallStarted            a model call was sent to the provider
ModelCallCompleted          a model call returned
ToolCallRequested           a tool was about to be called (AOS hook: toolCallRequest)
ToolCallApproved            a requested tool call was approved
ToolCallCompleted           a tool call returned (AOS hook: toolCallResult)
MemoryRead                  a memory store was read (AOS hook: memoryContextRetrieval)
MemoryWritten               a memory store was written (AOS hook: memoryStore)
KnowledgeRetrieved          a RAG retrieval returned (AOS hook: knowledgeRetrieval)
ArtifactCreated             a durable artifact was emitted
HookDecision                a Guardian Agent returned allow / deny / modify
PolicyDenied                a policy denied an action
ApprovalRequested           an approval was created
ApprovalGranted             an approval was granted
ApprovalDenied              an approval was denied
A2AMessageSent              an outbound A2A message was sent
A2AMessageReceived          an inbound A2A message was received
ExecutionSucceeded          an execution completed successfully
ExecutionFailed             an execution failed
BudgetExceeded              a cost / usage budget was exceeded
ProviderSwitched            a conversation switched providers
AgBOMUpdated                the per-agent AgBOM was refreshed

Every event includes traceId, spanId, executionId, agentRef, agentVersion, framework, model, tenantId, and a payload appropriate to the event type.
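As a rough illustration of that common envelope, the sketch below models one event as a Python dataclass. The field names come from the sentence above; the class name TraceEvent, the "type" field, the sample values, and the payload keys are assumptions made for the example, not Joch's actual schema.

```python
from dataclasses import dataclass, field, asdict
from typing import Any


@dataclass(frozen=True)
class TraceEvent:
    """Hypothetical event envelope; not Joch's published schema."""
    type: str            # event name, e.g. "ModelCallCompleted" (assumed field)
    traceId: str
    spanId: str
    executionId: str
    agentRef: str
    agentVersion: str
    framework: str
    model: str
    tenantId: str
    payload: dict[str, Any] = field(default_factory=dict)  # event-specific data


event = TraceEvent(
    type="ModelCallCompleted",
    traceId="trace-1",
    spanId="span-1",
    executionId="exec-123",
    agentRef="support-bot",
    agentVersion="1.4.0",
    framework="example-framework",
    model="example-model",
    tenantId="tenant-a",
    payload={"inputTokens": 812, "outputTokens": 96},  # illustrative keys only
)

# Flatten to a plain dict, e.g. before serializing to JSON.
record = asdict(event)
```

Keeping the shared fields at the top level and the event-specific data under payload means a consumer can filter by executionId or tenantId without knowing every event type.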

OpenTelemetry and OCSF

Joch trace events extend industry-standard schemas, not proprietary ones:

  • OpenTelemetry — Spans use OTel semantic conventions where they exist (gen_ai.system, gen_ai.request.model, gen_ai.response.model, etc.) and add Joch-specific attributes (joch.agent.name, joch.framework, joch.policy.id, joch.tenant.id). See OpenTelemetry Mapping.
  • OCSF — Security-relevant events (policy denial, approval, A2A messages, AgBOM updates) emit OCSF-compatible records. See OCSF Mapping.
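The attribute mapping described above might look roughly like the following. This is a sketch only: it builds a plain dictionary rather than calling an OpenTelemetry SDK, the attribute keys are the ones named in the list, and the input keys ("provider", "responseModel", "policyId") are hypothetical names for where those values would come from.

```python
def span_attributes(event: dict) -> dict:
    """Map an event dict (hypothetical shape) to span attributes."""
    attrs = {
        # OpenTelemetry GenAI semantic-convention keys
        "gen_ai.system": event["provider"],
        "gen_ai.request.model": event["model"],
        # Joch-specific keys named in the text
        "joch.agent.name": event["agentRef"],
        "joch.framework": event["framework"],
        "joch.tenant.id": event["tenantId"],
    }
    # Optional attributes are set only when the event carries them.
    if "responseModel" in event:
        attrs["gen_ai.response.model"] = event["responseModel"]
    if "policyId" in event:
        attrs["joch.policy.id"] = event["policyId"]
    return attrs


attrs = span_attributes({
    "provider": "example-provider",
    "model": "example-model",
    "agentRef": "support-bot",
    "framework": "example-framework",
    "tenantId": "tenant-a",
    "policyId": "pol-7",
})
```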

You can ship Joch traces into Grafana, Honeycomb, Datadog, Splunk, Elastic, your SIEM, or any backend that consumes OTLP or OCSF.

Operator views

Generic tracing answers "what happened in this span." Joch additionally answers operator-language questions:

joch top agents                    # most active agents by execution count / cost
joch top tools                     # most-called tools, with success rate
joch top models                    # most-used models, by cost and latency
joch cost by-team                  # cost rollups per team / namespace
joch cost by-agent --since 7d
joch trace exec-123                # full execution trace
joch incidents ls                  # current incidents flagged by alerts
joch drift detect --agent X        # output drift after a deployment
joch denials ls --policy P         # recent policy denials
joch approvals ls                  # pending approvals

The web console exposes the same views with charts, filters, and drill-downs.
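To make the idea of an operator view concrete, here is a minimal sketch of the kind of aggregation a view like "joch top tools" could perform over the event stream. It is not Joch's implementation; the event shape and the payload keys "tool" and "ok" are assumptions.

```python
from collections import Counter


def top_tools(events, limit=10):
    """Return (tool, call_count, success_rate) tuples, most-called first."""
    calls = Counter()
    successes = Counter()
    for e in events:
        if e["type"] != "ToolCallCompleted":
            continue
        tool = e["payload"]["tool"]          # assumed payload key
        calls[tool] += 1
        if e["payload"].get("ok", False):    # assumed success flag
            successes[tool] += 1
    return [(tool, n, successes[tool] / n) for tool, n in calls.most_common(limit)]


sample = [
    {"type": "ToolCallCompleted", "payload": {"tool": "search", "ok": True}},
    {"type": "ToolCallCompleted", "payload": {"tool": "search", "ok": False}},
    {"type": "ToolCallCompleted", "payload": {"tool": "search", "ok": True}},
    {"type": "ToolCallCompleted", "payload": {"tool": "email.send", "ok": True}},
    {"type": "ModelCallCompleted", "payload": {}},  # ignored by this view
]
ranking = top_tools(sample)
```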

Cost accounting

Cost is a first-class observability primitive. Every model call, tool call, and approval has an attributed cost. Joch rolls up cost per:

Agent
Execution
Conversation
Tool
Model
Team
Environment
Time window

Costs feed Budget enforcement. Depending on policy, a budget breach can alert, soft-cap, or hard-deny.
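The rollup and breach handling can be sketched as below. This assumes each cost line item is a dict with a "usd" amount plus dimension keys such as "agent" or "team"; those names, and the "ok" result for an unbreached budget, are hypothetical. What a soft cap actually does once triggered is left to the policy and is not modeled here.

```python
from collections import defaultdict


def rollup(cost_items, key):
    """Sum cost per value of one dimension (e.g. 'agent', 'team', 'model')."""
    totals = defaultdict(float)
    for item in cost_items:
        totals[item[key]] += item["usd"]
    return dict(totals)


def budget_action(spent, limit, mode):
    """Return the configured breach mode when spend exceeds the limit."""
    if mode not in ("alert", "soft-cap", "hard-deny"):
        raise ValueError(f"unknown budget mode: {mode}")
    return mode if spent > limit else "ok"


items = [
    {"agent": "support-bot", "team": "cx", "usd": 0.5},
    {"agent": "support-bot", "team": "cx", "usd": 0.25},
    {"agent": "triage-bot", "team": "ops", "usd": 1.0},
]
by_agent = rollup(items, "agent")
decision = budget_action(by_agent["support-bot"], limit=0.5, mode="hard-deny")
```

The same rollup function serves every dimension in the list above, since only the grouping key changes.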

Quality observability

Beyond cost and latency, Joch tracks agent quality:

  • Tool failure rate per tool, per agent, per environment.
  • Model fallback rate (how often ModelRoute fell off the primary).
  • RAG retrieval quality scores (top-k, judge-scored relevance, citation rate).
  • Memory write volume and growth.
  • Guardrail hit rate (AOS hook decisions: allow vs. modify vs. deny).
  • Approval bottleneck (queue depth, time-to-decision).
  • Prompt / version change correlation with regression.

These metrics feed the Release Management pillar.
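As one example of how such a metric could be derived from the event stream, the sketch below computes the guardrail hit rate as the share of allow, modify, and deny outcomes among HookDecision events. The payload key "decision" is an assumption about the event shape.

```python
from collections import Counter


def guardrail_hit_rate(events):
    """Share of allow / modify / deny among HookDecision events."""
    decisions = Counter(
        e["payload"]["decision"]             # assumed payload key
        for e in events
        if e["type"] == "HookDecision"
    )
    total = sum(decisions.values())
    if total == 0:
        return {}                            # no hook decisions observed
    return {d: decisions[d] / total for d in ("allow", "modify", "deny")}


hook_events = [
    {"type": "HookDecision", "payload": {"decision": "allow"}},
    {"type": "HookDecision", "payload": {"decision": "allow"}},
    {"type": "HookDecision", "payload": {"decision": "modify"}},
    {"type": "HookDecision", "payload": {"decision": "deny"}},
    {"type": "ToolCallCompleted", "payload": {"tool": "search"}},
]
rates = guardrail_hit_rate(hook_events)
```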

Audit trail

Every decision is replayable. Given an execution ID, an operator can reconstruct:

inputs
retrieved knowledge (with citations)
memory reads
model calls (prompts, responses, tokens)
tool calls (args, side-effect class, approval, result)
memory writes
hook decisions (allow / deny / modify, by rule and policy version)
artifacts produced
costs charged
provider switches
A2A messages
final outputs

This is the foundation of compliance: every action records who, when, what, and why, along with the policy version active at the time. See Trace.
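Reconstructing an execution amounts to selecting its events and ordering them in time. A minimal sketch, assuming each event carries an executionId and a sortable timestamp under the hypothetical key "ts":

```python
def replay(events, execution_id):
    """Return one execution's events in chronological order."""
    own = [e for e in events if e["executionId"] == execution_id]
    return sorted(own, key=lambda e: e["ts"])   # "ts" is an assumed field


log = [
    {"executionId": "exec-123", "ts": 3, "type": "ToolCallCompleted"},
    {"executionId": "exec-999", "ts": 1, "type": "ExecutionStarted"},
    {"executionId": "exec-123", "ts": 1, "type": "ExecutionStarted"},
    {"executionId": "exec-123", "ts": 4, "type": "ExecutionSucceeded"},
    {"executionId": "exec-123", "ts": 2, "type": "ModelCallCompleted"},
]
timeline = [e["type"] for e in replay(log, "exec-123")]
```

A real audit view would also join in the policy version and cost for each step; this sketch only shows the ordering.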

Acceptance criteria

A team operating Joch's Observability pillar can:

  • replay any execution from inputs to outputs, with policy version, cost, latency, and hook decisions,
  • spot a tool failure rate spike within minutes and trace it to a specific MCP server version,
  • prove that no email.send call left the gateway in the last 90 days without a recorded approval,
  • export the full trace stream to OpenTelemetry and the security-relevant subset to OCSF.
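The approval criterion above can be checked mechanically against the event stream. The sketch below looks for completed calls to a given tool that have no earlier ToolCallApproved event; the correlation key "callId", the payload key "tool", and the timestamp "ts" are assumptions about the event shape, and restricting the input to a 90-day window is left to the caller.

```python
def unapproved_calls(events, tool="email.send"):
    """Return callIds of completed calls to `tool` with no prior approval."""
    approved = set()
    violations = []
    for e in sorted(events, key=lambda e: e["ts"]):   # "ts" is an assumed field
        payload = e["payload"]
        if payload.get("tool") != tool:
            continue
        if e["type"] == "ToolCallApproved":
            approved.add(payload["callId"])           # assumed correlation key
        elif e["type"] == "ToolCallCompleted" and payload["callId"] not in approved:
            violations.append(payload["callId"])
    return violations


audit_log = [
    {"ts": 1, "type": "ToolCallApproved", "payload": {"tool": "email.send", "callId": "c1"}},
    {"ts": 2, "type": "ToolCallCompleted", "payload": {"tool": "email.send", "callId": "c1"}},
    {"ts": 3, "type": "ToolCallCompleted", "payload": {"tool": "email.send", "callId": "c2"}},
    {"ts": 4, "type": "ToolCallCompleted", "payload": {"tool": "search", "callId": "c3"}},
]
violations = unapproved_calls(audit_log)
```

An empty result over the full window is the evidence the criterion asks for; any returned callId points to the execution to investigate.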