Observability

Every AI request,
completely visible.

Full-stack observability for your AI layer. Trace requests end-to-end, understand cost per call, catch latency regressions before users do.

Live Request Trace 4,218 req / min
Request
POST
/v1/chat/completions
Explane
Routing
8ms · 4 policies
Provider
gpt-4o
OpenAI · tier-2
Latency
187ms
p99: 312ms
Cost
$0.0024
1,248 tokens
Response
200 OK
success · cached
4,218 req / min
31ms avg latency
99.99% uptime
0.04% error rate
What we capture

Complete signal,
zero blind spots.

Every dimension of your AI traffic — latency, cost, token counts, provider health, routing decisions, and a full audit trail — captured automatically, no instrumentation required.

Latency · p50 / p95 / p99
31ms p50
p95: 89ms · p99: 312ms
↓ 12ms from last week
Token Usage
2.4B
tokens · this month
↑ 18% growth
Total Cost
$1,840
this month · all providers
↑ 6% vs last month
Provider Health
3 / 3
providers operational
Last incident: 14 days ago
Model Mix
gpt-4o
top model · 62% of traffic
+claude-3.5 · +gemini-2.0
Error Rate
0.04%
429s + 5xx this hour
↓ 0.02% vs baseline
Routing Decisions
98.2%
policy match rate
4 active rules
12:34:01 sarah@acme.co Updated budget policy · Engineering team limit $1,000 → $1,500 governance
11:52:44 api:ex_k9a2 Failover triggered · azure-gpt-4 503 → gemini-2.0-flash routing
09:14:22 james@acme.co Added provider · Mistral Large 2 · embedding + chat endpoints integrations
Request Timeline

Distributed tracing,
built for AI.

Every request generates a complete waterfall trace — provider call time, token counting, policy evaluation, streaming overhead. Find exactly where your latency is going.

  • Span-level breakdown: routing, inference, streaming, post-processing
  • Compare traces across providers to surface performance regressions
  • Export to Datadog, Grafana, or any OpenTelemetry-compatible backend
trace_9f2b · req_8h74k 264ms total
0ms64ms128ms192ms264ms
sdk.request
264ms
policy.eval
14ms
route.select
8ms
gpt-4o.call
156ms
stream.out
76ms
token.count
8ms
Total wall time 264ms
Cost Intelligence

Know exactly what
you're spending, and why.

Real-time cost attribution per request, model, team, and feature. Set budgets, get alerts at 80%, and let Explane automatically route to cheaper models when you're approaching limits.

  • Per-request cost breakdown: prompt tokens, completion tokens, markup
  • Tag requests by team, product feature, or user segment for chargebacks
  • Projected monthly spend with confidence intervals
Spend by provider June 2025 · $1,840 total
$124
OpenAI
$74
Anthropic
$49
Gemini
$29
Mistral
$19
Bedrock
OpenAI · gpt-4o + gpt-4o-mini 67% $1,232
Anthropic · claude-3.5-sonnet 22% $405
Others · Gemini, Mistral, Bedrock 11% $203
Live Operations

A real-time view of
every request in flight.

The operations wall shows your AI traffic as it happens — model calls, token counts, latencies, error codes. Filter by provider, team, or status in under a second.

  • Sub-second latency on the live feed — no polling, pure WebSocket
  • One-click drill-down to the full trace for any log row
  • Search and filter by model, status code, team, or custom metadata
explane · live feed
Intelligent Alerting

Catch regressions
before your users do.

Explane watches latency, error rates, cost anomalies, and provider health continuously. Smart alerts fire only when they matter — no alert fatigue, just signal.

  • Threshold, anomaly-detection, and budget-based alert types
  • Route alerts to Slack, PagerDuty, email, or webhook
  • Automatic suppression for known maintenance windows
Alert Feed 2 ACTIVE
Latency spike · gpt-4o
avg 2.4s · threshold 1.0s · 18 requests affected
2 minutes ago
CRIT
Budget 84% consumed · Engineering
$840 of $1,000 · at current rate, limit in ~6 days
14 minutes ago
WARN
Rate limit approaching · OpenAI
tier-2 TPM at 78% · auto-scaling to tier-3
1 hour ago
WARN
Failover resolved · Azure GPT-4
azure-gpt-4 503 → gemini-2.0-flash · recovered in 4s
2 hours ago
OK
Cost anomaly resolved
embedding cost spike traced to batch job · closed
6 hours ago
OK

Start seeing everything.

Full observability in under 10 minutes. No agents, no SDKs required — just point your requests through Explane.