Live Routing

req_8h72k→GPT-4o12ms

req_8h73k→claude-3.518ms

req_8h74k→gemini-2.08ms

Uptime

99.4%

Live Metrics

4,218req / min

31msavg latency

0.04%error rate

Budget Controls

Engineering$840 / $1k

Product$320 / $500

3 active policies · 0 violations

50+ AI Providers

OpenAIAnthropicGeminiMistralDeepSeekBedrockAzure AICohereLlama+ more

api.explane.ai/v1

Simple Pricing

Starter

$0/month

✓ 1M requests / month

✓ 10 providers

✓ Community support

Quick Start

// npm i @explane/sdk
import { Explane } from '@explane/sdk'

const ai = new Explane({ apiKey: "ex_..." })

await ai.route({ model: "auto", messages })

Routing

Every request to
the right model.

Explane evaluates cost, latency, and quality in real time — dispatching each AI call to the optimal provider. No config files. No code changes. Just results.

Book Demo

Live Routing Decisions < 2ms decision time

Request

Type

Tokens

→ Model

Reason

Cost

4,218 req / min

< 2ms decision

99.99% uptime

0 drops

Policy Engine

Routing rules that
think in conditions.

Write routing policies once. Explane evaluates them against every request in under 2ms — matching task type, cost budget, quality requirement, or any custom metadata you pass.

Condition

Volume

Routes to

Avg cost

task_type = "summarize"

1,248/hr

→claude-haiku

$0.00006

task_type = "code"

892/hr

→gpt-4o-mini

$0.00024

quality ≥ 0.90

2,104/hr

→gpt-4o

$0.0042

cost < $0.001

486/hr

→gemini-2.0-flash

$0.00019

DEFAULT

1,766/hr

→gpt-4o

$0.0038

Cost Optimization

Route by cost, latency,
and quality requirements.

A simple summarization task routes to a fast, affordable model. Complex reasoning goes to your flagship. Explane evaluates each request in real time — no rules to write, no dashboards to watch.

Optimize for cost, latency, or quality per request type
Set per-request cost ceilings and quality floors
Weighted multi-objective routing for mixed workloads

routing.policy.yaml

route:
  optimize: cost
  constraints:
    max_latency_ms: 500
    quality_min: 0.85
  prefer: anthropic
  fallback:
    - gemini-2.0-flash
    - gpt-4o-mini

→ claude-3-haiku-20240307 $0.00006 94ms ✓

High Availability

Never drop a request.
Ever.

When a provider rate-limits or goes down, Explane detects it within seconds and instantly reroutes to your fallback chain. Users see zero errors. Engineers sleep at night.

Automatic failover in under 50ms — no webhook needed
Configurable fallback chains per route, team, or model
Exponential backoff with jitter on retries

provider-failover · live incident triggered 2m ago

Normal

−4:20

Degraded

−2:41

Switch

−2:07

Stable

Now

Primary OpenAI GPT-4o Degraded

Fallback 1 Anthropic Claude ← routing here Active

Fallback 2 Google Gemini 2.0 Healthy

Auto-switched 127s ago · 0 requests dropped · 0 user errors

Scale Without Limits

Distribute load across
providers and API keys.

Stay under rate limits by spreading requests across multiple API keys and provider accounts. Scale to millions of requests per hour without engineering custom solutions.

Multi-key load balancing with rate-limit awareness
Weighted distribution — give preferred models more traffic
Priority queue: never let background jobs starve your users

Traffic distribution · last 60s live

OpenAI GPT-4o48%

2,024 req / min

Anthropic Claude 3.533%

1,392 req / min

Google Gemini 2.019%

802 req / min

Instant Integration

One endpoint.
All your providers.

Point your existing OpenAI SDK at api.explane.ai/v1 and you're done. No new SDKs. No refactoring. Explane is fully OpenAI API compatible.

OpenAI-compatible — works with any existing client
Two-line migration: swap api_key and base_url
Node.js, Python, Go, REST — all covered

Before

# Your existing setup, unchanged
client = OpenAI(
api_key="sk-proj-..."
)

After — that's it

client = OpenAI(
api_key="ex_live_...",
base_url="https://api.explane.ai/v1"
)
# All 50+ providers, automatic routing,
# failover, and observability — unlocked.

Every request tothe right model.

Routing rules thatthink in conditions.

Route by cost, latency,and quality requirements.

Never drop a request.Ever.

Distribute load acrossproviders and API keys.