Quality-graded routing — not a blind cheap-model swap.

Frontier quality. Open-source prices.

OrcaRouter grades every prompt and routes it — hard reasoning to frontier models, routine work to open-source — so you keep frontier answer quality while cutting spend ~40%. Quality-checked, never a blind downgrade. Zero token markup. One line to switch.

- client = OpenAI(api_key="sk-...")
+ client = OpenAI(
+ base_url="https://api.orcarouter.ai/v1",
+ api_key="sk-orca-..."
+ )
 
# Everything else stays the same.
response = client.chat.completions.create(
model="orcarouter/auto", # router picks the best model per request
messages=[{"role": "user", "content": "..."}]
)
# → orcarouter/auto grades the prompt → frontier or open-source, zero token markup ✓
claude-opus-4-7$5.00 in·$25.00 outAnthropic Direct
claude-sonnet-4-6$3.00 in·$15.00 outAnthropic Direct
gpt-5.5$5.00 in·$30.00 outOpenAI Direct
gemini-3.1-pro-preview$4.00 in·$18.00 outGoogle Direct
deepseek-v4-pro$0.560 in·$1.12 outDeepSeek
qwen3.6-plus$0.500 in·$3.00 outAlibaba Cloud
kimi-k2.6$0.900 in·$3.75 outMoonshot
seedance-2.0from $0.07 /sec·ByteDance
claude-opus-4-7$5.00 in·$25.00 outAnthropic Direct
claude-sonnet-4-6$3.00 in·$15.00 outAnthropic Direct
gpt-5.5$5.00 in·$30.00 outOpenAI Direct
gemini-3.1-pro-preview$4.00 in·$18.00 outGoogle Direct
deepseek-v4-pro$0.560 in·$1.12 outDeepSeek
qwen3.6-plus$0.500 in·$3.00 outAlibaba Cloud
kimi-k2.6$0.900 in·$3.75 outMoonshot
seedance-2.0from $0.07 /sec·ByteDance
0%
routing markup. ever.
~40%
less spend — same answer quality
200+
models, one endpoint
<1ms
routing overhead added
Building on this? Talk to us.
Feedback shapes the next ship.
How routing works

Routing you can see and control.

Graded, then direct

Every prompt is graded, then sent straight to the chosen model's first-party provider — no reseller in the middle. The grade, model, and provider are shown on each request, so a route is never a black box.

Provider terms apply end-to-end

Each upstream provider's data and usage terms apply directly to your traffic. Pin routing to the models and providers that match your policy.

🧾

Per-request auditability

Every call records the difficulty grade, the model chosen, the provider, and the published price. Reproduce any routing decision later from the dashboard.


Setup

Live in 60 seconds.

One URL change. Your existing SDK, model names, and streaming all work exactly as before.

Step 1
🔗

Point your SDK at us

Set base_url to api.orcarouter.ai/v1 and swap your API key. No other code changes needed.

Step 2

We grade & route

Each prompt is graded for difficulty in under 1ms, then sent to the model that holds answer quality at the lowest cost — frontier for hard reasoning, open-source for routine work.

Step 3

You pay provider cost

Whichever model is chosen, traffic goes direct to its first-party provider at their published rate. We add exactly $0 per token — our fee is on the plan, not your usage.


Live pricing

Every model.
Best available rate.

Quality-graded routing to the right model — frontier or open-source. Live prices refresh every 60s.

View all 200+ models →
ModelRouted toInput /MOutput /MContextQuality
claude-opus-4-7Anthropic Direct$5.00$25.001M10.0
claude-sonnet-4-6Anthropic Direct$3.00$15.001M7.0
gpt-5.5OpenAI Direct$5.00$30.001M10.0
gemini-3.1-pro-previewGoogle Direct$4.00$18.001M10.0
deepseek-v4-proDeepSeek$0.560$1.121M9.0
qwen3.6-plusAlibaba Cloud$0.500$3.001M8.0
kimi-k2.6Moonshot$0.900$3.75256K9.0
seedance-2.0ByteDancefrom $0.07 /sec10.0
+ 194 more models · Prices update every 60 seconds
Platform

Production-grade from day one.

Everything you need to run AI in production without managing multiple provider integrations.

Quality-graded routing

Every prompt is graded and sent to the model that holds quality at the lowest cost — frontier when it's hard, open-source when it's routine. Automatic.

Automatic failover

Provider goes down mid-stream? We switch transparently. Your app sees zero errors.

🔑

API key management

Issue keys per team or service with spend caps, model allowlists, and rate limits built in.

$

Per-request cost tracking

See what every request cost, the model and provider that handled it, and how much grading saved you.

OpenAI-compatible

Change one line. Same SDK, same model names, same streaming format. Zero migration effort.

🛡

Budget enforcement

Hard and soft limits per key, team, or org. Auto-resets monthly. Slack + webhook alerts.


Differentiator

Glass-box routing.

Every request shows the difficulty grade, the model chosen, the provider that served it, and the published price. Verifiable per call, reproducible later.

🔍

Per-request route attribution

Each completion is tagged with the difficulty grade, the model picked, and its first-party provider — Anthropic Direct, OpenAI Direct, Bedrock, Vertex — surfaced in your dashboard and headers.

📒

Published-rate ledger

Every token charge equals the provider's public list price. Audit any request against the provider's own pricing page in seconds.

Routing decisions you can replay

Difficulty grades, model choices, failover events, and health swaps are logged with timestamps. Reproduce any request's routing path.


Pricing

Routing is free.
Pay for features.

We never take a cut of your token spend. Our revenue comes from optional team features.

Zero markup guarantee
You pay providers directly at their published rates. We add nothing on top of token costs. Routing is free; the optional Team plan funds the platform.
$0.00routing fee

Hacker

Free
Forever. Zero markup on all tokens.
✓ 3 API keys
✓ All 200+ models
✓ Auto failover
✓ Basic dashboard
Start free

Enterprise

Custom
SLA commitments + private deployment.
✓ Private deployment option
✓ Custom routing rules
✓ 99.99% uptime SLA
✓ Dedicated support
✓ Audit logs & compliance
Trust & Compliance

Independently audited. Continuously compliant.

Audit reports available under NDA — request a copy below.

Start routing smarter.

Sign up with GitHub — $5 in tokens free. No credit card required. Swap one line of code and you're live.