Back to Blog
·6 min read·guide

How to Reduce Anthropic API Costs by 40% with Smart Model Routing

Not every task needs Claude Opus. Learn how to implement a model routing strategy that sends tasks to the cheapest model that can handle them.

Claude Opus is brilliant. It's also expensive. At $15/million input tokens and $75/million output tokens, using it for every task is like taking a helicopter to the grocery store.

The Model Tier Strategy

Anthropic offers three tiers with dramatically different pricing:

ModelInput (per 1M)Output (per 1M)Best For
Claude Haiku$0.25$1.25Classification, extraction, simple Q&A
Claude Sonnet$3.00$15.00Analysis, coding, multi-step reasoning
Claude Opus$15.00$75.00Complex research, long-form writing, nuanced tasks

Haiku is 60x cheaper than Opus for input tokens. For many agent tasks — intent classification, entity extraction, simple summarization — Haiku performs just as well.

Implementing a Router

The simplest approach: use a cheap model to classify task complexity, then route to the appropriate tier.

# Use Haiku to classify, then route
classification = call_haiku("Classify this task as simple/medium/complex: " + task)

model_map = {
    "simple": "claude-3-5-haiku-20241022",
    "medium": "claude-sonnet-4-20250514",
    "complex": "claude-3-opus-20240229"
}

result = call_model(model_map[classification], task)

The classification call costs fractions of a cent and saves dollars on every task that doesn't need Opus.

Measuring the Impact

Track costs per model in AgentBurn before and after implementing routing. Based on published pricing differences, even basic two-tier routing (Haiku + Sonnet) can reduce Anthropic spend substantially — the exact savings depend on your task distribution.

The dashboard's model breakdown chart makes this immediately visible — you'll see your cost distribution shift from a single expensive model to a mix of cheap and expensive, with the total trending down.

anthropicclaudeoptimizationmodel-routing

Start tracking your AI agent costs

Open-source. Self-hosted. Free forever for the core engine.

Related Articles