How Model Routing Can Cut Support Bot Costs by 75%

An illustrative scenario showing how a support bot using GPT-4o for every query could cut monthly spend from $3,200 to $800 with simple model routing.

Consider a typical scenario: a customer support bot sends every query to GPT-4o and costs around $3,200 per month. What would AgentBurn's cost visibility reveal if you went looking for optimization opportunities?

The Likely Discovery

Based on typical support bot workloads, a cost dashboard would likely reveal:

  1. A large portion of queries — often 60-80% — are simple FAQ-type questions (password resets, billing inquiries, feature locations)
  2. System prompts tend to grow over time, inflating per-call token counts

When simple queries are sent to an expensive model, you're paying premium prices for commodity work.

The Optimization

Step 1: Classify incoming queries using GPT-4o-mini (cost: ~$0.0002 per classification). Route simple queries to GPT-4o-mini, complex ones to GPT-4o.
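
As a rough sketch of the routing decision in Step 1, the function below picks a model from a classifier label. Everything except the two model names is an assumption for illustration: the "simple"/"complex" labels, the `pick_model` helper, and the keyword-based `keyword_classifier`, which is only an offline stand-in for the GPT-4o-mini classification call described above.

```python
from typing import Callable

# Model names come from the article; the label vocabulary is an assumption.
CHEAP_MODEL = "gpt-4o-mini"
PREMIUM_MODEL = "gpt-4o"


def keyword_classifier(query: str) -> str:
    """Offline stand-in (hypothetical) for the GPT-4o-mini classification call."""
    faq_markers = ("password", "billing", "invoice", "where is", "where do i")
    text = query.lower()
    return "simple" if any(marker in text for marker in faq_markers) else "complex"


def pick_model(query: str, classify: Callable[[str], str] = keyword_classifier) -> str:
    """Return the model a query should be sent to.

    Any label other than "simple" falls back to the premium model, so an
    unexpected classifier output costs money rather than answer quality.
    """
    label = classify(query).strip().lower()
    return CHEAP_MODEL if label == "simple" else PREMIUM_MODEL


print(pick_model("How do I reset my password?"))
print(pick_model("My export job fails intermittently after the last upgrade"))
```

In a real deployment, `classify` would wrap an LLM call. Passing it in as a parameter keeps the routing rule testable without network access.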

Step 2: Split the system prompt into a base prompt (~200 tokens) and context modules loaded on demand. Simple queries get the base prompt only.
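
One way Step 2 might look in code is below. The prompt text, the module names, and the `build_system_prompt` helper are all hypothetical. The only idea taken from the article is that simple queries get the base prompt and nothing else.

```python
# Hypothetical base prompt; the article suggests keeping it around 200 tokens.
BASE_PROMPT = "You are a support assistant. Answer concisely and accurately."

# Hypothetical context modules, loaded only when a query needs them.
CONTEXT_MODULES = {
    "billing": "Billing context: plans, invoices, and refund rules go here.",
    "integrations": "Integration context: API keys and webhook setup go here.",
}


def build_system_prompt(complexity: str, topics: list[str]) -> str:
    """Assemble the system prompt for one query.

    Simple queries get the base prompt only; complex queries also get the
    modules matching their topics. Unknown topics are skipped.
    """
    if complexity == "simple":
        return BASE_PROMPT
    extras = [CONTEXT_MODULES[topic] for topic in topics if topic in CONTEXT_MODULES]
    return "\n\n".join([BASE_PROMPT, *extras])


print(build_system_prompt("complex", ["billing"]))
```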

Projected Impact

Based on published model pricing (GPT-4o-mini is ~17x cheaper than GPT-4o for input tokens), a scenario like this could yield:

  • Before: ~$3,200/month (100% GPT-4o)
  • After: ~$800/month (majority GPT-4o-mini, remainder GPT-4o)
  • Estimated savings: ~$2,400/month (~75% reduction)
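
To see how sensitive that projection is to the share of simple queries, here is a back-of-the-envelope calculation. It rests on simplifying assumptions that are not in the scenario itself: simple and complex queries use similar token counts, the ~17x price ratio applies to the whole bill, and the classifier overhead is ignored. The `projected_monthly_cost` helper is illustrative only.

```python
def projected_monthly_cost(baseline: float, simple_share: float,
                           price_ratio: float = 17.0) -> float:
    """Blended monthly cost after routing `simple_share` of spend to the cheaper model.

    Assumes per-query token counts are similar across query types and ignores
    the cost of the classification call.
    """
    if not 0.0 <= simple_share <= 1.0:
        raise ValueError("simple_share must be between 0 and 1")
    routed = baseline * simple_share / price_ratio   # spend moved to the cheap model
    remaining = baseline * (1.0 - simple_share)      # spend left on the premium model
    return routed + remaining


for share in (0.6, 0.7, 0.8):
    cost = projected_monthly_cost(3200, share)
    print(f"{share:.0%} routed: ${cost:,.0f}/month ({1 - cost / 3200:.0%} saved)")
```

Under these assumptions, routing 80% of traffic lands near $790/month, in line with the ~$800 figure. Routing 60% leaves the bill closer to $1,390, so the trimmed system prompt from Step 2 would have to make up the difference.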

The key insight isn't the routing itself — it's the visibility. Without per-query cost tracking, you have no way to know that most queries don't need an expensive model. A cost dashboard makes the waste obvious.
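
For a sense of what per-query cost tracking involves, the sketch below turns token counts into dollars and totals them by query category. It is not AgentBurn's API. The function names and record layout are made up for illustration, and the prices are assumed list prices per 1M tokens that should be checked against current published pricing.

```python
from collections import defaultdict

# Assumed list prices in USD per 1M tokens; verify before relying on them.
PRICES = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
}


def query_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one call, computed from its token usage."""
    price = PRICES[model]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


def spend_by_category(records):
    """Sum cost per category from (category, model, input_tokens, output_tokens) tuples."""
    totals = defaultdict(float)
    for category, model, input_tokens, output_tokens in records:
        totals[category] += query_cost(model, input_tokens, output_tokens)
    return dict(totals)


# Hypothetical usage log: two FAQ queries and one escalation, all sent to GPT-4o.
log = [
    ("faq", "gpt-4o", 1000, 200),
    ("faq", "gpt-4o", 1000, 200),
    ("escalation", "gpt-4o", 3000, 500),
]
print(spend_by_category(log))
```

Grouping spend by category is what shows how much of the bill goes to FAQ-style traffic that a cheaper model could handle.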

Tags: case-study, customer-support, optimization, model-routing
