If your agents write and execute code, you're paying for two things: the LLM calls to generate code, and the compute to run it. Most cost tracking tools only see the first part. AgentBurn tracks both.
The E2B Cost Model
E2B bills per second of sandbox compute time. A sandbox that runs for 30 seconds can easily cost more than the LLM call that generated the code, and for iterative code generation (write → run → fix → run) total compute cost can match or exceed total LLM cost.
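As a rough sketch of that arithmetic (both rates below are made-up placeholders, not real E2B or model pricing):

```python
# Hypothetical rates -- substitute your actual E2B tier and model pricing.
E2B_COST_PER_SECOND = 0.0001   # assumed sandbox rate, USD per second
LLM_COST_PER_CALL = 0.01       # assumed cost of one code-generation call, USD

def iteration_cost(llm_calls: int, sandbox_seconds: float) -> dict:
    """Split a task's cost into its LLM and compute portions."""
    llm = llm_calls * LLM_COST_PER_CALL
    compute = sandbox_seconds * E2B_COST_PER_SECOND
    return {"llm_usd": llm, "compute_usd": compute, "total_usd": llm + compute}

# A write -> run -> fix -> run loop: 3 LLM calls, 3 runs of ~20 s each
print(iteration_cost(llm_calls=3, sandbox_seconds=60))
```

Plug in your real rates and a typical iteration count to see which side dominates for your workload.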
Instrumenting E2B Costs
```python
import json
import time

# run_code lives on the Sandbox in the e2b_code_interpreter package
from e2b_code_interpreter import Sandbox

E2B_COST_PER_SECOND = 0.0001  # placeholder -- use the rate for your E2B tier

sandbox = Sandbox()
start = time.monotonic()  # monotonic clock: safe for measuring durations
result = sandbox.run_code(generated_code)
duration = time.monotonic() - start

# Track the compute cost alongside your LLM costs
ingest_event(
    agent_id="code-gen-agent",
    provider="e2b",
    operation="sandbox_execution",
    cost_usd=duration * E2B_COST_PER_SECOND,
    metadata=json.dumps({
        "duration_s": duration,
        "failed": result.error is not None,  # run_code returns an Execution with .error
    }),
)
```
Illustrative Cost Split
For a code generation agent that iterates until tests pass, the cost breakdown might look like:
- LLM calls (code generation): Multiple iterations at a few cents per call
- E2B sandbox (execution): Multiple runs at several seconds each — compute cost can match or exceed LLM cost
Without tracking E2B costs alongside LLM costs, you might think each task costs a fraction of what it actually does. The compute portion is invisible unless you record it explicitly.
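One way to make that portion visible is to group everything ingested via ingest_event by provider. The events list below is fabricated for illustration; in practice you would query whatever store AgentBurn writes to:

```python
from collections import defaultdict

# Fabricated example events -- one row per ingest_event call
events = [
    {"provider": "openai", "cost_usd": 0.012},  # code-generation call
    {"provider": "openai", "cost_usd": 0.011},  # fix-up call
    {"provider": "e2b", "cost_usd": 0.018},     # sandbox execution
]

def cost_by_provider(events: list[dict]) -> dict:
    """Sum tracked costs per provider to expose the LLM/compute split."""
    totals = defaultdict(float)
    for event in events:
        totals[event["provider"]] += event["cost_usd"]
    return dict(totals)

print(cost_by_provider(events))
```

With only the LLM rows tracked, this task looks roughly half as expensive as it really is.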
Optimization Strategies
- Keep sandboxes warm — Reuse sandboxes across iterations to avoid cold start costs
- Set execution timeouts — Cap sandbox runtime at 30s to prevent infinite loops
- Generate tests first — Let the agent write tests before code to reduce iteration cycles
- Track iteration count — Use AgentBurn's metadata field to log how many attempts each task takes
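The reuse, timeout, and iteration-tracking strategies above can be sketched as a generic loop. Everything here is hypothetical scaffolding: `run` stands in for executing code in a single warm sandbox (e.g. `sandbox.run_code(code, timeout=30)` against one reused sandbox), and `generate_fix` stands in for the LLM call that revises the code.

```python
MAX_ITERATIONS = 5  # cap attempts so a stuck agent can't burn compute forever

def run_until_passing(initial_code, run, generate_fix,
                      max_iterations=MAX_ITERATIONS):
    """Iterate write -> run -> fix against one warm sandbox.

    `run(code)` executes the code and returns an error (or None on
    success); `generate_fix(code, error)` asks the LLM for a revision.
    Both are stand-ins for your own stack. Returns (code, attempts) so
    the attempt count can go into AgentBurn's metadata field.
    """
    code = initial_code
    for attempt in range(1, max_iterations + 1):
        error = run(code)
        if error is None:
            return code, attempt
        code = generate_fix(code, error)
    raise RuntimeError(f"no passing code after {max_iterations} attempts")
```

Keeping the sandbox outside the loop avoids paying a cold start on every attempt, and the iteration cap bounds worst-case compute spend per task.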