1-hour cache writes priced at 5-minute rate #276

@marcin-kurdziel

Bug

CodeBurn applies the 5-minute cache-write rate to all cache writes, including 1-hour ones. Anthropic charges:

  • 5m cache write: 1.25× input (e.g. $6.25/MTok on Opus 4.7)
  • 1h cache write: 2× input (e.g. $10.00/MTok on Opus 4.7)

Source: https://docs.anthropic.com/en/docs/about-claude/pricing

Root cause is that FALLBACK_PRICING (and the LiteLLM fields it mirrors) only carries one cache_creation_input_token_cost, which corresponds to the 5m rate.

Real-world impact on a Claude Code Opus 4.7 day: ~14% under-report. Plan-mode and long agent sessions use 1h cache heavily, so the gap scales with that share of the workload.
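To make the scaling concrete, here is a rough sketch of how much of the cache-write cost goes missing as the 1h share grows. It uses only the two multipliers quoted above; the function name and the `share_1h` parameter are made up for illustration. It covers cache-write cost in isolation, so it will not reproduce the ~14% figure, which depends on the rest of the day's token mix.

```python
# Multipliers relative to the base input price, from the rates quoted above.
RATE_5M = 1.25
RATE_1H = 2.0


def cache_write_underreport(share_1h: float) -> float:
    """Fraction of the true cache-write cost that is missed when every
    cache write is priced at the 5m rate.

    share_1h is the fraction of cache-write tokens that used the 1h TTL
    (hypothetical parameter, for illustration only).
    """
    if not 0.0 <= share_1h <= 1.0:
        raise ValueError("share_1h must be between 0 and 1")
    # True cost per token, in units of the base input price.
    true_cost = RATE_5M * (1.0 - share_1h) + RATE_1H * share_1h
    # Reported cost per token: everything billed at the 5m multiplier.
    reported_cost = RATE_5M
    return 1.0 - reported_cost / true_cost


if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5, 1.0):
        gap = cache_write_underreport(share)
        print(f"1h share {share:.0%}: cache-write cost under-reported by {gap:.1%}")
```

At a 100% 1h share the gap on cache-write cost is 1 - 1.25/2 = 37.5%, which is the worst case for this line item.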

Data is already there

Claude Code session JSONL splits the two:

"usage": {
  "cache_creation_input_tokens": 60120,
  "cache_creation": {
    "ephemeral_5m_input_tokens": 0,
    "ephemeral_1h_input_tokens": 60120
  }
}

So no new data source is needed; only the parser and the pricer have to change.
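For illustration, the parser side could look roughly like the sketch below. It is written in Python for brevity and is not CodeBurn's actual code; `split_cache_writes` is a hypothetical helper, and the only field names it relies on are the ones in the JSONL sample above.

```python
def split_cache_writes(usage: dict) -> tuple[int, int]:
    """Return (tokens_5m, tokens_1h) for one usage record.

    Hypothetical helper: reads the per-TTL breakdown when it is present,
    and otherwise treats the legacy total as 5m writes.
    """
    breakdown = usage.get("cache_creation")
    if isinstance(breakdown, dict):
        tokens_5m = int(breakdown.get("ephemeral_5m_input_tokens") or 0)
        tokens_1h = int(breakdown.get("ephemeral_1h_input_tokens") or 0)
        return tokens_5m, tokens_1h
    # Legacy record without a breakdown: the TTL is unknown, so keep
    # today's behavior and count everything as a 5m write.
    return int(usage.get("cache_creation_input_tokens") or 0), 0


if __name__ == "__main__":
    sample = {
        "cache_creation_input_tokens": 60120,
        "cache_creation": {
            "ephemeral_5m_input_tokens": 0,
            "ephemeral_1h_input_tokens": 60120,
        },
    }
    print(split_cache_writes(sample))
    print(split_cache_writes({"cache_creation_input_tokens": 60120}))
```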

Suggested fix

Read usage.cache_creation.ephemeral_5m_input_tokens and usage.cache_creation.ephemeral_1h_input_tokens separately. Price the 5m portion at the existing rate; price the 1h portion at 1.6 × cache_creation_input_token_cost (since 2.0 / 1.25 = 1.6, constant across current Anthropic models). Fall back to current behavior when only the legacy cache_creation_input_tokens field is present.
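A minimal sketch of the pricing rule described above, in Python for brevity. The function name is hypothetical, and `cache_creation_input_token_cost` is assumed to be expressed in dollars per token. The 1.6 multiplier is the hardcoded ratio proposed in this issue, not a value read from any pricing table.

```python
# 1h rate / 5m rate = 2.0 / 1.25 = 1.6 (ratio proposed in this issue).
ONE_HOUR_MULTIPLIER = 2.0 / 1.25


def cache_write_cost(usage: dict, cache_creation_input_token_cost: float) -> float:
    """Price cache writes for one usage record.

    cache_creation_input_token_cost is the existing (5m) rate, assumed
    here to be in dollars per token.
    """
    breakdown = usage.get("cache_creation")
    if not isinstance(breakdown, dict):
        # Legacy field only: fall back to current behavior (all at 5m rate).
        legacy_tokens = usage.get("cache_creation_input_tokens") or 0
        return legacy_tokens * cache_creation_input_token_cost

    tokens_5m = breakdown.get("ephemeral_5m_input_tokens") or 0
    tokens_1h = breakdown.get("ephemeral_1h_input_tokens") or 0
    cost_5m = tokens_5m * cache_creation_input_token_cost
    cost_1h = tokens_1h * cache_creation_input_token_cost * ONE_HOUR_MULTIPLIER
    return cost_5m + cost_1h


if __name__ == "__main__":
    usage = {
        "cache_creation_input_tokens": 60120,
        "cache_creation": {
            "ephemeral_5m_input_tokens": 0,
            "ephemeral_1h_input_tokens": 60120,
        },
    }
    rate_5m = 6.25 / 1_000_000  # $6.25/MTok, the Opus 4.7 example above
    print(f"with split:  ${cache_write_cost(usage, rate_5m):.4f}")
    legacy = {"cache_creation_input_tokens": 60120}
    print(f"legacy only: ${cache_write_cost(legacy, rate_5m):.4f}")
```

With the sample record from this issue, the split-aware price works out to 60,120 tokens at $10.00/MTok, versus $6.25/MTok under the current behavior.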

A parallel issue against BerriAI/litellm to add an explicit cache_creation_input_1h_token_cost field would let downstream consumers drop the hardcoded 1.6× multiplier.
