A developer tracked every token Claude Code consumed for a month. The result: 99.4% were input tokens. For every 1 token written, 166 were consumed reading. Here's what that means for your bill.

A developer recently published the results of tracking their Claude Code usage for an entire month. The numbers are staggering: 99.4% of every token consumed was input, roughly 166 tokens read for every token written.
This isn’t an outlier. Independent measurements across multiple codebases tell the same story. A study tracking 42 executions across FastAPI’s source code (~800 Python files) found that 70% of all tokens were waste — consumed by file reading and navigation, not reasoning or code generation.
Another benchmark found that 76% of tokens were consumed specifically by file read operations. A separate measurement on ripgrep’s codebase showed that a single code investigation consumed 20,580 tokens across 5+ tool calls — and 87% of that could have been eliminated with structural understanding.
Here's the breakdown of a typical Claude Code session on a real codebase: you're paying for a 200K-token context window, but your AI gets to think with only 10K–30K of those tokens. The rest goes to system overhead, file reading, and reserved buffers.
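A minimal sketch of where those 200K tokens go. The individual line items below are illustrative assumptions; only the 200K total and the 10K–30K usable-reasoning range come from the measurements above.

```python
# Rough context-window allocation sketch. Every line item is an
# assumed figure for illustration; only the 200K total and the
# 10K-30K reasoning budget are from the measurements in the text.
CONTEXT_WINDOW = 200_000

allocation = {
    "system prompt + tool definitions": 25_000,   # assumed
    "file reads and search results":    120_000,  # assumed
    "conversation history":             25_000,   # assumed
    "reserved output buffer":           10_000,   # assumed
}
reasoning = CONTEXT_WINDOW - sum(allocation.values())

for name, tokens in allocation.items():
    print(f"{name:34s} {tokens:>8,} ({tokens / CONTEXT_WINDOW:5.1%})")
print(f"{'left for actual reasoning':34s} {reasoning:>8,} "
      f"({reasoning / CONTEXT_WINDOW:5.1%})")
```

Under this split, only 10% of the window is left for reasoning, at the bottom of the observed 10K–30K range.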
At Opus 4.6 pricing ($15/M input tokens), nearly the entire per-query cost is input. At Sonnet 4.6 pricing ($3/M input tokens), the absolute numbers are lower, but the ratio is identical: you pay almost entirely for reading, not writing.
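A back-of-the-envelope version of that calculation. The per-query token counts and the output prices are assumptions (output priced at 5× input, a common ratio); the input prices and the 166:1 ratio are from the text.

```python
# Per-query cost sketch. Assumed: a query that reads 100K input tokens,
# with output derived from the 166:1 read-to-write ratio. Output prices
# (5x input) are also an assumption; input prices are from the text.
INPUT_TOKENS = 100_000
OUTPUT_TOKENS = INPUT_TOKENS // 166   # ~600 tokens actually written

def query_cost(input_price_per_m, output_price_per_m):
    """Dollar cost of one query at the given per-million-token prices."""
    return (INPUT_TOKENS * input_price_per_m
            + OUTPUT_TOKENS * output_price_per_m) / 1_000_000

print(f"Opus   ($15/M in, $75/M out assumed): ${query_cost(15, 75):.2f}")
print(f"Sonnet ($3/M in,  $15/M out assumed): ${query_cost(3, 15):.2f}")
```

Even with output priced at 5× input, output contributes about 3% of the bill: the 166:1 ratio means the economics are dominated entirely by reading.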
And all of this assumes each query is independent. In practice, context accumulates across turns, so later queries in a session cost far more than earlier ones.
The cost per prompt grows with every turn, and the cumulative session cost grows quadratically, not linearly. Each new prompt re-processes the full conversation history plus all accumulated codebase reads, so by turn 15 a single message costs 3–5× more than the first.
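That growth can be sketched in a few lines. The base-context and per-turn figures are illustrative assumptions; only the "3–5× by turn 15" observation comes from the measurements above.

```python
# Per-turn input-token growth in a session. BASE_CONTEXT and
# ADDED_PER_TURN are assumed figures for illustration only.
BASE_CONTEXT = 30_000    # assumed: system prompt + initial file reads
ADDED_PER_TURN = 8_000   # assumed: new reads + conversation per turn

def tokens_for_turn(n):
    """Input tokens re-processed on turn n (1-indexed): the whole history."""
    return BASE_CONTEXT + ADDED_PER_TURN * (n - 1)

per_turn = [tokens_for_turn(n) for n in range(1, 16)]
print(f"turn 1:  {per_turn[0]:,} input tokens")
print(f"turn 15: {per_turn[-1]:,} input tokens "
      f"({per_turn[-1] / per_turn[0]:.1f}x the first turn)")
print(f"session total: {sum(per_turn):,} input tokens")
```

Per-turn cost is linear in the turn number, but since every turn re-bills the whole history, the session total is quadratic: doubling the session length roughly quadruples the bill.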
The most expensive aspect isn’t any single session — it’s the re-reading loop across sessions:
Every morning, every developer opens Claude Code. The agent has zero memory from yesterday. It re-reads the same files it read yesterday. And the day before. And last week.
5 developers on one team = 5× the same re-discovery cost, every day. The same files, the same dependencies, the same architecture — rediscovered from scratch by every developer in every session.
Over a month, each developer on a 50-person team might consume 100 million tokens, and 99.4% of those tokens go to input. Reading. Not writing.
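Putting the team-level numbers together (50 developers, 100M tokens each, 99.4% input, with the input prices from earlier):

```python
# Monthly input-token bill for the team described above.
# All figures (50 devs, 100M tokens/dev, 99.4% input share,
# $3/M and $15/M input pricing) are taken from the text.
DEVS = 50
TOKENS_PER_DEV = 100_000_000
INPUT_SHARE = 0.994

def monthly_input_cost(input_price_per_m):
    """Dollars per month spent on input tokens alone."""
    input_tokens = DEVS * TOKENS_PER_DEV * INPUT_SHARE
    return input_tokens * input_price_per_m / 1_000_000

print(f"Sonnet ($3/M):  ${monthly_input_cost(3):,.0f}/month on input")
print(f"Opus  ($15/M):  ${monthly_input_cost(15):,.0f}/month on input")
```

On Sonnet that's roughly $15K a month, on Opus roughly $75K, almost all of it spent re-reading files the agent has already read before.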
The 166:1 read-to-write ratio isn't a Claude problem, it's an architecture problem. Every AI coding agent (Cursor, Copilot, Codex) has the same fundamental issue: no persistent map, so it reads everything from scratch every time. ByteBell's Smart Context Refresh replaces this brute-force reading with a pre-computed code intelligence graph, reducing token consumption by 50–70% and bringing the effective cost per query down to $0.04–$0.08, because the AI queries structured metadata instead of re-reading your entire codebase on every prompt. Learn more at bytebell.ai