I Tracked 100 Million Tokens of Claude Code Usage. 99.4% Were Wasted on Reading.

A developer tracked every token Claude Code consumed for a month. The result: 99.4% were input tokens. For every 1 token written, 166 were consumed reading. Here's what that means for your bill.

A developer recently published the results of tracking their Claude Code usage for an entire month. The numbers are staggering:

- ~100 million tokens consumed in a single month
- 99.4% of them input tokens, only 0.6% output
- roughly 166 tokens read for every 1 token written

This isn’t an outlier. Independent measurements across multiple codebases tell the same story. A study tracking 42 executions across FastAPI’s source code (~800 Python files) found that 70% of all tokens were waste — consumed by file reading and navigation, not reasoning or code generation.

Another benchmark found that 76% of tokens were consumed specifically by file read operations. A separate measurement on ripgrep’s codebase showed that a single code investigation consumed 20,580 tokens across 5+ tool calls — and 87% of that could have been eliminated with structural understanding.
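A quick back-of-the-envelope check shows the headline numbers are self-consistent: a 99.4% input share implies almost exactly the 166:1 read-to-write ratio quoted above.

```python
# Back-of-the-envelope check: if 99.4% of 100M tokens are input,
# the implied read-to-write ratio matches the ~166:1 headline figure.
total_tokens = 100_000_000
input_share = 0.994

input_tokens = total_tokens * input_share      # tokens spent reading
output_tokens = total_tokens - input_tokens    # tokens actually written

ratio = input_tokens / output_tokens
print(f"{ratio:.0f} tokens read per token written")  # 166
```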

Where Your Tokens Actually Go

Here’s where the tokens go in a typical Claude Code session on a real codebase. You’re paying for a 200K-token context window, but your AI gets to think with only 10K–30K of it. The rest is consumed by system overhead, raw file reads, and buffers reserved for the response.
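Put as a percentage, that allocation is stark. A two-line sketch, using the 10K–30K reasoning range from above:

```python
# Effective share of the 200K context window left for reasoning,
# using the 10K-30K range cited in the article.
context_window = 200_000
reasoning_low, reasoning_high = 10_000, 30_000

low_pct = 100 * reasoning_low / context_window
high_pct = 100 * reasoning_high / context_window
print(f"{low_pct:.0f}%-{high_pct:.0f}% of the window does useful thinking")  # 5%-15%
```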

The Cost Math

At Opus 4.6 pricing ($15/M input tokens), the math is simple: 100 million input tokens a month works out to 100M × $15/M = $1,500 per developer per month, almost all of it spent on reading.

At Sonnet 4.6 pricing ($3/M input tokens), the bill drops to $300 per developer per month, but the ratio is the same: 99.4% of that spend goes to reading, not writing.

And this calculation assumes each query is independent. In practice, context accumulates across turns, so later queries in a session cost far more than earlier ones.
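The arithmetic above can be sketched in a few lines. The rates are the per-million input prices quoted above; the volume is the 100M-token monthly figure from the tracking data:

```python
# Monthly input-token cost per developer at the quoted per-million rates.
MONTHLY_INPUT_TOKENS = 100_000_000                  # ~100M tokens/month per developer
PRICE_PER_MILLION = {"opus": 15.0, "sonnet": 3.0}   # $/M input tokens

def monthly_cost(model: str, tokens: int = MONTHLY_INPUT_TOKENS) -> float:
    """Dollar cost of a month of input tokens for the given model."""
    return tokens / 1_000_000 * PRICE_PER_MILLION[model]

print(monthly_cost("opus"))    # 1500.0
print(monthly_cost("sonnet"))  # 300.0
```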

Why It Gets Worse Over Time

The cost per prompt grows with every turn, because each new prompt re-sends the entire conversation so far, including every file the agent has read, as fresh input tokens. Per-prompt cost therefore climbs roughly linearly with turn count, and total session cost grows quadratically.

By turn 15, each prompt is re-processing the full conversation history plus all accumulated codebase reads. The last message in a session costs 3–5× more than the first message.
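A toy model makes the accumulation concrete. The numbers here are assumptions chosen only for illustration: a 20K-token starting context (system prompt plus initial reads) and 3K tokens of new reads and replies added per turn.

```python
# Toy model: each turn re-sends everything accumulated so far as input.
BASE_CONTEXT = 20_000     # assumed: system prompt + initial file reads
TOKENS_PER_TURN = 3_000   # assumed: new file reads + replies added each turn

def input_cost_of_turn(turn: int) -> int:
    """Input tokens re-processed when sending prompt number `turn`."""
    return BASE_CONTEXT + (turn - 1) * TOKENS_PER_TURN

print(input_cost_of_turn(1))   # 20000
print(input_cost_of_turn(15))  # 62000, ~3.1x turn 1
```

Under these assumed numbers, turn 15 costs about 3× turn 1, consistent with the 3–5× range above; the exact multiplier depends on how large the first prompt is and how much each turn reads.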

The Re-Reading Loop

The most expensive aspect isn’t any single session — it’s the re-reading loop across sessions:

Every morning, every developer opens Claude Code. The agent has zero memory from yesterday. It re-reads the same files it read yesterday. And the day before. And last week.

5 developers on one team = 5× the same re-discovery cost, every day. The same files, the same dependencies, the same architecture — rediscovered from scratch by every developer in every session.

Over a month, a 50-developer team might consume 100 million tokens per developer — and 99.4% of those tokens go to input. Reading. Not writing.
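Scaling the per-developer math up to the 50-developer example gives the team-level bill. A sketch using the article's figures and the Opus input rate:

```python
# Team-scale monthly spend on input tokens, using the article's figures.
DEVELOPERS = 50
TOKENS_PER_DEV = 100_000_000   # ~100M tokens/month per developer
INPUT_SHARE = 0.994            # 99.4% of tokens are input
OPUS_PRICE_PER_M = 15.0        # $/M input tokens

input_tokens = DEVELOPERS * TOKENS_PER_DEV * INPUT_SHARE
monthly_bill = input_tokens / 1_000_000 * OPUS_PRICE_PER_M
print(f"${monthly_bill:,.0f}/month spent reading")  # $74,550/month
```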

The Structural Fix

The 166:1 read-to-write ratio isn't a Claude problem; it's an architecture problem. Every AI coding agent (Cursor, Copilot, Codex) has the same fundamental issue: no persistent map, so it reads everything from scratch every time. ByteBell's Smart Context Refresh replaces this brute-force reading with a pre-computed code intelligence graph, reducing token consumption by 50–70% and bringing the effective cost per query from $2–$30 down to $0.04–$0.08, because the AI queries structured metadata instead of re-reading your entire codebase on every prompt. Learn more at bytebell.ai