For every 1 token your AI writes, it reads roughly 165. That 165:1 ratio explains why AI coding is expensive, slow, and constantly hitting limits. Here's the data.

When most people think about AI coding costs, they think about the output — the code the AI generates. But the real cost isn’t in writing code. It’s in reading code.
Across 100 million tracked tokens of Claude Code usage, the ratio was clear: 99.4% input tokens, 0.6% output tokens. For every single token of code or explanation the AI generated, it consumed roughly 165 tokens reading files, processing commands, and navigating the codebase.
A 165:1 read-to-write ratio.
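The headline ratio falls straight out of the measured shares. A quick sanity check (the 100M-token total and the 99.4%/0.6% split are the article's figures; the published 165:1 rounds the result down):

```python
# Derive the read-to-write ratio from the measured token shares.
total_tokens = 100_000_000            # tracked Claude Code usage
input_share, output_share = 0.994, 0.006

input_tokens = total_tokens * input_share    # tokens spent reading
output_tokens = total_tokens * output_share  # tokens spent writing
ratio = input_tokens / output_tokens

print(f"about {ratio:.1f} tokens read per token written")
```

The exact figure depends on how the underlying counts were rounded before the percentages were published, which is why 165:1 and 166:1 both appear in discussions of this data.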
This isn’t unique to Claude. Every AI coding agent — Cursor, Copilot, Codex, Cline — uses the same fundamental approach: read files from the filesystem to build understanding, then generate output. The read-to-write ratio varies by tool and codebase size, but independent measurements consistently land in the same range.
AI coding agents don’t have a map. They don’t have an index. They start every session with zero understanding of your codebase.
To answer a question like “How does the payment validation flow work?”, the agent must (a representative sequence):

1. Search the codebase for relevant terms (“payment”, “validation”)
2. Read the candidate files the search returns
3. Follow imports and references into related modules
4. Read those files too, often in full
5. Assemble a working model of the flow
6. Generate the answer

Steps 1–5 consume tokens. Step 6 generates tokens. The ratio reflects that understanding requires far more reading than writing.
On a typical codebase of 200–500 files, a single complex question involves 40–60 search/read operations. Each operation consumes thousands of tokens. The reading dwarfs the writing by two orders of magnitude.
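The per-question arithmetic is easy to sketch. The 2,000-tokens-per-operation average below is an assumption (the article says only “thousands” per operation); the operation counts and answer size are the article's figures:

```python
# Rough token budget for one complex question on a 200-500 file codebase.
ops_low, ops_high = 40, 60       # search/read operations per question
tokens_per_op = 2_000            # assumed average; "thousands" per op
answer_tokens = 600              # typical output size

read_low = ops_low * tokens_per_op    # total reading, low end
read_high = ops_high * tokens_per_op  # total reading, high end

print(f"reading: {read_low:,}-{read_high:,} tokens, writing: {answer_tokens}")
print(f"read-to-write ratio: {read_low // answer_tokens}:1 "
      f"to {read_high // answer_tokens}:1")
```

Even with a conservative per-operation estimate, reading outweighs writing by two orders of magnitude, which matches the measured 165:1 figure.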
At current pricing (Sonnet 4.6: $3/M input tokens, $15/M output tokens), a query that consumes 100K input tokens and 600 output tokens costs:

- Input: 100,000 tokens × $3/M = $0.30
- Output: 600 tokens × $15/M = $0.009
- Total: roughly $0.31, with about 97% of it spent on reading

At Opus 4.6 pricing ($15/M input, $75/M output), the same query costs:

- Input: 100,000 tokens × $15/M = $1.50
- Output: 600 tokens × $75/M = $0.045
- Total: roughly $1.55, the same 97/3 split
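The cost split can be computed for any model. A minimal sketch, assuming per-million-token rates of $3 input / $15 output (Sonnet) and $15 input / $75 output (Opus):

```python
# Split a query's cost into reading (input) and writing (output).
def query_cost(input_tokens, output_tokens, in_price, out_price):
    """Return (input_cost, output_cost) in dollars; prices are $/M tokens."""
    input_cost = input_tokens / 1_000_000 * in_price
    output_cost = output_tokens / 1_000_000 * out_price
    return input_cost, output_cost

# Assumed rates, in $ per million tokens.
for name, (in_price, out_price) in {"Sonnet": (3, 15), "Opus": (15, 75)}.items():
    inp, out = query_cost(100_000, 600, in_price, out_price)
    total = inp + out
    print(f"{name}: ${total:.3f} per query, "
          f"{out / total:.0%} of it spent on the answer itself")
```

Because input and output prices scale together across the two models, the split is identical: roughly 97% of spend goes to reading regardless of which model you run.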
You’re paying for reading, not writing. The AI’s intelligence — its ability to reason and generate code — accounts for roughly 3% of the cost.
Your Claude Pro or Max subscription allocates a token budget per 5-hour session. When 97% of each request is input tokens from file reading, your budget is consumed overwhelmingly by navigation — not by the AI doing useful work.
This is why 6 messages can exhaust a 5-hour session. Each “message” is 25K–100K input tokens of file reading, plus a few hundred tokens of your actual question, plus a few thousand tokens of the AI’s response. The file reading eats the budget.
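The session math follows directly. A sketch assuming a hypothetical 500K-token budget per 5-hour window (actual quotas vary by plan and are not published as a single number); the per-message figures are from the ranges above:

```python
# How many messages fit in a session when file reading dominates?
session_budget = 500_000             # assumed tokens per 5-hour window

file_reading = 75_000                # mid-range of the 25K-100K estimate
question = 300                       # your actual question
response = 2_000                     # the AI's answer

per_message = file_reading + question + response
messages = session_budget // per_message

print(f"{per_message:,} tokens per message -> {messages} messages per session")
```

Under these assumptions the budget runs out after just 6 messages, and file reading accounts for about 97% of every one of them.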
The token waste isn’t just a cost problem; it’s a quality problem. When 85–95% of the context window is consumed by raw file contents, the AI has only 5–15% of its capacity available for actual reasoning.
Research shows that AI models perform worse when surrounded by irrelevant information. More context isn’t just expensive — it actively degrades the quality of the model’s output. The AI looks confident, but it’s increasingly working with diluted attention.
The 165:1 ratio is a direct consequence of brute-force file reading. If the AI could get codebase understanding from structured metadata instead of raw file contents, the input token count would drop by 90%+ while the output quality would improve, because the remaining context is higher-signal information. ByteBell’s Smart Context Refresh delivers exactly this: pre-computed graph metadata that gives your AI the same understanding at 3–5% of the token cost, flipping the economics from “97% wasted on reading” to “95% available for thinking.” Learn more at bytebell.ai.