When Claude Code shows “Context left until auto-compact: 0%”, it’s about to summarize everything and throw away the details. Here’s exactly what gets lost and why it matters.

You’re 45 minutes into debugging a payment webhook. Claude has the stack trace, the relevant files, the narrowed hypothesis. Then you see the warning:
“Context left until auto-compact: 0%”
Seconds later, Claude summarizes everything, forgets which files it modified, and starts re-reading code it already analyzed. Your debugging session just got reset to zero.
If you’ve experienced this, you know the feeling. The agent was on a roll — precise, fast, hitting the right files. After compaction, it becomes vague, refers to things in general terms, and asks to re-read files it spent 20 minutes analyzing.
This article explains exactly what happens during auto-compaction, what information gets destroyed, and why the death spiral that follows makes each subsequent compaction cycle worse than the last.
Claude Code monitors your context window continuously. The context window is the total amount of information the model can hold in active memory at any time — currently 200K tokens for most plans, or up to 1M tokens on the latest models.
Auto-compaction fires when usage hits approximately 83% of the window — roughly 167,000 tokens on a 200K window. This threshold was raised from ~77% in mid-2025, giving users about 12,000 additional tokens per cycle.
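The threshold arithmetic is easy to check. A quick back-of-envelope sketch, using the approximate ratios above (Anthropic does not publish an exact figure):

```python
# Back-of-envelope math for the auto-compact threshold described above.
# Both ratios are approximate, per the article.
WINDOW = 200_000     # tokens in a standard context window
NEW_RATIO = 0.83     # current auto-compact trigger (approximate)
OLD_RATIO = 0.77     # pre-mid-2025 trigger (approximate)

trigger = round(WINDOW * NEW_RATIO)          # 166,000 — in line with "roughly 167,000"
gained = trigger - round(WINDOW * OLD_RATIO)  # headroom added by the mid-2025 change
print(trigger, gained)  # 166000 12000
```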
Everything in the context window contributes to this limit: your messages, Claude’s responses, every file that was read, every grep output, every command result, every tool schema, and the system prompt. It all accumulates. Nothing evaporates between turns.
The typical session fills the context window in this order:
Before you type anything, the system prompt and tool schemas already consume ~20K–40K tokens.
Your first question adds ~200 tokens. Trivial.
Claude’s investigation adds 40K–100K tokens: file reads, grep output, and command results.
Your follow-up question and Claude’s second investigation: Everything from round 1 is still in context. Round 2 stacks on top. By turn 10–15, the window is full.
On a large codebase with many files, the window can fill in 15–20 minutes.
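A toy simulation shows how quickly those rounds stack up. The per-turn cost below is an assumed average chosen to match the article’s rough figures, not a measured value:

```python
# Toy model of context accumulation across turns. All token costs are
# illustrative assumptions based on the rough figures in the article.
THRESHOLD = 166_000   # approximate auto-compact trigger on a 200K window
used = 30_000         # midpoint of the ~20K–40K consumed before you type
per_turn = 12_200     # assumed: ~200-token question + ~12K of reads/searches

turn = 0
while used < THRESHOLD:
    turn += 1
    used += per_turn

print(turn)  # 12 — compaction fires around turn 12 under these assumptions
```

Heavier investigation turns shift the number down; a lighter codebase shifts it up, which is why sessions land in the turn-10-to-15 range.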
When compaction fires, Claude summarizes the entire conversation and replaces the full history with that summary.
The compression is brutal: 167,000 tokens collapse into 2,000–4,000 — a ratio of roughly 40:1 to 80:1.
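The ratio follows directly from the figures quoted above:

```python
# Verifying the compression ratio using the article's figures.
before = 167_000                       # tokens at the compaction threshold
after_low, after_high = 2_000, 4_000   # size range of the generated summary

print(before / after_high)  # 41.75 — the "40:1" end of the range
print(before / after_low)   # 83.5  — the "80:1" end of the range
```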
File paths and line numbers:
Before: Error at src/api/webhooks/stripe.ts:98 — TypeError: Cannot read property 'retryCount' of undefined
After: “Found bug in Stripe webhook retry logic”
The agent no longer knows the file path, the line number, or the exact error.
Exact error messages and test results:
Before: FAIL test/webhooks.test.ts:156 — Expected: { status: 'retry', count: 3 } — Received: TypeError
After: “Test at webhooks test file is failing”
Debugging hypothesis chains:
Before: “Hypothesis 1: retryCount undefined because customer is new → CONFIRMED. Hypothesis 2: metadata field missing → REJECTED. Hypothesis 3: Schema migration missed default → INVESTIGATING. Checked migration at db/migrations/20240115_add_retry.sql”
After: “Investigated retry count bug. Found it’s related to missing defaults.”
Multi-file relationship maps:
Before: A detailed trace through 5 files showing exact request flow with line numbers
After: “Traced auth flow through gateway, auth service, and billing service”
After compaction, the agent needs to continue working. But it no longer has the specific details. So it re-reads the files it already read — consuming tokens, filling the context window again. Within 15–20 minutes, compaction fires a second time.
Compaction 1: Full conversation → summary. General picture preserved, specifics lost.
Compaction 2: Summary + new work → summary of a summary. The agent is now working from a summary that was already missing details, and that summary gets compressed again.
Compaction 3: Summary of summary + new work → triple-compressed summary. By now, early decisions, exact file paths, and specific error messages from the beginning of the session are essentially gone.
Compaction 4+: The agent is generating code based on fragments of fragments. It may contradict its own earlier decisions, re-introduce bugs it already fixed, or attempt edits in locations that no longer match reality.
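The compounding loss across cycles can be sketched with a toy retention model. The 20% retention figure is purely an illustrative assumption, not a measured property of any summarizer:

```python
# Toy model: if each compaction preserves only a fraction of the specific
# details it is given, loss compounds multiplicatively across cycles.
# RETENTION is an illustrative assumption, not a measured value.
RETENTION = 0.20

detail = 1.0
for cycle in range(1, 5):
    detail *= RETENTION
    print(f"after compaction {cycle}: {detail:.4%} of specifics survive")
```

Even with a generous retention assumption, four cycles leave well under 1% of the original specifics, which is why the agent starts contradicting its own earlier work.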
Research points the same way: quality degradation begins at around 70% context utilization — well before compaction fires — and after three or four compaction cycles, critical context is effectively unrecoverable. One industry analysis attributed 65% of enterprise AI failures in 2025 to exactly this pattern of context drift and memory loss.
Run /compact manually, with instructions, before auto-compact fires — for example: “/compact preserve the file paths I modified, current test failures, and the auth refactoring plan”. A targeted summary like this preserves far more than the automatic one.
Commit and clear at logical stopping points. Finish a subtask, commit your changes, run /clear to start a fresh session.
Start new sessions for new tasks. Don’t chain tasks in one conversation.
Manual compaction is a band-aid on a fundamentally broken loop: read files → fill context → compact → lose info → re-read → fill → compact. The problem isn’t that compaction is bad at summarizing — it’s that 60–80% of your context window is filled with raw file contents that shouldn’t be there in the first place. ByteBell’s Smart Context Refresh eliminates the compaction death spiral entirely: it replaces brute-force file reading with pre-computed graph metadata that uses 3–5% of the context window, keeping your agent well below the compaction threshold for the entire session with zero information loss. And because the graph persists between sessions, your AI never re-reads your codebase from scratch again. Learn more at bytebell.ai.