5 habits to save tokens on Claude
Compaction, PDF to markdown, edited prompts, Projects. The habits that make your Claude conversations last longer without blowing up your context window.
June 21, 2026 · 6 min read
Why your tokens matter
Every conversation with Claude runs inside a limited context window. The more it fills up, the more tokens it costs, the slower the answers get, and the faster you hit your usage limit. Most people only notice once Claude starts forgetting the start of the exchange or refuses to keep going.
The good news: you don't need to understand the mechanics in detail. A few habits are enough to make your conversations last twice as long, keep Claude sharp, and stop wasting tokens. Here are the 5 I use every day.
The Claude AI Lab is my Skool community where I share my Claude systems and the more advanced modules. Entry is free.
You don't have to apply all of them at once. Start with the habit that fits how you use Claude (Code, Chat or Cowork); the others will follow naturally.
Compact around 60-65%
In Claude Code, the context window fills up as the exchange goes on. The /compact command summarizes the conversation and frees space without losing the thread. Run it once the window is around 60-65% full, before it overflows, and you extend the session instead of starting over from scratch.
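To make the 60-65% mark concrete, here is a minimal sketch of the arithmetic. The function names, the 200,000-token window and the 0.60 default are illustrative assumptions, not values read from Claude Code; plug in whatever usage figures your own session shows.

```python
def context_usage(used_tokens: int, window_tokens: int) -> float:
    """Return the fraction of the context window already used."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return used_tokens / window_tokens


def should_compact(used_tokens: int, window_tokens: int, threshold: float = 0.60) -> bool:
    """Return True once usage reaches the chosen threshold (0.60 is an assumed default)."""
    return context_usage(used_tokens, window_tokens) >= threshold


# Hypothetical numbers: a 200,000-token window.
WINDOW = 200_000
print(should_compact(90_000, WINDOW))   # 45% used: keep working
print(should_compact(130_000, WINDOW))  # 65% used: time to compact
```

The point of the threshold is simply to compact while there is still room for the summary and the next few turns, rather than waiting for the window to fill.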
Run /compact. Claude summarizes what matters and continues with a lighter context.
To steer the summary, add instructions: /compact keep the code in file X and the decision on the API. Claude prioritizes those in the summary.
Turn your PDFs and images into text
A raw PDF or an image sent to Claude costs a lot of tokens: layout, structure and pixels are all bundled in. The same content as markdown (clean text) uses a fraction of the tokens, and Claude often handles it better. The same goes for images: a precise text description costs far less than the image itself, and you control exactly what Claude "sees".
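As an illustration, here is a rough sketch of one cleanup step you might run on text after extracting it from a PDF with a tool of your choice (the extraction itself is not shown). The function names are hypothetical, the page-number rule is a crude heuristic, and the "about 4 characters per token" estimate is a common rule of thumb, not Claude's actual tokenizer.

```python
import re


def clean_extracted_text(raw: str) -> str:
    """Tidy text extracted from a PDF into plain, markdown-friendly text."""
    # Rejoin words hyphenated across a line break: "sec-\nond" -> "second".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    lines = []
    for line in text.splitlines():
        # Collapse runs of spaces and tabs left over from the page layout.
        line = re.sub(r"[ \t]+", " ", line).strip()
        # Heuristic: drop lines that are only a page number ("12", "Page 12").
        if re.fullmatch(r"(?i)(page )?\d+", line):
            continue
        lines.append(line)
    cleaned = "\n".join(lines)
    # Collapse three or more newlines into a single blank line.
    return re.sub(r"\n{3,}", "\n\n", cleaned).strip()


def rough_token_estimate(text: str) -> int:
    """Very rough size estimate, assuming about 4 characters per token."""
    return max(1, len(text) // 4)


raw = "Quarterly   report\n\n\n\nRevenue grew in the sec-\nond quarter.\nPage 3\n"
cleaned = clean_extracted_text(raw)
print(cleaned)
print(rough_token_estimate(raw), "->", rough_token_estimate(cleaned))
```

The saving on plain text like this is modest; the larger gain described above comes from not sending the PDF's layout and page images at all.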
Send the .md file to Claude instead of the original PDF. Same information, lighter format, far fewer tokens.
Edit the original prompt instead of stacking fixes
When an answer is off, the reflex is to keep going: "no, more like this", "add that", "drop this"… Each fix lengthens the conversation, and the whole history is sent back to the model on every turn. The window balloons for nothing. Editing the original prompt restarts cleanly, without dragging the back-and-forth along.
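A small simulation shows why stacked fixes get expensive. It is a sketch under simplifying assumptions: every turn resends the full history plus the new prompt, editing a prompt replaces the turn instead of appending to it, and all token counts are invented for illustration.

```python
def input_tokens_sent(turns: list[tuple[int, int]]) -> int:
    """Total input tokens sent over a conversation.

    Each turn is (prompt_tokens, reply_tokens). On every turn the model
    receives the whole history so far plus the new prompt.
    """
    history = 0
    total = 0
    for prompt_tokens, reply_tokens in turns:
        total += history + prompt_tokens
        history += prompt_tokens + reply_tokens
    return total


# Stacking: a 300-token prompt, then three 50-token fixes, 800-token replies.
stacked = input_tokens_sent([(300, 800), (50, 800), (50, 800), (50, 800)])

# Editing: the first attempt, then one edited 350-token prompt that replaces it.
edited = input_tokens_sent([(300, 800)]) + input_tokens_sent([(350, 800)])

print(stacked)  # 6300
print(edited)   # 650
```

With these made-up numbers the stacked version sends roughly ten times more input, and the gap widens with every extra fix.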
One thread, one topic
Every message sends the entire conversation back to the model. A catch-all thread, where you chain unrelated topics, makes you pay for the accumulated context on every new message, even for a simple question. One thread per topic keeps each exchange light.
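A back-of-the-envelope calculation makes the cost of a catch-all thread visible. Assume, purely for illustration, that every exchange adds about m tokens to the history, so turn k sends about k·m tokens. The total sent over n turns is then:

```latex
T(n) = \sum_{k=1}^{n} k\,m = m\,\frac{n(n+1)}{2}

% Hypothetical numbers: m = 500 tokens per exchange.
% One catch-all thread of 20 turns:
T(20) = 500 \cdot \frac{20 \cdot 21}{2} = 105\,000

% Two separate threads of 10 turns each:
2\,T(10) = 2 \cdot 500 \cdot \frac{10 \cdot 11}{2} = 55\,000
```

Under these assumptions, splitting the same 20 messages across two threads sends about half the tokens, because the total grows with the square of the thread length.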
Put permanent context in a Project
If you re-paste the same instructions, the same context or the same reference documents into every new conversation, you pay for those tokens again every time. A Project stores that context once, and all your conversations tap into it without you repeating it.
On claude.ai, create a Project for a client, a product or a recurring workflow.
The one habit behind them all
All of these come down to a single idea: only let into the window what's useful, at the right moment. Compact before it overflows, give clean text instead of heavy formats, fix at the source instead of stacking, separate your topics, and store permanent context where it gets reused.
The result: longer, faster conversations and far fewer limits hit. You spend more time moving forward and less time restarting Claude from scratch.
Want to go further?
In the Lab, I share my end-to-end Claude workflows, the ones that save me hours every week.
A dedicated session or program, tailored to your tools and use cases.
And day-to-day, I post one reel a day on Instagram: @quentin_iamarketing