5 habits to save tokens on Claude, Quentin IA Marketing

Why your tokens matter

Every conversation with Claude runs inside a limited context window. The more it fills up, the more tokens it costs, the slower the answers get, and the faster you hit your usage limit. Most people only notice once Claude starts forgetting the start of the exchange or refuses to keep going.

The good news: you don't need to understand the mechanics in detail. A few habits are enough to make your conversations last twice as long, keep Claude sharp, and stop wasting tokens. Here are the 5 I use every day.

Keep this in mind

You don't have to apply all of them at once. Start with the habit that fits how you use Claude (Code, Chat or Cowork); the others will follow naturally.

Compact around 60-65%

⌨️ Claude Code → /compact

💡Why it helps

In Claude Code, the context window fills up as the exchange goes on. The /compact command summarizes the conversation and frees space without losing the thread. You extend the session instead of having to start over from scratch.

⚙️How to do it

Keep an eye on the context usage percentage Claude Code displays.

Around 60-65%, type /compact. Claude summarizes what matters and continues with a lighter context.

You carry on with your task without a break: the important details are kept, the noise is cleaned out.

🎯Tip

→ Don't wait until 90%. A last-minute compaction risks cutting context that's still useful. 60-65% is the sweet spot: early enough to stay clean, late enough not to compact for nothing.

→ You can spell out what to keep: /compact keep the code in file X and the decision on the API. Claude prioritizes those in the summary.
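The rule of thumb above can be written down as a small check. This is only a sketch: Claude Code displays the percentage itself, so the helper names, the threshold constants and the 200,000-token window below are illustrative assumptions, not part of any Claude API.

```python
# Hypothetical helpers illustrating the 60-65% rule of thumb.
# Claude Code shows the usage percentage on its own; nothing here calls Claude.
COMPACT_FROM = 60  # lower bound of the sweet spot (percent)
TOO_LATE = 90      # the "don't wait until" mark (percent)


def context_usage_percent(used_tokens: int, window_tokens: int) -> float:
    """Return how full the context window is, as a percentage."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return 100.0 * used_tokens / window_tokens


def compact_advice(usage_percent: float) -> str:
    """Map a usage percentage to the rule of thumb."""
    if usage_percent >= TOO_LATE:
        return "compact now (late: useful context may get cut)"
    if usage_percent >= COMPACT_FROM:
        return "compact now"
    return "keep going"


# Example with an assumed 200,000-token window (illustrative figure).
usage = context_usage_percent(used_tokens=124_000, window_tokens=200_000)
print(f"{usage:.0f}% -> {compact_advice(usage)}")
```

The two thresholds are kept as named constants so you can shift the sweet spot if your sessions behave differently.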

Turn your PDFs and images into text

📄 Google Docs + ChatGPT / Gemini

💡Why it helps

A raw PDF or an image sent to Claude costs a lot of tokens: layout, structure, pixels, all of it is bundled in. The same content as markdown (clean text) uses a fraction of the tokens and is usually easier for Claude to parse. Likewise, a precise text description costs far less than the image itself, and you control exactly what Claude "sees".

⚙️The PDF as markdown

Upload your PDF to Google Drive and open it with Google Docs, which converts it into an editable text document.

Go to File → Download → Markdown (.md).

Give the .md file to Claude instead of the original PDF. Same information, lighter format, far fewer tokens.
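Before sending the .md file, you can get a rough idea of how much of the window it will take. The sketch below assumes about 4 characters per token, a common heuristic for English text and not Claude's actual tokenizer, and a 200,000-token window; both figures and both function names are illustrative assumptions.

```python
CHARS_PER_TOKEN = 4  # rough heuristic for English text, NOT Claude's tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate: character count / CHARS_PER_TOKEN, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def share_of_window(text: str, window_tokens: int) -> float:
    """Estimated percentage of the context window the text would occupy."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    return 100.0 * estimate_tokens(text) / window_tokens


# Stand-in for the contents of your exported .md file.
markdown = "# Contract summary\n\n- Term: 12 months\n- Notice period: 30 days\n"

tokens = estimate_tokens(markdown)
print(f"~{tokens} tokens, ~{share_of_window(markdown, 200_000):.3f}% of the window")
```

In practice you would read the exported file from disk instead of using an inline string. Treat the result as an order of magnitude, not an exact count.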

🖼️The image as a description

Run your image through an image-to-text model (ChatGPT or Gemini) and ask for a precise description of what matters to you.

Take that text description and give it to Claude. You pay for text, not an image, and you control what it keeps.

🎯When to use it

📚 Long documents

A report, a contract, a 30-page guide: markdown keeps you from saturating the window on the very first message.

📊 Screenshots and charts

A graph or a screenshot: the text description targets the useful info instead of making Claude guess from pixels.

Edit the original prompt instead of stacking fixes

✏️ Claude Chat / Cowork

💡Why it helps

When an answer is off, the reflex is to keep going: "no, more like this", "add that", "drop this"… Each fix lengthens the conversation, and the whole history is sent back to the model on every turn. The window balloons for nothing. Editing the original prompt restarts cleanly, without dragging the back-and-forth along.

⚙️How to do it

Go back to the original prompt (the one that started the exchange you want to fix).

Click to edit it and add the missing information that would have avoided the mistake (the context, the constraint, the expected format).

Send it again. Claude regenerates a clean answer from that point, without the pile of corrective prompts.

🎯Worth remembering

The right habit

One well-edited original prompt beats five corrective prompts. You save tokens and get a better answer.
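To see why stacked fixes get expensive, here is a minimal sketch of the arithmetic, assuming every turn resends the full history plus the new prompt. The token figures are made up for illustration and the function name is hypothetical.

```python
def total_input_tokens(prompt_tokens: list[int], reply_tokens: list[int]) -> int:
    """Sum the input tokens over a conversation in which every turn resends
    the full history (all earlier prompts and replies) plus the new prompt."""
    total = 0
    history = 0
    for prompt, reply in zip(prompt_tokens, reply_tokens):
        total += history + prompt  # what the model reads on this turn
        history += prompt + reply  # both sides join the history
    return total


# Stacking: one 200-token prompt, then five 50-token fixes, 800-token replies.
stacked = total_input_tokens([200, 50, 50, 50, 50, 50], [800] * 6)

# Editing: the first 200-token attempt, then one richer 300-token prompt
# sent from the same starting point, so no history is carried along.
edited = total_input_tokens([200], [800]) + total_input_tokens([300], [800])

print(f"stacked fixes: {stacked} input tokens, edited prompt: {edited}")
```

With these made-up numbers, the fixes themselves are tiny; nearly all of the cost comes from rereading earlier replies on every turn.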

One thread, one topic

🧵 Claude Chat / Cowork

💡Why it helps

Every message sends the entire conversation back to the model. A catch-all thread, where you chain unrelated topics, makes you pay for the accumulated context on every new message, even for a simple question. One thread per topic keeps each exchange light.

⚙️How to do it

New topic unrelated to the current exchange? Open a new conversation instead of continuing in the current thread.

Keep a dedicated thread per project, client or task. Once an exchange has done its job, start a fresh one for what's next.

🎯Bonus

→ Less context to process means faster answers too. A short thread stays sharp from start to finish.
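The same reasoning applies to catch-all threads. The sketch below compares one thread that mixes three unrelated topics with three separate threads, assuming each message resends everything before it. The topic names and the 500-token exchanges are invented for the example, and replies are folded into each figure to keep the arithmetic short.

```python
def thread_input_tokens(message_tokens: list[int]) -> int:
    """Input tokens for one thread where each new message resends
    every earlier message in that thread."""
    total = 0
    history = 0
    for size in message_tokens:
        history += size   # the thread grows by this exchange
        total += history  # and the whole thread is read again
    return total


# Three unrelated topics, four exchanges of 500 tokens each (made-up figures).
topics = {
    "invoice": [500] * 4,
    "blog post": [500] * 4,
    "sql bug": [500] * 4,
}

# One catch-all thread: all twelve exchanges pile up in the same history.
catch_all = thread_input_tokens([t for msgs in topics.values() for t in msgs])

# One thread per topic: each history restarts from zero.
per_topic = sum(thread_input_tokens(msgs) for msgs in topics.values())

print(f"catch-all: {catch_all} input tokens, one thread per topic: {per_topic}")
```

The content is identical in both cases; only the amount of unrelated history dragged along on each message changes.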

Put permanent context in a Project

📁 claude.ai → Projects

💡Why it helps

If you re-paste the same instructions, the same context or the same reference documents into every new conversation, you pay for those tokens again every time. A Project stores that context once, and all your conversations tap into it without you repeating it.

⚙️How to do it

In claude.ai, create a Project for a client, a product or a recurring workflow.

Put your instructions and reference documents in the project knowledge once (style guidelines, business context, key docs).

Every conversation started inside that Project inherits the context automatically. In Cowork, the same logic applies: keep the stable context in the workspace instead of rewriting it every time.

🎯When it changes everything

🧑‍💼 A recurring client

Brief, brand tone, history: set once, reused in every conversation without copying it over.

🔁 A repeated task

Writing posts, support replies, analyses: the frame lives in the Project, so you stop describing the setup on every message.

The one habit behind them all

All of these come down to a single idea: only let into the window what's useful, at the right moment. Compact before it overflows, give clean text instead of heavy formats, fix at the source instead of stacking, separate your topics, and store permanent context where it gets reused.

The payoff

Longer, faster conversations, and you hit usage limits far less often. You spend more time moving forward and less time restarting Claude from scratch.
