Ruflo: a team of AI sub-agents inside Claude Code (and tokens that last longer)
The beginner-to-intermediate guide: install Ruflo in Claude Code tonight, put a team of sub-agents to work in parallel, and route the right model to the right job to save tokens.
June 3, 2026 · 8 min read
Most people use Claude Code as a single assistant: you hand it a task, it does the whole thing alone, start to finish. Ruflo changes that. Ruflo turns Claude Code into a team lead: instead of one assistant, you direct a team of specialized sub-agents that split the work and run in parallel.
If you have seen the multi-agent mode that breaks a big task across several sub-agents, Ruflo plays in the same league. The difference: it is open-source, ready out of the box, and you can put a different model on each agent. A heavy model to reason, a cheap lightweight one for the repetitive grind. That is where you save tokens.
A word of honesty before we start: Ruflo is a large project (built by ruvnet, the author of claude-flow, of which Ruflo is the successor). It does a lot of things. We will not cover all of them. This guide shows you the simplest entry path to install it in Claude Code and get it running tonight, and I explain every technical term in plain language as we go.
The Claude AI Lab is my Skool community where I share my Claude systems and the more advanced modules. Entry is free.
Join the Lab →
npx command used for the install. A recent version is enough.
There are two ways to install Ruflo: a lite version to try it without breaking anything, and a full version that unlocks the real agent team. We cover both, in that order.
Understand Ruflo
Ruflo is a layer you drop on top of Claude Code. In plain terms, it adds three things:
The full version ships ready to use with about 98 agents, more than 60 commands and 30 skills, plus an MCP server. An MCP server is simply the plug that connects extra tools to Claude (memory, agent coordination, and so on).
You do not need to learn Ruflo's hundreds of tools to get going. Once installed, you keep using Claude Code normally: the system routes tasks and coordinates agents on its own, in the background.
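To picture what "routes tasks and coordinates agents" means, here is a toy sketch in Python. It is not Ruflo's code: the model labels, the `NEEDS_REASONING` set and the function names are all invented for illustration. It only shows the idea of picking a model per task and running the agents side by side.

```python
import asyncio

# Hypothetical labels, not Ruflo's real model names or routing rules.
HEAVY = "heavy-model"
LIGHT = "light-model"

# Assumed split: the task kinds that need real reasoning.
NEEDS_REASONING = {"architecture", "review"}


def pick_model(kind: str) -> str:
    """Send reasoning-heavy work to the heavy model, the rest to the light one."""
    return HEAVY if kind in NEEDS_REASONING else LIGHT


async def run_agent(kind: str) -> str:
    """Stand-in for a sub-agent: it only waits briefly, then reports."""
    model = pick_model(kind)
    await asyncio.sleep(0.01)  # placeholder for the actual work
    return f"{kind} -> {model}"


async def run_team(kinds: list[str]) -> list[str]:
    """Start every agent at once and collect the reports in input order."""
    return await asyncio.gather(*(run_agent(k) for k in kinds))


if __name__ == "__main__":
    for line in asyncio.run(run_team(["architecture", "tests", "docs"])):
        print(line)
```

The three stand-in agents wait at the same time rather than one after the other, which is the whole point of a team over a single assistant.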
Install the lite version
This is the way to taste Ruflo without installing anything in your project. This version only adds commands and agent definitions. Zero files written on your side, and you can remove it in one click.
In Claude Code, type /plugin marketplace add ruvnet/ruflo. A marketplace is just the catalog you install plugins from.
Then type /plugin install ruflo-federation@ruflo (or another plugin from the catalog, for example ruflo-cost-tracker@ruflo to track your tokens).
Type / in Claude Code: the new Ruflo commands show up in the list. Run one to see it work.
In the lite version, the swarm tools (create a swarm, spawn an agent, write to memory) are not wired in, because the MCP server is not installed. It is perfect for discovering the commands, but the real multi-agent loop needs the full version, right below.
Install the full version
This is the one that unlocks the whole agent team. It installs with a single command, through a wizard that asks you the right questions.
Run npx ruflo@latest init wizard and follow the questions. The wizard runs the same way on Mac, Windows and Linux.
The wizard creates files in your folder (.claude/, .claude-flow/, a CLAUDE.md file). That is normal.
The full version writes files and a CLAUDE.md into your folder. For the first run, always do it in a test folder, not in your real work project. You will move it over once you are comfortable.
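If you want to see what the wizard left behind in your test folder, a few lines of Python are enough. This is a small sketch, not a Ruflo tool: the three names come from this guide, the helper `ruflo_files` is mine, and your Ruflo version may create more or fewer files than listed here.

```python
from pathlib import Path

# Names taken from this guide; a given Ruflo version may create more or fewer.
EXPECTED = [".claude", ".claude-flow", "CLAUDE.md"]


def ruflo_files(folder: str) -> dict[str, bool]:
    """Map each expected name to whether it exists in the folder."""
    root = Path(folder)
    return {name: (root / name).exists() for name in EXPECTED}


if __name__ == "__main__":
    # Run from inside the test folder, after the wizard has finished.
    for name, present in ruflo_files(".").items():
        print(f"{name}: {'found' if present else 'missing'}")
```

Running it before and after the wizard shows exactly what was added, which is handy the day you move from the test folder to a real project.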
Save your tokens
This is the part that talks to your wallet. Ruflo does not lock you into a single model. It offers six ready to use through OpenRouter (a gateway that gives access to many models with one key):
The savings lever is simple: you do not have to pay for a premium model on every sub-task. You put a light model on the simple work and keep the heavy model only where you truly need to think. On a mission split into ten sub-tasks, that reshapes the bill.
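Here is the arithmetic behind that claim, as a quick sketch. Every number is invented for illustration (prices per million tokens, tokens per sub-task); they are not real OpenRouter or Anthropic rates, so plug in your own.

```python
# All numbers are made up for illustration: prices in dollars per
# million tokens, and 50,000 tokens spent on each sub-task.
HEAVY_PRICE = 15.0
LIGHT_PRICE = 1.0
TOKENS_PER_TASK = 50_000


def mission_cost(heavy_tasks: int, light_tasks: int) -> float:
    """Total cost in dollars for a mission with this many tasks per tier."""
    heavy = heavy_tasks * TOKENS_PER_TASK * HEAVY_PRICE / 1_000_000
    light = light_tasks * TOKENS_PER_TASK * LIGHT_PRICE / 1_000_000
    return heavy + light


if __name__ == "__main__":
    all_heavy = mission_cost(heavy_tasks=10, light_tasks=0)
    mixed = mission_cost(heavy_tasks=2, light_tasks=8)
    print(f"all heavy: ${all_heavy:.2f}")  # all heavy: $7.50
    print(f"mixed:     ${mixed:.2f}")      # mixed:     $1.90
```

With these assumed numbers, keeping the heavy model on two sub-tasks out of ten brings the mission from $7.50 down to $1.90. Your real ratio depends on the models you pick and how much each sub-task consumes.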
Install the ruflo-cost-tracker plugin (through the lite or the full version): it tracks your token usage, lets you set a budget, and alerts you when you approach the limit. You finally see where the money goes.
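To make the budget-and-alert idea concrete, here is a toy version in Python. It is not how ruflo-cost-tracker is implemented: the `Budget` class, the 80% alert threshold and the status strings are my own assumptions, there only to show the logic.

```python
from dataclasses import dataclass


@dataclass
class Budget:
    """Toy token budget; the 80% alert threshold is an assumed default."""
    limit: int
    alert_ratio: float = 0.8
    used: int = 0

    def record(self, tokens: int) -> str:
        """Add the tokens spent by one step and return the resulting status."""
        self.used += tokens
        if self.used >= self.limit:
            return "over budget"
        if self.used >= self.limit * self.alert_ratio:
            return "approaching limit"
        return "ok"


if __name__ == "__main__":
    budget = Budget(limit=100_000)
    for spent in (50_000, 35_000, 20_000):
        print(spent, budget.record(spent))
```

The three steps in the usage example go from "ok" to "approaching limit" to "over budget", which is the kind of signal you want before the bill arrives, not after.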
A concrete example: your first multi-agent mission tonight
The fastest way to feel the value tonight: install the full version in a test folder, then hand Claude Code a mission in several pieces. You will watch it split the work across several agents instead of doing everything one step after another.
Create a folder, ruflo-test, on your desktop.
In that folder, run npx ruflo@latest init wizard and pick a simple profile when the wizard asks.
If you installed ruflo-cost-tracker, open it to see how many tokens each step cost.
The mission to paste (customize the brackets):
In this project, build [WHATEVER YOU WANT, e.g. a small to-do list API].
Break the work into sub-tasks and put several agents to work:
- one agent that writes the code
- one agent that writes the tests
- one agent that reviews and fixes
Put a light, cheap model on the simple tasks
(formatting, tests, documentation) and keep a heavy model
for the architecture and the hard decisions.
At the end, give me a summary: what was done, by which agent,
and how many tokens it cost.
That is it. Instead of one assistant working task after task, you just ran a small team, each on its own piece, with the right model in the right place. The skeleton does not change from one mission to the next: you only swap what you want to build, and you reuse the same logic on your real project once you are comfortable.
When you are ready to move to the real thing, keep the test-folder reflex for every new project, add only the plugins you need (security, observability, cost tracking), and let the hooks coordinate. You scale up without ever losing control of your bill.
Want to go further?
In the Lab, I share my Claude agent and sub-agent setups, and how to run them in parallel without breaking things.
A dedicated session or program, tailored to your tools and use cases.
And day-to-day, I post one reel a day on Instagram: @quentin_iamarketing