
Ruflo: a team of AI sub-agents inside Claude Code (and tokens that last longer)

The beginner-to-intermediate guide: install Ruflo in Claude Code tonight, put a team of sub-agents to work in parallel, and route the right model to the right job to save tokens.

Quentin Megevand
June 3, 2026 · 8 min read

Most people use Claude Code as a single assistant: you hand it a task, and it does the whole thing alone, start to finish. Ruflo changes that by turning Claude Code into a team lead: instead of one assistant, you direct a team of specialized sub-agents that split the work and run in parallel.

If you have seen the multi-agent mode that breaks a big task across several sub-agents, Ruflo plays in the same league. The difference: it is open-source, ready out of the box, and you can put a different model on each agent. A heavy model to reason, a cheap lightweight one for the repetitive grind. That is where you save tokens.

A word of honesty before we start: Ruflo is a large project, built by ruvnet, the author of claude-flow, which Ruflo succeeds. It does a lot of things, and we will not cover all of them. This guide shows you the simplest path to install it in Claude Code and get it running tonight, and I explain every technical term in plain language as we go.

What you need before you start
1
Claude Code installed. This is the version of Claude that acts in your terminal and your projects, not the plain chat. If you already use it, you are set.
2
Node.js installed. It is the engine that runs the npx command used for the install. A recent version is enough.
3
A test folder. Ruflo writes config files into your project. For the first run, start from a sandbox folder, not your real project.
4
(Optional) An OpenRouter key. Only if you want to route models other than Claude (Gemini, Qwen, a local model). Not required to get started.
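
Before going further, you can check the first two prerequisites from a terminal. This is a minimal sketch: the helper name report_tool is mine, and it assumes the Claude Code CLI is available as the claude command on your machine.

```shell
#!/bin/sh
# report_tool NAME: print whether NAME is available on the PATH.
# (Hypothetical helper for illustration; not part of Ruflo or Claude Code.)
report_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

report_tool node    # Node.js runtime
report_tool npx     # ships with Node.js, used for the install
report_tool claude  # assumption: Claude Code is exposed as `claude`
```

If a line says "missing", install that tool first and run the check again.
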
The thing to grasp right away

There are two ways to install Ruflo: a lite version to try it without breaking anything, and a full version that unlocks the real agent team. We cover both, in that order.

1

Understand Ruflo

🔗 github.com/ruvnet/ruflo

Ruflo is a layer you drop on top of Claude Code. In plain terms, it adds three things:

🤖
A team of agents
Specialized sub-agents (one to code, one to test, one to review) that coordinate instead of each working in its own corner. This is called a swarm.
💾
Memory that lasts
What the agents learn and decide stays saved from one session to the next, instead of starting from scratch every time.
🪝
Hooks that dispatch
A hook is an automatic trigger. Here, they route tasks to the right agents in the background, so you do not have to orchestrate it yourself.

The full version ships ready to use with about 98 agents, more than 60 commands and 30 skills, plus an MCP server. An MCP server is simply the plug that connects extra tools to Claude (memory, agent coordination, and so on).

Good to know

You do not need to learn Ruflo's hundreds of tools to get going. Once installed, you keep using Claude Code normally: the system routes tasks and coordinates agents on its own, in the background.

2

Install the lite version

📍 in Claude Code: /plugin

This is the way to taste Ruflo without installing anything in your project. This version only adds commands and agent definitions. Zero files written on your side, and you can remove it in one click.

🔌 Three steps, inside Claude Code
1
Add the marketplace. In Claude Code, type /plugin marketplace add ruvnet/ruflo. A marketplace is just the catalog you install plugins from.
2
Install a plugin. Type /plugin install ruflo-federation@ruflo (or another plugin from the catalog, for example ruflo-cost-tracker@ruflo to track your tokens).
3
Test it. Type / in Claude Code: the new Ruflo commands show up in the list. Run one to see it work.
The limit to know

In the lite version, the swarm tools (create a swarm, spawn an agent, write to memory) are not wired in, because the MCP server is not installed. It is perfect for discovering the commands, but the real multi-agent loop needs the full version, right below.

3

Install the full version

📍 terminal: npx ruflo@latest init wizard

This is the one that unlocks the whole agent team. It installs with a single command, through a wizard that asks you the right questions.

⚙️ The full install, step by step
1
Move into your test folder. Open a terminal in an empty folder or a sandbox project. That is where Ruflo will write its configuration.
2
Run the wizard. Type npx ruflo@latest init wizard and follow the questions. The wizard runs the same way on Mac, Windows and Linux.
3
Let it install. It sets up the full loop: the agents, the commands, the skills, the MCP server and the hooks. It also creates a few files in your folder (.claude/, .claude-flow/, a CLAUDE.md file). That is normal.
4
Open Claude Code in that folder. The MCP server and the hooks are now active. You use Claude Code as usual, and Ruflo coordinates the agents behind the scenes.
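
To confirm what the wizard wrote, here is a minimal sketch that looks for the three entries named above (.claude/, .claude-flow/, CLAUDE.md). The helper name list_ruflo_files is mine, not a Ruflo command, and the wizard may create other files that this check does not look at.

```shell
#!/bin/sh
# list_ruflo_files DIR: report which of the entries named in this guide exist in DIR.
# (Hypothetical helper for illustration; not a Ruflo command.)
list_ruflo_files() {
  for entry in .claude .claude-flow CLAUDE.md; do
    if [ -e "$1/$entry" ]; then
      echo "$entry: present"
    else
      echo "$entry: absent"
    fi
  done
}

list_ruflo_files .   # run this from the folder where you ran the wizard
```
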
Safety rule you should not skip

The full version writes files and a CLAUDE.md into your folder. For the first run, always do it in a test folder, not in your real work project. You will move it over once you are comfortable.
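
One way to make that rule hard to forget is a small guard that only lets you continue when the folder is empty. This is a sketch under my own assumptions: prepare_sandbox is a hypothetical helper and ruflo-test is just an example folder name.

```shell
#!/bin/sh
# prepare_sandbox DIR: create DIR if needed and succeed only if it is empty,
# so the wizard never writes into a folder that already holds real work.
# (Hypothetical helper for illustration; not a Ruflo command.)
prepare_sandbox() {
  dir="$1"
  mkdir -p "$dir" || return 1
  # `ls -A` lists everything except . and .., so hidden files count too.
  if [ -n "$(ls -A "$dir")" ]; then
    echo "stop: $dir is not empty, pick another folder" >&2
    return 1
  fi
  echo "ok: $dir is empty"
}

# Example: launch the wizard only when the folder is a clean sandbox.
# prepare_sandbox "$HOME/ruflo-test" && cd "$HOME/ruflo-test" && npx ruflo@latest init wizard
```

The example line is commented out on purpose: uncomment it once you have chosen your folder.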

4

Save your tokens

📍 the right model in the right place

This is the part that talks to your wallet. Ruflo does not lock you into a single model. It offers six models ready to use through OpenRouter (a gateway that gives access to many models with one key):

🧠
Heavy models
Claude Sonnet 4.6, Gemini 2.5 Pro, or an OpenAI model: for reasoning, architecture, the hard decisions.
Light models
Claude Haiku 4.5, Gemini 2.5 Flash, Qwen 3.6 Max: fast and cheap, for the repetitive work (formatting, tests, docs).
🏠
Local models
Your own models through Ollama or LM Studio, running on your machine, with no per-token cost.

The savings lever is simple: you do not have to pay for a premium model on every sub-task. You put a light model on the simple work and keep the heavy model only where you truly need to think. On a mission split into ten sub-tasks, that reshapes the bill.
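
To make the lever concrete, here is a back-of-the-envelope calculation. Every number in it is a made-up placeholder (the prices, the tokens per sub-task, the 2/8 split between heavy and light): replace them with the real rates of the models you pick.

```shell
#!/bin/sh
# All numbers below are HYPOTHETICAL placeholders, not real prices.
heavy_price=1500   # cents per million tokens (made up)
light_price=100    # cents per million tokens (made up)
ktokens=100        # thousands of tokens used per sub-task (made up)

# cost_cents TASKS PRICE: cost in cents of TASKS sub-tasks at PRICE per million tokens.
cost_cents() {
  echo $(( $1 * $ktokens * $2 / 1000 ))
}

# Ten sub-tasks, all on the heavy model.
all_heavy=$(cost_cents 10 "$heavy_price")

# Same ten sub-tasks: 2 on the heavy model, 8 on the light one.
mixed=$(( $(cost_cents 2 "$heavy_price") + $(cost_cents 8 "$light_price") ))

echo "all heavy: $all_heavy cents"
echo "2 heavy + 8 light: $mixed cents"
```

With these placeholder numbers, the mixed setup comes to 380 cents against 1500 for the all-heavy one. The real gap depends entirely on your models and on how many sub-tasks are truly simple.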

The tool that goes with it

Install the ruflo-cost-tracker plugin (through the lite or the full version): it tracks your token usage, lets you set a budget, and alerts you when you approach the limit. You finally see where the money goes.

A concrete example: your first multi-agent mission tonight

The fastest way to feel the value tonight: install the full version in a test folder, then hand Claude Code a mission in several pieces. You will watch it split the work across several agents instead of doing everything one step after another.

🎯 The mission, set up in 5 steps
1
Create an empty test folder. For example ruflo-test, on your desktop.
2
Install the full version. In that folder, run npx ruflo@latest init wizard and pick a simple profile when the wizard asks.
3
Open Claude Code in that folder. The agents and hooks are ready.
4
Give it the mission below. Customize the brackets with whatever you want to build.
5
Watch the result. Claude splits the work across its agents. If you installed ruflo-cost-tracker, open it to see how many tokens each step cost.

The mission to paste (customize the brackets):

In this project, build [WHATEVER YOU WANT, e.g. a small to-do list API].

Break the work into sub-tasks and put several agents to work:
- one agent that writes the code
- one agent that writes the tests
- one agent that reviews and fixes

Put a light, cheap model on the simple tasks
(formatting, tests, documentation) and keep a heavy model
for the architecture and the hard decisions.

At the end, give me a summary: what was done, by which agent,
and how many tokens it cost.

That is it. Instead of one assistant working task after task, you just ran a small team, each on its own piece, with the right model in the right place. The skeleton does not change from one mission to the next: you only swap what you want to build, and you reuse the same logic on your real project once you are comfortable.
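
If you reuse the skeleton often, you can keep it as a small template and only pass in what you want to build. This is a convenience sketch, not a Ruflo feature: make_mission is a hypothetical helper that prints the mission text so you can paste it into Claude Code.

```shell
#!/bin/sh
# make_mission GOAL: print the mission prompt from this guide with GOAL filled in.
# (Hypothetical helper for illustration; not a Ruflo command.)
make_mission() {
  cat <<EOF
In this project, build $1.

Break the work into sub-tasks and put several agents to work:
- one agent that writes the code
- one agent that writes the tests
- one agent that reviews and fixes

Put a light, cheap model on the simple tasks
(formatting, tests, documentation) and keep a heavy model
for the architecture and the hard decisions.

At the end, give me a summary: what was done, by which agent,
and how many tokens it cost.
EOF
}

make_mission "a small to-do list API"
```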

Going further

When you are ready to move to the real thing, keep the test-folder reflex for every new project, add only the plugins you need (security, observability, cost tracking), and let the hooks coordinate. You scale up without ever losing control of your bill.
