Documentation

Everything you need to install, onboard, and run axon — including running it for free with GitHub Models.

Introduction

axon is a multi-agent CLI coding agent. You run it in a terminal, it works inside a single project directory, and it makes real changes to your real filesystem.

What sets it apart is the delegation model: a single orchestrator at the top decomposes hard tasks into focused sub-agents ("roles"), dispatches them, and synthesizes their outputs. Almost every aspect of that pipeline — depth, fan-out caps, roles, models — is configurable per project via plain files in .axon/.

These docs cover everything: how the agent runs, every config file, every slash command, every tool, and how to onboard for free using GitHub Models.

Install & quick start

axon is published to npm as axon-cli. Install globally:

$ npm install -g axon-cli

Then run it inside any project:

bash
cd ~/your-project
axon

Requirements

  • Node.js 18 or newer
  • npm (or pnpm / yarn — installs the same global binary)
  • A modern terminal (truecolor support recommended)
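
If you prefer pnpm or yarn, the equivalent global installs land the same binary (the yarn command below is for classic 1.x):

bash
pnpm add -g axon-cli       # same global binary via pnpm
yarn global add axon-cli   # yarn classic (1.x) equivalent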

Onboarding wizard

On first run you'll get an onboarding wizard. It asks for five things:

  • Provider — Anthropic or OpenAI-compatible (works with OpenAI, OpenRouter, Groq, Together, GitHub Models, Ollama, LM Studio — anything that speaks the OpenAI API).
  • Model — e.g. claude-opus-4-7, gpt-4o, llama-3.3-70b-versatile.
  • API key — stored at ~/.axon/config.json with mode 0600.
  • Endpoint — OpenAI-compatible only. Base URL of the API.
  • Serper key — optional. Gates the web_search tool.

You only do this once. Re-run with axon --setup (or /setup from inside the TUI) to change provider / model / keys without restarting.

CLI flags

bash
-r, --resume <id>       resume a saved session by id
-l, --last              resume the most recent session in this workspace
-w, --workspace <path>  use a different workspace root (defaults to cwd)
    --setup             force re-running BYOK onboarding
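
For example, to reopen work in another project without cd'ing there first (assuming the flags compose as listed; the path is illustrative):

bash
axon --workspace ~/other-project --last   # most recent session in that workspace
axon --resume <id>                        # or a specific saved session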

Free setup

You don't need to pay for an LLM provider or a search API to run axon. Both have generous free tiers — here's how to wire them up.

No API key? Use GitHub Models (free)

GitHub Models gives free, rate-limited access to frontier models (GPT-4o, Llama 3.3, Mistral, Phi, etc.) using a GitHub Personal Access Token. axon talks to it through the OpenAI-compatible flow — no provider plugin needed.

Step 1 — Create a GitHub Personal Access Token (classic).

  • Open github.com/settings/tokens.
  • Click Generate new token → Generate new token (classic).
  • Give it a name like axon-models and an expiry that suits you.
  • Leave all scopes unchecked — Models access does not require any repo scope.
  • Click Generate token and copy the ghp_… value. You will not see it again.

Step 2 — Run axon and pick OpenAI-compatible.

$ axon --setup

Answer the wizard like this:

text
? Provider   ›  OpenAI-compatible
? Endpoint   ›  https://models.github.ai/inference
? Model      ›  openai/gpt-4o
? API key    ›  ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxx

Useful model names (any model listed in the GitHub Models catalog will work):

text
openai/gpt-4o
openai/gpt-4o-mini
meta/Meta-Llama-3.1-70B-Instruct
meta/Meta-Llama-3.1-405B-Instruct
mistral-ai/Mistral-large-2407
microsoft/Phi-3.5-MoE-instruct
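
To sanity-check a token outside axon, a plain curl works, assuming the standard OpenAI-compatible /chat/completions route (model name illustrative):

bash
curl -s https://models.github.ai/inference/chat/completions \
  -H "Authorization: Bearer $GITHUB_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o-mini", "messages": [{"role": "user", "content": "ping"}]}'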

If the newer endpoint rejects your token, your account may still be on the older endpoint. Re-run axon --setup and try this URL instead:

text
Endpoint  ›  https://models.inference.ai.azure.com

Heads-up on rate limits. GitHub Models is rate-limited per minute and per day, with smaller limits on larger models. axon will surface 429s as a transient error — wait a minute and continue. For long sessions, a paid provider is more comfortable; for evaluation and light use, the free tier is plenty.

Free Serper API key for web search

The web_search tool calls Google through serper.dev. Serper gives 2,500 free queries on sign-up — no credit card.

  • Go to serper.dev and sign up (Google or email).
  • Open the API Key tab in the dashboard and copy the key.
  • Add it to axon — either during onboarding, or later by running /setup inside the TUI.

Without a Serper key, the web_search tool is disabled cleanly — every other tool still works. web_fetch (fetch a known URL) does not require Serper.
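
To confirm a key before wiring it in, Serper's search endpoint takes a POST with an X-API-KEY header (query illustrative):

bash
curl -s https://google.serper.dev/search \
  -H "X-API-KEY: $SERPER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"q": "ripgrep glob syntax"}'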

Orchestration model

axon is built around one idea: a hard task is easier to solve when one agent plans it and many specialist agents execute it. The default shape is one orchestrator (a heavy model) calling a flat layer of leaf specialists.

text
                user prompt
                     │
                     ▼
            ┌────────────────────┐
            │    Orchestrator    │   plans + synthesizes
            └────────┬───────────┘
                     │ call_agent({ role, prompt })
       ┌─────────────┼─────────────┐
       ▼             ▼             ▼
  researcher    implementer    reviewer       ← leaves (depth = 1)
  (read-only)   (writes code)  (critique)

The default flow

  • Evaluate — simple or complex?
  • Solve simple tasks directly. A one-file edit is the orchestrator's own job — spawning a sub-agent costs more than it saves.
  • Plan complex tasks first. List independent slices and what each produces.
  • Fan out. One call_agent per slice.
  • Synthesize. Combine the compact reports into the answer the user asked for.
  • Iterate. If wave 1 surfaces gaps, run a second wave informed by what wave 1 found.
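
As a sketch, a task like "add rate limiting to the API" might fan out as follows (roles are the built-ins described under Roles; prompts illustrative):

text
call_agent({ role: "researcher",  prompt: "Map the HTTP entrypoints; report file:line." })    ← wave 1
call_agent({ role: "researcher",  prompt: "Survey existing middleware conventions." })        ← wave 1
call_agent({ role: "implementer", prompt: "Add a token-bucket limiter per those findings." }) ← wave 2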

Three layers of budget

  • maxDepth (default 1) — hard cap on delegation depth.
  • softCap (default 10) — advisory budget surfaced in the orchestrator's system prompt.
  • hardCap (default 25) — silent runaway-loop rail enforced in code.

Plus two optional finer-grained caps, off by default: maxCallsPerAgent (per-parent fan-out) and maxCallsAtDepth (per-depth fan-out).

Architecture config

Project-scoped agent shape lives in .axon/architecture.json. Every field is optional; missing or invalid file → defaults.

.axon/architecture.json·json
{
  "v": 1,
  "maxDepth": 1,
  "softCap": 10,
  "hardCap": 25,
  "maxCallsPerAgent": null,
  "maxCallsAtDepth": null,
  "subAgentModel": null,
  "requireRole": false
}

Commands

text
/architecture          print active config + file path
/architecture init     write a starter file with every knob present
/architecture reload   re-read the file after editing
/architecture path     print the file path

Alias: /arch works everywhere /architecture does.

Cost-optimized example

Keep the orchestrator on a strong model, run leaves on something cheap and fast:

.axon/architecture.json·json
{
  "v": 1,
  "subAgentModel": "claude-haiku-4-5-20251001"
}

Strict-delegation example

.axon/architecture.json·json
{
  "v": 1,
  "maxDepth": 2,
  "softCap": 8,
  "hardCap": 30,
  "maxCallsPerAgent": 4,
  "maxCallsAtDepth": { "1": 8, "2": 6 },
  "requireRole": true
}

Roles

A role is a pre-baked system prompt plus a one-line description, identified by a lowercase name. The orchestrator calls a sub-agent by role:

ts
call_agent({ role: "researcher", prompt: "..." })

The five built-in roles

  • researcher — explore the codebase, gather context, report findings. Read-only.
  • implementer — write code for a specific scope. Writes.
  • reviewer — critique code (correctness, conventions, cross-file consistency). Read-only.
  • tester — write/run tests for a scope. Writes.
  • debugger — reproduce a failure, trace, identify root cause. Small probes only.

Override or add a role

Drop a markdown file into .axon/roles/<name>.md:

.axon/roles/security-auditor.md·md
---
description: pentest the changed code
---
You are a SECURITY-AUDITOR specialist. You look for OWASP top-10
issues in the code the orchestrator hands you.

Rules:
- Read-only. Do NOT modify files.
- Focus on the scope the orchestrator gave you.
- Every issue must cite file:line and an OWASP category.

Return format (≤25 lines, no narration):
- Issues: `file:line · category · description`
- Severity summary: blocker / high / med / low counts.

Role commands

text
/roles                  list all registered roles
/roles show <name>      print one role's full prompt
/roles init             scaffold the five built-in role prompts
/roles reload           re-read the directory after editing
/roles path             print the .axon/roles/ directory path

Slash commands

Every TUI affordance is one slash away.

text
/help                       in-TUI command list
/clear                      reset context in this session
/sessions                   list saved sessions
/resume <id>                resume a saved session
/forget <id>                delete a saved session
/plan [on|off]              toggle plan mode
/permissions ...            list / allow / deny / remove / reset
/commands                   list custom slash commands
/reload                     re-read .axon/commands/
/config                     print BYOK config summary
/setup                      re-run BYOK onboarding inline
/architecture ...           show / init / reload architecture
/roles ...                  list / show / init / reload roles
/export <md|json> [path]    export the session transcript
/exit                       quit

Custom slash commands

Drop a markdown file in .axon/commands/ and it becomes a slash command:

.axon/commands/review.md·md
---
description: Review the working tree
args: focus area (optional)
---
You're reviewing the current working tree. $ARGS

Focus on: $1
Workspace: $WORKSPACE

  • Filename → command name (review.md → /review).
  • Substitutions: $ARGS, $1..$9, $WORKSPACE.
  • Built-in commands take precedence — you can't shadow them.
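
For instance, typing /review auth flow with the file above produces a prompt shaped roughly like this (workspace path illustrative):

text
You're reviewing the current working tree. auth flow

Focus on: auth
Workspace: /home/you/your-project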

Permissions

Project-scoped per-tool permissions live in .axon/permissions.json. The TUI's "Always allow" choice writes here.

.axon/permissions.json·json
{
  "v": 1,
  "allowed": ["search", "read_file"],
  "denied": ["run_command"]
}

Commands

text
/permissions                    print allowed + denied
/permissions allow <tool>       add to allowed
/permissions deny <tool>        add to denied
/permissions remove <tool>      remove from both lists
/permissions reset              clear everything

Alias: /perms.

Project memory (AXON.md)

Drop AXON.md (or axon.md) at the workspace root and its contents are appended to every system prompt as standing instructions, for every sub-agent, every turn:

AXON.md·md
This project uses pnpm, not npm.
Tests live in __tests__/ next to the source.
Don't touch src/legacy/ — it's deprecated and frozen.
Run `pnpm tc` for typecheck.

The agent treats this as user-authored context — it's wrapped in a <project_memory> block.
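
With the example file above, every system prompt would carry something shaped roughly like this (the exact framing is an implementation detail):

text
<project_memory>
This project uses pnpm, not npm.
Tests live in __tests__/ next to the source.
...
</project_memory>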

Sessions & export

Every turn is snapshotted to .axon/sessions/<id>.json. Resume, list, or delete:

text
/sessions             list saved sessions
/resume <id>          resume a saved session
/forget <id>          delete a saved session

Or from the shell:

bash
axon --resume <id>    # resume a specific session at launch
axon --last           # resume the most recent session

Export

text
/export md                  write transcript to .axon/exports/<id>.md
/export json                write structured snapshot
/export md ./somewhere.md   custom path

Tools

All tools are workspace-rooted. Every mutating tool is gated by the approval prompt unless allow-listed via /permissions.

Filesystem

  • read_file — read a file; marks the path as observed this session.
  • write_file — write a file. Refuses to overwrite without a prior read this session. Creates parent directories.
  • list_files — list a directory.
  • delete_file — delete a file (sparingly).
  • search — ripgrep-backed regex search with glob filters.

Git (read-only by default)

  • git_status, git_diff, git_blame, git_log — read-only.
  • git_commit, git_branch, git_checkout — gated.

Web

  • web_search — Google via serper.dev (requires Serper key).
  • web_fetch — fetch a URL; HTML stripped to text.

Project-aware

  • run_checks — auto-detect build/test/lint/typecheck command with parsed file:line:col diagnostics.
  • run_command — run any shell command from the workspace root. Last-resort tool.

Meta

  • call_agent — delegate a subtask to a sub-agent.
  • summarize_conversation — produce a compact recap.
  • exit_plan_mode — plan mode only; surface a plan for approval.

Approval flow

  • allow once — this call only.
  • always allow — bypass approval for this tool name for the rest of the session.
  • always allow + project — also persist to .axon/permissions.json.
  • deny + feedback — reject with a reason the agent sees.

For write_file, the menu shows a diff of the proposed change before you decide.

Plan mode

A read-only run mode where the agent can only inspect, not modify. It must finish by calling exit_plan_mode with a markdown plan for your approval. Once you approve, plan mode flips off and execution proceeds under the normal approval flow.

text
/plan          toggle
/plan on       explicit on
/plan off      explicit off

Useful for hard tasks where you want to see the plan before any edits land. Mutating tools auto-deny while plan mode is active.

MCP support

Standard Model Context Protocol servers can be registered per project (.axon/mcp.json) or personally (~/.axon/mcp.json). Workspace wins. The MCP tool surface is merged into the agent's tools automatically.

.axon/mcp.json·json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": { "GITHUB_TOKEN": "ghp_..." }
    }
  }
}

No bundled defaults — populate the file or it just stays empty.
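
If a registered server misbehaves, try launching it by hand with the same command and args; a healthy MCP stdio server starts and waits silently on stdin:

bash
npx -y @modelcontextprotocol/server-filesystem /path/to/dir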

Troubleshooting

Onboarding doesn't accept my key

Re-run axon --setup and double-check the provider matches the key. GitHub Models requires OpenAI-compatible as the provider; do not pick Anthropic for a ghp_… token.

GitHub Models returns 401 / 403

Regenerate the classic token. Ensure it has not been auto-revoked by a push-protection rule (don't paste it into a repo). The newer endpoint is https://models.github.ai/inference; the legacy one is https://models.inference.ai.azure.com — try the other if one rejects.

web_search returns "tool not configured"

Add a Serper API key with /setup. Free tier at serper.dev gives 2,500 queries.

Hit the hard cap mid-task

The agent was over-decomposing. Either raise hardCap in .axon/architecture.json, or simplify the prompt. The default of 25 is intentionally conservative.
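
A minimal override only needs the knob you're raising (the value here is arbitrary):

.axon/architecture.json·json
{
  "v": 1,
  "hardCap": 50
}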

Still stuck?

Open an issue at github.com/Anmol202005/axon/issues with a redacted copy of your ~/.axon/config.json (provider, model, endpoint — never the key) and the error you saw.