Quick Start
Get AXIS running in your project. Install the CLI, initialize a config, then iterate: run, review, and baseline.
Prerequisites
- Node.js 18 or later.
- An API key for at least one supported agent (for example,
ANTHROPIC_API_KEY for Claude Code).
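The two prerequisites above can be checked from a shell before installing. The following is a minimal sketch, not part of AXIS: check_prereqs is a hypothetical helper name, and it assumes you plan to use Claude Code, so it looks for ANTHROPIC_API_KEY specifically. Substitute the variable for whichever agent you use.

```shell
# check_prereqs is a hypothetical helper, not an AXIS command.
# It verifies Node.js 18+ and, assuming Claude Code is the agent, ANTHROPIC_API_KEY.
check_prereqs() {
  if ! command -v node >/dev/null 2>&1; then
    echo "Node.js not found on PATH" >&2
    return 1
  fi

  # node --version prints something like v18.19.0; keep only the major number.
  version=$(node --version)
  major=${version#v}
  major=${major%%.*}
  if [ "$major" -lt 18 ]; then
    echo "Node.js 18 or later is required (found $version)" >&2
    return 1
  fi

  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set" >&2
    return 1
  fi

  echo "Prerequisites look fine"
}
```

Call check_prereqs after defining it. A non-zero return means one of the checks failed, and the message on stderr says which.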
1. Install the CLI
Install AXIS globally so the axis binary is available on your PATH:
npm install -g @netlify/axis
Or, to skip the global install, run any command through npx by replacing axis with
npx @netlify/axis (for example, npx @netlify/axis init).
2. Initialize Your Project
From your project root, run:
axis init
In an interactive terminal, this prompts you for the scenarios directory and which agents to
include (comma-separated, e.g. claude-code,codex,gemini). It then creates two files:
- axis.config.json: a minimal config with your chosen scenarios path and agents.
- scenarios/hello-world.json: a sample scenario that asks the agent to create a file with specific content.
To skip the prompts, pass flags directly:
axis init --agent claude-code,codex --scenarios ./scenarios
See the Configuration Reference for additional options like scoring weights, MCP servers, and custom agents, and Writing Scenarios for guidance on writing effective prompts and rubrics.
3. Run It
axis run
AXIS spawns the agent in an isolated workspace, captures the full interaction transcript, scores the result against your rubric, and displays a summary in your terminal.
What to Expect
The terminal displays a live progress view while the agent runs. You will see each scenario/agent combination with its current status (running, scoring, done, or failed) and a live token counter showing how many tokens the agent has consumed.
Once scoring completes, AXIS prints a summary table showing:
- The composite AXIS Result (0 to 100) for each scenario/agent pair.
- Breakdowns for each of the four scoring dimensions: Goal Achievement, Environment, Service, and Agent.
- Score insights for any dimension that scored below 75, identifying the weakest signal.
4. View the Report
Every run saves a report to .axis/reports/. You can revisit it at any time.
# View the latest report summary
axis reports latest
# Open the HTML report in your browser
axis reports latest --html
# Get JSON output for scripting
axis reports latest --json
The HTML report includes the full scoring breakdown, interaction transcript, and judge evaluations. See Reports & Baselines for details on report contents and storage.
5. Interpret Your Results
The AXIS Result is a composite of four dimensions, each measuring a different aspect of the agent's interaction with your system.
| Dimension | What It Tells You |
|---|---|
| Goal Achievement | Did the agent complete the task? Scored against your rubric checks. |
| Environment | How well did shell commands, file operations, and dev tools work? |
| Service | How effectively were APIs, MCP tools, and external services used? |
| Agent | How well did the agent reason and plan? Were its actions necessary and well-scoped? |
A score of 50 represents median performance. Scores above 75 are good, and scores above 90 are excellent. See Scoring Framework for the full explanation of how each dimension is calculated and why.
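The thresholds above can be written down as a small lookup. This is an illustrative sketch only: score_label is a hypothetical helper, not part of the AXIS CLI, it assumes integer scores, and the labels for scores of 75 and below are wording chosen here rather than terms AXIS uses.

```shell
# score_label is a hypothetical helper that maps a 0-100 score to a label.
# "excellent" and "good" follow the thresholds in the text; the other two
# labels are assumptions made for this sketch.
score_label() {
  score=$1
  if [ "$score" -gt 90 ]; then
    echo "excellent"
  elif [ "$score" -gt 75 ]; then
    echo "good"
  elif [ "$score" -ge 50 ]; then
    echo "at or above median"
  else
    echo "below median"
  fi
}
```

For example, score_label 82 prints good, while score_label 75 prints at or above median because the text treats only scores above 75 as good.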
6. Set a Baseline
Once you have a run you are satisfied with, save it as a baseline. Future runs can diff against it to detect regressions.
# Save the latest report as a baseline
axis baseline set
# Compare future runs automatically
axis run --compare-baseline
The comparison exits with code 1 if any regressions are detected, making it suitable for CI gating. See Reports & Baselines for baseline workflows.
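In a CI job, the exit code can be used directly to pass or fail the step. A minimal sketch follows, assuming the axis binary is on PATH and a baseline has already been committed. check_regressions is a hypothetical wrapper name, not something AXIS provides.

```shell
# check_regressions is a hypothetical wrapper around the documented command.
# It relies only on the exit status: zero means no regressions were reported,
# non-zero (1 on regressions, per the text above) fails the step.
check_regressions() {
  if axis run --compare-baseline; then
    echo "No regressions detected"
    return 0
  else
    echo "Regressions detected, or the run failed" >&2
    return 1
  fi
}
```

Most CI systems fail a step on any non-zero exit, so calling axis run --compare-baseline on its own line is usually enough. The wrapper is only useful if you want a custom message or extra cleanup.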
Add .axis/reports/ and .axis/skills-cache/ to your
.gitignore. Baselines (.axis/baselines/) are designed to be
committed so the whole team shares the same regression thresholds.
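One way to add those entries from the project root is sketched below. It appends each path only if that exact line is not already present, and it leaves .axis/baselines/ untouched so baselines stay tracked.

```shell
# Append the generated directories to .gitignore, skipping any entry
# that already appears as an exact line. Creates .gitignore if missing.
for entry in '.axis/reports/' '.axis/skills-cache/'; do
  grep -qxF "$entry" .gitignore 2>/dev/null || echo "$entry" >> .gitignore
done
```

Running the loop a second time makes no further changes, so it is safe to keep in a setup script.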
Next Steps
- Scoring Framework: how the four dimensions are calculated, what signals drive each score, and why the scoring works the way it does.
- Writing Scenarios: how to write effective prompts, design rubrics, and use setup/teardown actions.
- Execution & Agents: how AXIS runs scenarios, supported and custom agents, and workspace isolation.