Why Elyra

A coding agent built around not wasting tokens.

Elyra is a terminal-native, multi-provider coding agent. It runs locally, supports 30+ LLM providers, and is engineered from the ground up for cost efficiency, privacy, and extensibility.

~/my-app — elyra
turn 1  · plan the refactor
 read src/auth/session.ts
 grep "createSession" src/
 3 files, 142 lines analyzed

turn 2  · execute
 edit src/auth/session.ts
 edit src/api/login.ts
 2 files modified

turn 3  · verify
 bash: pnpm test auth
 24 passed, 0 failed

─────────────────────────────
cost  $0.23   cache  71%

30+ LLM providers supported
$0.10–0.50 cost per 30-minute session
60–80% cache hit rate
0 telemetry, cloud, accounts

01 · The loop

What Elyra actually does

Elyra is an agentic loop, not a single LLM call. On every turn it decides what to do next based on what it just saw.

  1. It reads your request and clarifies the goal.
  2. It explores the relevant code: reading files, running grep, tracing dependencies.
  3. It plans and executes: editing files, creating new ones, running commands.
  4. It checks its own work: runs tests, reads output, fixes errors.
  5. It repeats until done, or asks you a question.

In practice, a single request becomes a chain of tool calls: read a file, edit another, run a test, read the error, fix the code, run again.
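A minimal sketch of such a loop, for illustration only. The `Model`, `Tool`, and `runAgentLoop` names are hypothetical and not taken from Elyra's code. The model here is a scripted stand-in so the sketch runs without an API key, and it is synchronous for brevity where a real agent would await network and tool calls.

```typescript
// Hypothetical types for illustration; not Elyra's actual API.
type ToolCall = { kind: "tool"; name: string; input: string };
type Final = { kind: "final"; text: string };
type Step = ToolCall | Final;
type Tool = (input: string) => string;
// A model maps the transcript so far to the next step.
type Model = (transcript: string[]) => Step;

function runAgentLoop(
  model: Model,
  tools: Record<string, Tool>,
  request: string,
  maxTurns = 10,
): { answer: string; transcript: string[] } {
  const transcript: string[] = [`user: ${request}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = model(transcript);
    if (step.kind === "final") {
      return { answer: step.text, transcript };
    }
    const tool = tools[step.name];
    // Feed every result, including errors, back in so the next step can react to it.
    const result = tool ? tool(step.input) : `error: unknown tool ${step.name}`;
    transcript.push(`tool ${step.name}(${step.input}) -> ${result}`);
  }
  return { answer: "stopped: turn limit reached", transcript };
}

// Scripted stand-in for an LLM: read a file, run the tests, then finish.
const scriptedModel: Model = (transcript) => {
  const toolTurns = transcript.length - 1;
  if (toolTurns === 0) return { kind: "tool", name: "read", input: "src/auth/session.ts" };
  if (toolTurns === 1) return { kind: "tool", name: "bash", input: "pnpm test auth" };
  return { kind: "final", text: "done" };
};

const demoTools: Record<string, Tool> = {
  read: (path) => `contents of ${path}`,
  bash: (cmd) => `ran ${cmd}`,
};

const loopResult = runAgentLoop(scriptedModel, demoTools, "refactor session handling");
```

The turn limit is a design choice of this sketch: it keeps a confused model from looping forever.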

02 · Compatibility

Works with what you already use

Built-in tools are generic. Extensions add the depth.

30+ LLM providers

Anthropic, OpenAI, Google, xAI, Mistral, Groq, DeepSeek, and more. Bring your own API key (or use multiple). Switch models mid-session with /model. Token usage and cost tracked automatically.

Anthropic, OpenAI, Google, xAI, Mistral, Groq, DeepSeek, Cohere, OpenRouter, +22 more

Any language or framework

Built-in tools are generic: read files, write files, edit files, search with grep, run shell commands. The extension system adds deep knowledge for specific stacks.

03 · Cost efficiency

Engineered to minimize token waste

Elyra is designed to minimize token waste at every level — from how context is compacted to how models are routed.

Smart compaction
200 → 30 messages compacted

Long sessions don't burn tokens proportionally. When context grows large, Elyra summarizes old conversation history — keeping key decisions and file changes while freeing tokens.
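One way to picture compaction, as a sketch under assumptions: the oldest messages are replaced by a single summary message while the most recent ones are kept verbatim. The `Message` shape, the `compact` function, and the `summarize` callback are hypothetical. In a real agent the summary would presumably come from an LLM call; here it is a plain string so the example runs on its own.

```typescript
// Hypothetical message shape for illustration.
type Message = { role: "user" | "assistant" | "tool" | "summary"; text: string };

// Replace everything except the most recent `keepRecent` messages
// with one summary message produced by `summarize`.
function compact(
  messages: Message[],
  keepRecent: number,
  summarize: (old: Message[]) => string,
): Message[] {
  // Compacting a single old message would not shrink the history.
  if (messages.length <= keepRecent + 1) return messages;
  const cut = messages.length - keepRecent;
  const old = messages.slice(0, cut);
  const recent = messages.slice(cut);
  return [{ role: "summary", text: summarize(old) }, ...recent];
}

// 200 messages in, 30 out: one summary plus the 29 most recent messages.
const history: Message[] = Array.from({ length: 200 }, (_, i): Message => ({
  role: i % 2 === 0 ? "user" : "assistant",
  text: `message ${i}`,
}));
const compacted = compact(history, 29, (old) => `summary of ${old.length} earlier messages`);
```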

Free token estimation
per threshold check

Compaction checks use a cached chars/4 heuristic (memoized via WeakMap) instead of calling a tokenizer or the LLM. The system knows when compaction is needed without spending tokens to find out.
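The heuristic described above can be sketched as follows. The chars/4 estimate and the WeakMap memoization come from the description; the `ChatMessage` shape, the function names, and the 0.8 threshold are assumptions made for the example.

```typescript
// Hypothetical message shape for illustration.
type ChatMessage = { role: string; text: string };

// Keyed by object identity, so entries vanish when a message is garbage collected.
const estimateCache = new WeakMap<ChatMessage, number>();

// Rough token estimate: about four characters per token, computed once per message.
function estimateTokens(message: ChatMessage): number {
  const cached = estimateCache.get(message);
  if (cached !== undefined) return cached;
  const estimate = Math.ceil(message.text.length / 4);
  estimateCache.set(message, estimate);
  return estimate;
}

// Threshold check that never calls a tokenizer or the LLM.
// The 0.8 default is an assumed value, not a documented Elyra setting.
function needsCompaction(
  messages: ChatMessage[],
  contextWindow: number,
  threshold = 0.8,
): boolean {
  let total = 0;
  for (const message of messages) total += estimateTokens(message);
  return total >= contextWindow * threshold;
}
```

Because the cache is keyed by the message object, this sketch assumes messages are not mutated after they are first measured; a changed `text` would keep its old estimate.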

Context pinning
input ≪ output, cheaper than re-reads

Without pinning, the agent re-reads important files every few turns, and each re-read is a separate tool call that consumes output tokens. A pinned file is injected as input context instead, which is often cached and dramatically cheaper.

Smart model routing
$2 → $0.40 on a routed session

The router sends simple tasks (file reads, small edits) to cheap models and reserves expensive reasoning models for complex tasks (architecture, multi-file refactors, debugging).
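A rough sketch of that routing rule. It assumes the task has already been classified; how Elyra classifies tasks is not described here. The `TaskKind` labels mirror the examples in the paragraph above, and the model names are placeholders rather than real model identifiers.

```typescript
// Task categories taken from the examples above; the type itself is hypothetical.
type TaskKind =
  | "file_read"
  | "small_edit"
  | "architecture"
  | "multi_file_refactor"
  | "debugging";
type Tier = "cheap" | "reasoning";

const SIMPLE_TASKS: ReadonlySet<TaskKind> = new Set<TaskKind>(["file_read", "small_edit"]);

// Simple tasks go to the cheap tier; everything else goes to the reasoning tier.
function routeModel(kind: TaskKind, models: Record<Tier, string>): string {
  return SIMPLE_TASKS.has(kind) ? models.cheap : models.reasoning;
}

// Placeholder model names for illustration.
const tiers: Record<Tier, string> = { cheap: "cheap-model", reasoning: "reasoning-model" };
const forRead = routeModel("file_read", tiers);
const forDebug = routeModel("debugging", tiers);
```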

Concurrent tools
parallel-safe tool execution

When the agent calls multiple tools in one turn, parallel-safe ones run concurrently. Faster wall-clock time means less overhead on rate-limited or time-charged providers.
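One possible scheduling policy, sketched under assumptions: consecutive parallel-safe calls run together, and any other call acts as a barrier so that, for example, an edit never overlaps the reads around it. The `parallelSafe` flag and the `executeTools` function are hypothetical; Elyra's actual scheduler may differ.

```typescript
// Hypothetical tool-call shape for illustration.
type PendingCall = {
  name: string;
  parallelSafe: boolean; // assumed flag: true for side-effect-free tools such as reads
  run: () => Promise<string>;
};

// Returns results in the same order as `calls`.
async function executeTools(calls: PendingCall[]): Promise<string[]> {
  const results: string[] = [];
  let batch: PendingCall[] = [];

  // Run the collected parallel-safe calls concurrently, then clear the batch.
  const flush = async (): Promise<void> => {
    if (batch.length === 0) return;
    const outputs = await Promise.all(batch.map((call) => call.run()));
    results.push(...outputs);
    batch = [];
  };

  for (const call of calls) {
    if (call.parallelSafe) {
      batch.push(call);
    } else {
      // Finish everything before it, then run the unsafe call on its own.
      await flush();
      results.push(await call.run());
    }
  }
  await flush();
  return results;
}
```

`Promise.all` preserves input order, so the results line up with the calls even when the concurrent ones finish out of order.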

Injected, not generated
cacheable input tokens

Stack extensions inject framework knowledge into the system prompt instead of having the agent web-search docs at runtime. Knowledge is already there, as cacheable input tokens.

Automatic cache hits
60–80% cache hit rate

When providers support prompt caching (Anthropic, OpenAI), Elyra structures requests to maximize hits. Repeated context is cached server-side and billed at a fraction of the fresh-token price. /cost shows the breakdown.

The result

A typical 30-minute coding session with Elyra costs $0.10–0.50 depending on the model and task complexity. Sessions that would burn through dollars of context on other agents stay under a dollar — because the architecture is built around not wasting tokens.

<$1 per session, typically

04 · Extensions

Specialized expertise, on demand

Extensions give the agent specialized expertise. Install with elyra install npm:@elyracode/<name> and the agent gains new tools, new context, and new capabilities.

Framework stacks

These teach the agent how your stack works. @elyracode/stack-tall knows Livewire 4, Flux UI, Alpine.js, and Tailwind. @elyracode/stack-vilt knows Vue 3 with Inertia.js. @elyracode/stack-rilt knows React 19 with Inertia and shadcn/ui. Not just documentation dumps — they include component references, coding patterns, and architectural guidance that shape how the agent writes code for your project.

Development tools

Capabilities beyond code editing

@elyracode/db-tools

Query MySQL, ClickHouse, or SQLite directly. Schema discovery, read-only queries, migration awareness.

@elyracode/design-tools

Live browser preview of Tailwind components, screenshot capture for visual QA, design consistency analysis.

@elyracode/doctor

Project health audits: outdated deps, security vulnerabilities, code debt, auto-healing.

@elyracode/test-gen

Generates tests following your project's patterns. Pest for Laravel, Vitest for TypeScript.

@elyracode/perf-tools

Finds N+1 queries, slow queries, missing database indexes.

@elyracode/docker

Docker-aware development: container exec routing, log tailing, compose operations, .env sync between host and containers.

@elyracode/git-intel

Session briefings from recent git activity, commit message generation, PR descriptions.

@elyracode/laravel-starters

Fetches official Laravel starter kits from GitHub as reference context.

@elyracode/http-tools

API testing, documentation fetching, OpenAPI spec parsing.

Multi-agent orchestration

Split complex tasks across focused agents

@elyracode/subagents

Delegate to specialized agents: scout (codebase recon), reviewer (code review), planner (implementation planning), worker (implementation), oracle (second opinions).

@elyracode/swarm

Automated pipelines. /swarm build runs plan-code-test-review-fix. /swarm review runs multi-pass security, correctness, and test reviews. /swarm refactor runs analyze-plan-implement-verify.

05 · Daily use

A workspace, not just a chatbox

Session management

Persistent sessions. Your conversation history, tool results, and context survive across restarts.

  • Resume previous sessions: /resume
  • Fork from any point: /fork
  • Branch & navigate history: /tree
  • Export as HTML or JSONL: /export
  • Compact long sessions: /compact

Context management

Long sessions accumulate context. Elyra handles this automatically.

  • Smart compaction: summarizes old history, preserves key decisions.
  • Context pinning (/pin): re-read from disk on every call, always fresh.
  • Project memory: architecture, patterns, conventions across sessions.
  • Stack detection: identifies your framework, suggests extensions.

The terminal interface

A full interactive environment, not just a prompt.

  • Markdown with syntax highlighting
  • Inline image display
  • Autocomplete for commands & files
  • Six themes incl. norwegian-midnight
  • Shortcuts for models & thinking level
  • Diff display with add/remove highlights

06 · Blueprints

Session templates for your team

Drop a markdown file in .elyra/blueprints/ and apply it with /blueprint <name>. The file content becomes the first message, and optional YAML frontmatter can pin files automatically.

This standardizes how the agent approaches common tasks across a team.

.elyra/blueprints/api-endpoint.md
---
pin: src/routes/api.ts, src/middleware/auth.ts
---
Build a new API endpoint following the patterns
in the pinned files.

Use the existing auth middleware. Add request
validation and tests.
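To make the blueprint format concrete, here is a rough sketch of how such a file could be split into pinned files and a first message. It is not Elyra's parser: `parseBlueprint` and the `Blueprint` type are hypothetical, and the sketch only understands a single comma-separated `pin:` line rather than full YAML.

```typescript
// Hypothetical result shape for illustration.
type Blueprint = { pins: string[]; firstMessage: string };

function parseBlueprint(source: string): Blueprint {
  // Frontmatter is an optional block between two `---` lines at the top of the file.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { pins: [], firstMessage: source.trim() };

  const frontmatter = match[1] ?? "";
  const firstMessage = source.slice(match[0].length).trim();

  let pins: string[] = [];
  for (const line of frontmatter.split(/\r?\n/)) {
    const pinLine = /^pin:\s*(.*)$/.exec(line);
    if (pinLine) {
      // Split the comma-separated list and drop empty entries.
      pins = (pinLine[1] ?? "")
        .split(",")
        .map((path) => path.trim())
        .filter((path) => path.length > 0);
    }
  }
  return { pins, firstMessage };
}

const exampleBlueprint = [
  "---",
  "pin: src/routes/api.ts, src/middleware/auth.ts",
  "---",
  "Build a new API endpoint following the patterns",
  "in the pinned files.",
].join("\n");

const parsedBlueprint = parseBlueprint(exampleBlueprint);
```

A file without frontmatter passes through unchanged as the first message, with no pins.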

07 · Cost tracking

Know exactly what you're spending

Every session tracks token usage and cost. /cost shows input tokens, output tokens, cache utilization, estimated dollar cost, and context window usage. No external service needed — the data comes directly from provider responses.

/cost
Session usage  ·  claude-sonnet-4

Input tokens        142,318
  ↳ cached          102,440  (72%)
  ↳ fresh            39,878
Output tokens        8,214

Context window      38% used
Estimated cost      $0.23
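The arithmetic behind a panel like this can be sketched as below. The `Usage` and `Prices` types and the `costBreakdown` function are hypothetical, and the per-million-token prices are placeholders, so the dollar result is not meant to reproduce the estimate shown above. The fresh-token count and cache percentage, however, follow directly from the token counts in the panel.

```typescript
// Hypothetical shapes for illustration.
type Usage = { inputTokens: number; cachedInputTokens: number; outputTokens: number };
// Dollars per million tokens. Placeholder values; real prices vary by provider and model.
type Prices = { freshInput: number; cachedInput: number; output: number };

function costBreakdown(
  usage: Usage,
  prices: Prices,
): { fresh: number; cachePercent: number; dollars: number } {
  const fresh = usage.inputTokens - usage.cachedInputTokens;
  // Guard against division by zero for an empty session.
  const cachePercent =
    usage.inputTokens === 0
      ? 0
      : Math.round((usage.cachedInputTokens / usage.inputTokens) * 100);
  const dollars =
    (fresh * prices.freshInput +
      usage.cachedInputTokens * prices.cachedInput +
      usage.outputTokens * prices.output) /
    1_000_000;
  return { fresh, cachePercent, dollars };
}

// Token counts from the panel above; prices are made up for the example.
const report = costBreakdown(
  { inputTokens: 142_318, cachedInputTokens: 102_440, outputTokens: 8_214 },
  { freshInput: 2, cachedInput: 0.2, output: 10 },
);
// report.fresh is 39878 and report.cachePercent is 72, matching the panel.
```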

08 · Principles

What it doesn't do

No cloud

Elyra doesn't run in the cloud. It runs locally, in your terminal, on your machine. Your code leaves your computer only as part of the requests sent to the LLM provider you choose.

No account

It doesn't require an account. Install it, set an API key, and start.

No surprises

It doesn't make decisions for you on things that matter. It asks before deleting files, before architectural changes, before anything irreversible. You're in control; it's the tool.

Ready to try it?

Install with one npm command. No account, no cloud, no telemetry.

$ npm i -g @elyracode/coding-agent