A coding agent built around not wasting tokens.
Elyra is a terminal-native, multi-provider coding agent. It runs locally, supports 30+ LLM providers, and is engineered from the ground up for cost efficiency, privacy, and extensibility.
turn 1 · plan the refactor
  ➜ read src/auth/session.ts
  ➜ grep "createSession" src/
  ✓ 3 files, 142 lines analyzed
turn 2 · execute
  ➜ edit src/auth/session.ts
  ➜ edit src/api/login.ts
  ✓ 2 files modified
turn 3 · verify
  ➜ bash: pnpm test auth
  ✓ 24 passed, 0 failed
─────────────────────────────
cost $0.23   cache 71%
01 · The loop
What Elyra actually does
Elyra is an agentic loop, not a single LLM call. On every turn it decides what to do next based on what it just saw.
- 01 It reads your request and clarifies the goal.
- 02 It explores the relevant code — reading files, running grep, tracing dependencies.
- 03 It plans and executes — editing files, creating new ones, running commands.
- 04 It checks its own work — runs tests, reads output, fixes errors.
- 05 It repeats until done, or asks you a question.
In practice, the loop is a chain of tool calls: read a file, edit another, run a test, read the error, fix the code, run again.
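The loop described above can be sketched in a few lines. This is a minimal illustration, not Elyra's actual implementation: the `Step`, `Decide`, and `RunTool` types, the `runLoop` function, and the turn limit are all hypothetical names chosen for the example, and a scripted function stands in for the model.

```typescript
// Hypothetical types for illustration; none of these names come from Elyra.
type LoopToolCall = { tool: string; args: string };
type Step =
  | { kind: "tool"; call: LoopToolCall }
  | { kind: "ask"; question: string }
  | { kind: "done"; summary: string };

type Decide = (history: string[]) => Step;
type RunTool = (call: LoopToolCall) => string;

// Each turn: decide the next step from everything seen so far, then act on it.
function runLoop(request: string, decide: Decide, runTool: RunTool, maxTurns = 20): string[] {
  const history: string[] = [`user: ${request}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = decide(history);
    if (step.kind === "done") {
      history.push(`done: ${step.summary}`);
      return history;
    }
    if (step.kind === "ask") {
      // Hand control back to the user instead of guessing.
      history.push(`ask: ${step.question}`);
      return history;
    }
    const output = runTool(step.call);
    history.push(`${step.call.tool}(${step.call.args}) -> ${output}`);
  }
  history.push("stopped: turn limit reached");
  return history;
}

// Scripted stand-in for the model: read a file, run the tests, finish.
const script: Step[] = [
  { kind: "tool", call: { tool: "read", args: "src/auth/session.ts" } },
  { kind: "tool", call: { tool: "bash", args: "pnpm test auth" } },
  { kind: "done", summary: "tests pass" },
];
const scriptedDecide: Decide = (history) => script[history.length - 1];
const fakeTool: RunTool = (call) => (call.tool === "bash" ? "24 passed" : "ok");

const transcript = runLoop("refactor session handling", scriptedDecide, fakeTool);
```

The point of the sketch is that the tool output is appended to the history before the next decision, so every step is informed by the result of the previous one.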
02 · Compatibility
Works with what you already use
Built-in tools are generic. Extensions add the depth.
Anthropic, OpenAI, Google, xAI, Mistral, Groq, DeepSeek, and more. Bring your own API key (or use multiple). Switch models mid-session with /model. Token usage and cost tracked automatically.
Built-in tools are generic: read files, write files, edit files, search with grep, run shell commands. The extension system adds deep knowledge for specific stacks.
03 · Cost efficiency
Engineered to minimize token waste
Elyra is designed to minimize token waste at every level — from how context is compacted to how models are routed.
Long sessions don't burn tokens proportionally. When context grows large, Elyra summarizes old conversation history — keeping key decisions and file changes while freeing tokens.
Compaction checks use a cached chars/4 heuristic (memoized via WeakMap) instead of calling a tokenizer or the LLM. The system knows when compaction is needed without spending tokens to find out.
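A chars/4 estimate memoized in a WeakMap might look roughly like the sketch below. The `Message` shape, the function names, the rounding, and the 80% threshold are assumptions made for illustration; only the chars/4 heuristic and the WeakMap cache come from the description above.

```typescript
// Hypothetical message shape; Elyra's real types are not shown in the source.
type Message = { role: string; content: string };

// Keyed by object identity, so entries disappear when a message is garbage-collected.
// This assumes messages are treated as immutable once created.
const estimateCache = new WeakMap<Message, number>();

function estimateTokens(message: Message): number {
  const cached = estimateCache.get(message);
  if (cached !== undefined) return cached;
  // Roughly four characters per token; rounding up is an assumption.
  const estimate = Math.ceil(message.content.length / 4);
  estimateCache.set(message, estimate);
  return estimate;
}

// The 0.8 threshold is a placeholder, not a documented Elyra setting.
function needsCompaction(history: Message[], contextWindow: number, threshold = 0.8): boolean {
  let total = 0;
  for (const message of history) total += estimateTokens(message);
  return total > contextWindow * threshold;
}
```

Because the check is pure arithmetic over cached numbers, it can run on every turn without a tokenizer call or an LLM request.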
Without pinning, the agent re-reads important files every few turns — each call consuming output tokens. A pinned file is injected as input context (often cached), dramatically cheaper.
The router sends simple tasks (file reads, small edits) to cheap models and reserves expensive reasoning models for complex tasks (architecture, multi-file refactors, debugging).
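As a rough sketch of that routing rule, assuming tasks arrive already labeled with a kind: the `TaskKind` labels mirror the examples in the paragraph above, while the tier names and the model identifiers are placeholders rather than real model names or Elyra configuration.

```typescript
// Task labels taken from the examples above; how Elyra classifies tasks is not specified.
type TaskKind = "file_read" | "small_edit" | "architecture" | "multi_file_refactor" | "debugging";
type Tier = "cheap" | "reasoning";

const SIMPLE_TASKS: ReadonlySet<TaskKind> = new Set<TaskKind>(["file_read", "small_edit"]);

function pickTier(kind: TaskKind): Tier {
  return SIMPLE_TASKS.has(kind) ? "cheap" : "reasoning";
}

// Placeholder identifiers; substitute whatever models your provider offers.
const modelForTier: Record<Tier, string> = {
  cheap: "cheap-model",
  reasoning: "reasoning-model",
};

function routeModel(kind: TaskKind): string {
  return modelForTier[pickTier(kind)];
}
```

Defaulting unknown or complex work to the reasoning tier is a deliberate choice in this sketch: a wrong answer from a cheap model usually costs more tokens to repair than the routing saved.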
When the agent calls multiple tools in one turn, parallel-safe ones run concurrently. Faster wall-clock time means less overhead on rate-limited or time-charged providers.
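One way to run parallel-safe tool calls concurrently while keeping the rest in order is sketched below. The `PendingCall` type, the `parallelSafe` flag, and the choice to run the concurrent batch before the sequential calls are assumptions for illustration, not a description of Elyra's scheduler.

```typescript
// Hypothetical shape of a tool call queued within one turn.
type PendingCall = { name: string; parallelSafe: boolean; run: () => Promise<string> };

// Split call indexes into those that may overlap and those that must run one at a time.
function partitionCalls(calls: PendingCall[]): { parallel: number[]; sequential: number[] } {
  const parallel: number[] = [];
  const sequential: number[] = [];
  calls.forEach((call, i) => (call.parallelSafe ? parallel : sequential).push(i));
  return { parallel, sequential };
}

async function runTurn(calls: PendingCall[]): Promise<string[]> {
  const results = new Array<string>(calls.length);
  const { parallel, sequential } = partitionCalls(calls);

  // Start every parallel-safe call at once and wait for all of them.
  await Promise.all(
    parallel.map(async (i) => {
      results[i] = await calls[i].run();
    }),
  );

  // Calls with side effects (for example shell commands) run one after another.
  for (const i of sequential) {
    results[i] = await calls[i].run();
  }

  // Results keep the original call order regardless of completion order.
  return results;
}
```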
Stack extensions inject framework knowledge into the system prompt instead of having the agent web-search docs at runtime. Knowledge is already there, as cacheable input tokens.
When providers support prompt caching (Anthropic, OpenAI), Elyra structures requests to maximize hits. Repeated context is cached server-side — a fraction of fresh token cost. /cost shows the breakdown.
A typical 30-minute coding session with Elyra costs $0.10–0.50 depending on the model and task complexity. Sessions that would burn through dollars of context on other agents stay under a dollar — because the architecture is built around not wasting tokens.
04 · Extensions
Specialized expertise, on demand
Extensions give the agent specialized expertise. Install with elyra install npm:@elyracode/<name> and the agent gains new tools, new context, and new capabilities.
Stack extensions teach the agent how your stack works. @elyracode/stack-tall knows Livewire 4, Flux UI, Alpine.js, and Tailwind. @elyracode/stack-vilt knows Vue 3 with Inertia.js. @elyracode/stack-rilt knows React 19 with Inertia and shadcn/ui. These are not just documentation dumps: they include component references, coding patterns, and architectural guidance that shape how the agent writes code for your project.
Capabilities beyond code editing
@elyracode/db-tools
Query MySQL, ClickHouse, or SQLite directly. Schema discovery, read-only queries, migration awareness.
@elyracode/design-tools
Live browser preview of Tailwind components, screenshot capture for visual QA, design consistency analysis.
@elyracode/doctor
Project health audits: outdated deps, security vulnerabilities, code debt, auto-healing.
@elyracode/test-gen
Generates tests following your project's patterns. Pest for Laravel, Vitest for TypeScript.
@elyracode/perf-tools
Finds N+1 queries, slow queries, missing database indexes.
@elyracode/docker
Docker-aware development: container exec routing, log tailing, compose operations, .env sync between host and containers.
@elyracode/git-intel
Session briefings from recent git activity, commit message generation, PR descriptions.
@elyracode/laravel-starters
Fetches official Laravel starter kits from GitHub as reference context.
@elyracode/http-tools
API testing, documentation fetching, OpenAPI spec parsing.
Split complex tasks across focused agents
@elyracode/subagents
Delegate to specialized agents: scout (codebase recon), reviewer (code review), planner (implementation planning), worker (implementation), oracle (second opinions).
@elyracode/swarm
Automated pipelines:
- /swarm build runs plan-code-test-review-fix.
- /swarm review runs multi-pass security, correctness, and test reviews.
- /swarm refactor runs analyze-plan-implement-verify.
05 · Daily use
A workspace, not just a chatbox
Session management
Persistent sessions. Your conversation history, tool results, and context survive across restarts.
- Resume previous sessions: /resume
- Fork from any point: /fork
- Branch & navigate history: /tree
- Export as HTML or JSONL: /export
- Compact long sessions: /compact
Context management
Long sessions accumulate context. Elyra handles this automatically.
- Smart compaction: Summarizes old history, preserves key decisions.
- Context pinning (/pin): Re-read from disk on every call, always fresh.
- Project memory: Architecture, patterns, conventions across sessions.
- Stack detection: Identifies your framework, suggests extensions.
The terminal interface
A full interactive environment, not just a prompt.
- Markdown with syntax highlighting
- Inline image display
- Autocomplete for commands & files
- Six themes incl. norwegian-midnight
- Shortcuts for models & thinking level
- Diff display with add/remove highlights
06 · Blueprints
Session templates for your team
Drop a markdown file in .elyra/blueprints/ and apply it with /blueprint <name>. The file content becomes the first message, and optional YAML frontmatter can pin files automatically.
This standardizes how the agent approaches common tasks across a team.
---
pin: src/routes/api.ts, src/middleware/auth.ts
---
Build a new API endpoint following the patterns
in the pinned files.
Use the existing auth middleware. Add request
validation and tests.
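To make the blueprint format concrete, here is a minimal sketch of how such a file could be split into pinned paths and a first message. It is a hand-rolled reader for the single comma-separated `pin:` key shown above, not a full YAML parser, and the `Blueprint` type and `parseBlueprint` function are hypothetical names rather than Elyra's API.

```typescript
// Hypothetical result shape for illustration.
type Blueprint = { pins: string[]; firstMessage: string };

function parseBlueprint(source: string): Blueprint {
  // Frontmatter is an optional block between two '---' lines at the top of the file.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?/.exec(source);
  if (!match) return { pins: [], firstMessage: source.trim() };

  const pins: string[] = [];
  for (const line of match[1].split(/\r?\n/)) {
    const pinLine = /^pin:\s*(.*)$/.exec(line);
    if (pinLine) {
      // Paths are comma-separated, as in the example above.
      pins.push(...pinLine[1].split(",").map((p) => p.trim()).filter((p) => p.length > 0));
    }
  }

  // Everything after the frontmatter becomes the first message.
  return { pins, firstMessage: source.slice(match[0].length).trim() };
}
```

A file without frontmatter is treated as a plain first message with no pinned files.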
07 · Cost tracking
Know exactly what you're spending
Every session tracks token usage and cost. /cost shows input tokens, output tokens, cache utilization, estimated dollar cost, and context window usage. No external service needed — the data comes directly from provider responses.
Session usage · claude-sonnet-4
Input tokens      142,318
  ↳ cached        102,440 (72%)
  ↳ fresh          39,878
Output tokens       8,214
Context window    38% used
Estimated cost    $0.23
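The arithmetic behind a breakdown like this is simple enough to sketch. The token counts below are the ones from the sample output; the `Usage` and `PricePerMillion` types and all function names are hypothetical, and no real provider prices are assumed, so the sketch does not attempt to reproduce the dollar figure.

```typescript
// Hypothetical shapes; real provider responses use their own field names.
type Usage = { inputTokens: number; cachedInputTokens: number; outputTokens: number };
type PricePerMillion = { freshInput: number; cachedInput: number; output: number };

function freshInputTokens(usage: Usage): number {
  return usage.inputTokens - usage.cachedInputTokens;
}

// Share of input tokens served from the provider's prompt cache, as a whole percent.
function cacheHitPercent(usage: Usage): number {
  if (usage.inputTokens === 0) return 0;
  return Math.round((usage.cachedInputTokens / usage.inputTokens) * 100);
}

// Cached input is billed at its own (typically lower) rate, so it is priced separately.
function estimateCost(usage: Usage, price: PricePerMillion): number {
  const total =
    freshInputTokens(usage) * price.freshInput +
    usage.cachedInputTokens * price.cachedInput +
    usage.outputTokens * price.output;
  return total / 1000000;
}

// Token counts from the sample session above.
const sessionUsage: Usage = { inputTokens: 142318, cachedInputTokens: 102440, outputTokens: 8214 };
const hitRate = cacheHitPercent(sessionUsage); // 72
const fresh = freshInputTokens(sessionUsage); // 39878
```

Pass the per-million-token rates from your provider's pricing page to `estimateCost` to get a dollar estimate for a session.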
08 · Principles
What it doesn't do
Elyra doesn't run in the cloud. It runs locally, in your terminal, on your machine. Your code never leaves your computer (except to the LLM provider you choose).
It doesn't require an account. Install it, set an API key, and start.
It doesn't make decisions for you on things that matter. It asks before deleting files, before architectural changes, before anything irreversible. You're in control; it's the tool.
Ready to try it?
Install with one npm command. No account, no cloud, no telemetry.