Why these three, together

We've written before that Elyra is, at its heart, a loop: observe, reason, act, repeat. The loop is powerful precisely because it keeps going — and that's also why people hesitate to let it. Three worries come up again and again:

"What will it cost if I walk away?"
"Would a different model have done this better?"
"Why do I keep explaining the same workflow every week?"

0.9.8 answers each one with a command.

`/goal --budget` — autonomy with a spending cap

/goal has always been Elyra's purest promise: set a finish line, walk away, come back to green. But unbounded loops and premium models make a nervous combination, especially now that flagship models run $10/$50 per million input/output tokens.

So the goal loop got a wallet:

/goal npm test --budget 2.50

The agent works exactly as before — run the tests, read the failures, fix, repeat — but now it tracks spend from the moment the goal was set. If the goal is met first, wonderful. If the cap is reached first, the loop stops, reports, and keeps everything done so far:

Goal budget reached: spent $2.51 of $2.50 without meeting the goal.
Goal was: npm test. The work so far is kept. Re-run /goal to continue
with a fresh budget.

One honest detail: the check runs between turns, so a long final turn can nudge slightly past the cap — it's a guardrail, not a hardware fuse. But the psychology changes completely. "Set it and walk away" becomes "set it, walk away, and know your worst case." That's the difference between trying autonomy once and using it daily.
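To make the "between turns" detail concrete, here is a minimal sketch of a budget-guarded loop. It is not Elyra's actual implementation: `turn_cost`, `run_goal`, and `GoalResult` are hypothetical names, the prices assume the $10/$50 figure means input/output tokens, and the token counts are made up for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Assumed prices: $10 per million input tokens, $50 per million output tokens.
INPUT_USD_PER_MTOK = 10.0
OUTPUT_USD_PER_MTOK = 50.0


def turn_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost of one turn in USD at the assumed per-million-token prices."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000


@dataclass
class GoalResult:
    met: bool      # did the goal succeed before the cap was hit?
    spent: float   # total USD spent since the goal was set
    turns: int     # number of turns that actually ran


def run_goal(turns: Iterable[Tuple[int, int, bool]], budget: float) -> GoalResult:
    """Each turn is (input_tokens, output_tokens, goal_met_after_turn)."""
    spent = 0.0
    completed = 0
    for input_tokens, output_tokens, goal_met in turns:
        # The cap is checked BETWEEN turns, never in the middle of one.
        if spent >= budget:
            return GoalResult(False, spent, completed)
        spent += turn_cost(input_tokens, output_tokens)
        completed += 1
        if goal_met:
            return GoalResult(True, spent, completed)
    return GoalResult(False, spent, completed)


# Made-up token counts: the second turn pushes spend just past the cap,
# so the third turn (which would have met the goal) never starts.
result = run_goal(
    [(100_000, 10_000, False), (51_000, 10_000, False), (100_000, 10_000, True)],
    budget=2.50,
)
print(f"met={result.met} spent=${result.spent:.2f} turns={result.turns}")
# prints: met=False spent=$2.51 turns=2
```

The overshoot is bounded by the cost of a single turn, which is why the cap reads as a worst case plus one turn rather than a hard limit.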

`/replay` — your own benchmark, on your own work

Every week there's a new model and a new leaderboard. None of them answer the only question that matters: would it do better on my code, my tasks, my weird legacy module?

Elyra has quietly had the infrastructure to answer that for months — every turn is checkpointed, and sessions branch like a git tree. 0.9.8 connects the dots:

> Refactor the discount calculation to handle stacked coupons
  ... (Claude works, you're not quite convinced)

/replay haiku

Elyra rewinds to the checkpoint before that turn, opens a new branch, switches to the model you named, and re-runs your exact prompt. The original attempt isn't lost — both branches live in the tree:

Replaying last turn with Claude Haiku 4.5 (was Claude Fable 5).
The original branch is kept — compare with /tree.
...
Replay with Claude Haiku 4.5 complete. Replay cost: $0.0214.
Use /tree to compare branches.

Same task, two models, side by side, with the cost difference in plain sight. Maybe the cheap model handles it fine and that class of task just got 20x cheaper. Maybe it doesn't, and now you know rather than guess. Either way, that's a benchmark no leaderboard can give you, because it ran on the only workload that matters: yours.
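If the "branches like a git tree" picture is hard to hold in your head, a rough sketch may help. This is an illustration under assumptions, not Elyra's real data model: `Node`, `add_turn`, and `replay` are hypothetical names, the model labels are placeholders, and the original turn's cost is invented (only the $0.0214 replay cost comes from the output above).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    """One checkpointed turn in a session tree (hypothetical structure)."""
    prompt: str
    model: str
    cost_usd: float
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list, repr=False)


def add_turn(parent: Optional[Node], prompt: str, model: str, cost_usd: float) -> Node:
    """Append a new turn under `parent` and return it."""
    node = Node(prompt, model, cost_usd, parent)
    if parent is not None:
        parent.children.append(node)
    return node


def replay(last_turn: Node, model: str, cost_usd: float) -> Node:
    """Rewind to the checkpoint before `last_turn` and re-run the same prompt
    with another model as a sibling branch. The original turn is untouched.
    `cost_usd` stands in for whatever the real model call would cost."""
    return add_turn(last_turn.parent, last_turn.prompt, model, cost_usd)


root = add_turn(None, "(session start)", "flagship-model", 0.0)
original = add_turn(
    root,
    "Refactor the discount calculation to handle stacked coupons",
    "flagship-model",
    0.428,  # invented cost, for illustration only
)
branch = replay(original, "cheaper-model", 0.0214)

# Both attempts hang off the same checkpoint, so they can be compared directly.
print(len(root.children))                              # prints: 2
print(f"{original.cost_usd / branch.cost_usd:.0f}x")   # prints: 20x
```

The point of the sibling-branch shape is that nothing is overwritten: the comparison is between two children of the same checkpoint, with the same prompt and a different model.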

`/learn` — teach it once

Elyra's tagline has always been "self-extensible." Skills — little SKILL.md files that teach the agent procedures — are how that works. But until now, writing them was homework.

Now the session itself is the raw material. Spent twenty minutes walking the agent through your release process? Your migration recipe? The way your team writes integration tests? Before you close the terminal:

/learn

Elyra re-reads the conversation, distills the procedure (not the noise — it deliberately prefers your corrected, final approach over the trial and error), and writes a draft:

Skill learned: release-flow
Saved to ~/.elyra/agent/skills/release-flow/SKILL.md
Review/edit the file, then /reload to activate it.
It will load in all future sessions.

You review it — it's a draft, not gospel — tweak a line if needed, /reload, and the agent knows that workflow forever. You can steer it too: /learn the deploy process we just did. Every week of use makes your agent a little more yours. That compounding is the whole point.

And two quiet ones

probe_models (in the doctor extension): trust, but verify. It makes tiny live calls to your configured providers and checks that reality matches what the model registry claims — availability, reasoning, thinking mode. We built it after catching exactly that kind of drift ourselves; now the agent can catch it for you.
elyra update tells the truth. If you're already current, it now says Elyra is up to date (0.9.8) instead of cheerfully reinstalling and claiming victory. Small thing. Felt good to fix.
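For the curious, the kind of check probe_models performs can be pictured as a diff between what the registry claims and what a live call observes. The sketch below is only that, a picture: `Capabilities` and `find_drift` are hypothetical names, the three fields are modeled as booleans for simplicity, the model names are placeholders, and the "probed" values are hard-coded where the real tool would make tiny live calls.

```python
from dataclasses import dataclass
from typing import Dict, List

FIELDS = ("available", "reasoning", "thinking_mode")


@dataclass(frozen=True)
class Capabilities:
    available: bool
    reasoning: bool
    thinking_mode: bool


def find_drift(
    registry: Dict[str, Capabilities],
    probed: Dict[str, Capabilities],
) -> Dict[str, List[str]]:
    """Return {model: [fields where the live probe disagrees with the registry]}."""
    drift: Dict[str, List[str]] = {}
    for model, claimed in registry.items():
        observed = probed.get(model)
        if observed is None:
            # No live result at all: report it rather than assume it matches.
            drift[model] = ["not probed"]
            continue
        mismatches = [
            name for name in FIELDS
            if getattr(claimed, name) != getattr(observed, name)
        ]
        if mismatches:
            drift[model] = mismatches
    return drift


# Placeholder data: the registry claims model-b supports reasoning,
# but the (simulated) live probe says otherwise.
registry = {
    "model-a": Capabilities(available=True, reasoning=True, thinking_mode=True),
    "model-b": Capabilities(available=True, reasoning=True, thinking_mode=False),
}
probed = {
    "model-a": Capabilities(available=True, reasoning=True, thinking_mode=True),
    "model-b": Capabilities(available=True, reasoning=False, thinking_mode=False),
}
print(find_drift(registry, probed))  # prints: {'model-b': ['reasoning']}
```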

How to upgrade

npm install -g @elyracode/coding-agent

Or from inside a session:

/update

(And yes — it will now tell you when there's nothing to do.)

The short version

/goal <cmd> --budget <usd> — the loop stops at your spending cap, keeping the work.
/replay <model> — re-run your last turn with another model on a kept branch; compare with /tree.
/learn — turn this session's workflow into a permanent skill.
probe_models — live-verify providers against the registry.
elyra update — honest about being up to date.

A wallet, a rival, and a memory. None of them make the agent smarter — they make it more trustworthy, more comparable, and more yours. Which, day to day, matters more.

Happy building.