Hello,
Runtime data is becoming a competitive advantage. For developers it shortens feedback
loops. For AI coding agents, it can turn a long token-burning debugging session into a
focused fix.
In this newsletter: recent numbers on agent token usage, how Wallaby's
AI tools help, the Wallaby for Python release, Quokka v3, and Console Ninja's role in
runtime debugging.
AI agents need runtime data
Agents are getting better at editing code. The expensive part is what happens next:
inspect, run, misunderstand, search, rerun, paste another stack trace, repeat. Recent
research and reports show where that cost comes from:
- The same agentic coding task can vary by up to 30x in token usage across runs.
  More tokens did not reliably mean better accuracy.
  Read the study.
- In one multi-agent software workflow study, the Code Review stage used 59.4%
  of tokens. Initial coding used 8.6%. Input tokens made up 53.9% overall.
  Read the paper.
- In one controlled A/B report, giving the agent live project context used
  42% fewer tokens, ran 27% faster, and made 64% fewer tool calls.
  Read the report.
- Research into agent-written tests found that many act as runtime probes:
  value-revealing print statements outnumbered assertions in the analyzed test artifacts.
  Read the paper summary.
The pattern is consistent: the cost is in the loop, not the first edit. When agents
lack runtime data, they spend tokens rediscovering state. When they can inspect the
right signal, they can move faster and verify the fix sooner.
Wallaby gives agents those signals directly: test status, failing test details,
runtime values, coverage, execution paths, and logs.
What users report: Wallaby's AI integration often reduces token usage by
around 50% and resolves issues around 2-3x faster than workflows without
Wallaby runtime context.
Don't take our word for it. Try the same task with and without Wallaby runtime context
and compare turns, tool calls, tokens, and whether the fix is actually verified.
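If you want to put numbers on that comparison, here is a minimal sketch in Python. It is not part of Wallaby: the names RunMetrics, percent_change, and compare_runs are hypothetical, and the figures in the usage lines are placeholders, not measurements. Record the numbers from your own two sessions and plug them in.

```python
from dataclasses import dataclass


@dataclass
class RunMetrics:
    """Numbers you record by hand from one agent session (hypothetical structure)."""
    turns: int
    tool_calls: int
    tokens: int
    fix_verified: bool  # did a passing test or runtime check confirm the fix?


def percent_change(baseline: float, candidate: float) -> float:
    """Relative change from baseline to candidate, in percent (negative = reduction)."""
    if baseline == 0:
        return 0.0  # avoid division by zero when the baseline recorded nothing
    return (candidate - baseline) / baseline * 100.0


def compare_runs(baseline: RunMetrics, candidate: RunMetrics) -> dict:
    """Compare a run without runtime context (baseline) to a run with it (candidate)."""
    return {
        "turns_pct": percent_change(baseline.turns, candidate.turns),
        "tool_calls_pct": percent_change(baseline.tool_calls, candidate.tool_calls),
        "tokens_pct": percent_change(baseline.tokens, candidate.tokens),
        "baseline_verified": baseline.fix_verified,
        "candidate_verified": candidate.fix_verified,
    }


# Placeholder values for illustration only; replace with your own measurements.
without_context = RunMetrics(turns=10, tool_calls=40, tokens=1000, fix_verified=False)
with_context = RunMetrics(turns=5, tool_calls=10, tokens=500, fix_verified=True)

for name, value in compare_runs(without_context, with_context).items():
    print(f"{name}: {value}")
```

One run per side is noisy, since token usage for the same task can vary widely between runs. Repeating each variant a few times and comparing averages gives a fairer picture.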
Wallaby CLI + AI tools
Use Wallaby CLI when your
agent needs to launch Wallaby itself. Claude Code, Codex, GitHub Copilot,
JetBrains Junie/AI Assistant, OpenCode, Pi, Cursor, Cline, Devin, and other agents
can run Wallaby locally, in worktrees, in containers, or on remote machines whenever
they need test and runtime feedback.
Read more about Wallaby CLI.
Use Wallaby AI tools and MCP
when Wallaby and your agent are running side by side. In the same local context,
your agent can ask the Wallaby extension for accurate test results, runtime values,
and coverage data on demand. Less guesswork, fewer wasted tokens, and faster
convergence on working code.
Wallaby for Python
Wallaby for Python is now available. It brings
Wallaby's instant test feedback, inline errors, coverage visualization, runtime values,
and AI-agent tooling to Python projects.
Wallaby for Python supports pytest and unittest, works with Python
3.8 through 3.14, and is available for VS Code editors. If you or
your team work across JavaScript/TypeScript and Python, you can now use the same
fast feedback loop across both ecosystems.
For AI agents: Wallaby for Python exposes the same MCP/AI runtime context for
Python tests, so agents can inspect failing tests, coverage, and runtime values instead
of guessing from source and raw terminal output.
Try Wallaby for Python →
Quokka v3
Quokka v3 is now available with
a redesigned output experience that makes runtime values easier to navigate, inspect,
compare, and understand.
The new list/details workflow, richer value exploration, output while code is running,
integrated diffs, and runtime value diagrams make Quokka better for exploratory coding
and debugging.
Read more about Quokka v3 →
Console Ninja
Console Ninja completes the runtime picture for app debugging. It gives your AI agent
browser/server logs and runtime errors from your running application, which helps most
when the bug is outside a unit test.
Learn more about Console Ninja PRO →
Use fewer tokens. Get better fixes.
Agentic coding is powerful, but every blind loop costs time and money. Wallaby and
Console Ninja give agents verified runtime facts, so they spend less time guessing and
more time converging on a working fix.
That is the edge: not a bigger prompt, but better feedback. Fewer wasted tokens, faster
verified fixes, and a workflow your agent can actually reason from.
Now is a good time to renew, add the missing pieces, or try the workflow
on a real problem in your own codebase. If you're already using Wallaby or Ninja with an
AI coding agent, reply and tell us what works, what doesn't, and what data your agent
still struggles to get.
Thanks for reading!
Regards, Simon McEnlly