Discussion about this post

Pawel Jozefiak

The latency compounding problem is real. I've been running an autonomous agent (Wiz) that builds apps during nightshifts, and when I didn't optimize the loop steps, a 2-second model call became 30 seconds of waiting across 15 iterations.
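A back-of-the-envelope sketch of that compounding, using the numbers above; the fixed-overhead term and variable names are illustrative, not Wiz's actual loop:

```python
# Each loop iteration pays the full model latency plus any fixed per-step
# overhead (tool spin-up, serialization, retries); the waiting adds up linearly.
MODEL_LATENCY_S = 2.0   # per-call latency from the example above
STEP_OVERHEAD_S = 0.0   # hypothetical extra cost per unoptimized step
ITERATIONS = 15

total_wait_s = ITERATIONS * (MODEL_LATENCY_S + STEP_OVERHEAD_S)
print(f"{total_wait_s:.0f} s of waiting")  # 2 s * 15 iterations = 30 s
```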

The sandboxing insight resonates. Cursor treats execution infrastructure with the same rigor as model development - that's what separates production agents from demos. When I analyzed who's actually winning in the agent space (https://thoughts.jock.pl/p/ai-agent-landscape-feb-2026-data), this infrastructure-first approach was the common thread.

Curious: how are you handling tool usage - baked into training or prompt-based?

Milo

Great deep dive on the systems engineering behind agents. The 'diff problem' section is particularly insightful - it explains why so many early AI coding tools felt unreliable even when the underlying model was decent.

The MoE + speculative decoding combo for latency is clever. Code's structural predictability makes it ideal for aggressive speculation. Curious if this approach will become standard as more teams train coding-specific models vs. fine-tuning general-purpose ones.
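For anyone who hasn't seen it, here is a toy sketch of the propose-then-verify idea behind speculative decoding; the `draft_model`/`target_model` stubs and the greedy acceptance rule are simplifications for illustration, not Cursor's implementation:

```python
def speculative_step(prefix, draft_model, target_model, k=4):
    """One propose-then-verify step (greedy case): a cheap draft model guesses
    k tokens; the target model keeps the longest prefix it agrees with and
    substitutes its own token at the first disagreement."""
    # 1) Draft model proposes k tokens cheaply, one after another.
    proposed, ctx = [], list(prefix)
    for _ in range(k):
        tok = draft_model(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # 2) Target model verifies (in real systems this is one batched forward pass).
    accepted = []
    for tok in proposed:
        target_tok = target_model(list(prefix) + accepted)
        if target_tok == tok:
            accepted.append(tok)          # draft guessed right: token is "free"
        else:
            accepted.append(target_tok)   # mismatch: fall back to the target's choice
            break
    return accepted

# Toy usage: both stub "models" echo a fixed continuation, so every proposal
# is accepted. Boilerplate-heavy code (imports, closing braces, repeated
# identifiers) is exactly where a small draft model guesses right this often.
continuation = ["def", "add", "(", "a", ",", "b", ")", ":"]
draft = lambda ctx: continuation[len(ctx)]
target = lambda ctx: continuation[len(ctx)]
print(speculative_step([], draft, target, k=4))  # ['def', 'add', '(', 'a']
```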

