Token Spend Out of Control? The Case for…

Jun 8

To understand how teams keep this under control in production, we sat down with Scott Breitenother and Sid Sijbrandij, co-founders of Kilo, an open-source coding agent that runs through a lot of these loops every day.

5 Comments

Mitchell Kosowski

Jun 8

The caching finding surprised me: 80%+ reuse and the bill still stays high, when most of us treat caching as the main lever. The point that lands hardest is "route on a signal you already have": most teams know whether a call is planning or a simple edit, but they never pass that down to the router.

Curious how you handle the quality hit when a tier swaps model families mid-task and has to drop the intermediate reasoning... do you tend to re-plan or just eat the context loss?

Clint Cain

Jun 8

Very detailed, and it helps identify how to navigate this chaotic landscape of choices for your build.

ZPF

Jun 8

https://substack.com/@zpftechnologies?r=86qmqm&utm_medium=ios&utm_source=stories&shareImageVariant=blur

Step into the frontier of quantum discoveries. Join our community advancing boundary-conditioned ZPF research—subscribe to support the mission and stay connected.

Jay Webster

Jun 8

Love this. Thank you.

Ariel Smoliar

Jun 8

Great framing! Static tier routing is table stakes now; the next layer is dynamic: factor in live provider health, rate limits, and whether the cheaper model actually produced equivalent outcomes.

That's what we're building at LOCO-Agent, open-source

github.com/ArielSmoliar/loco-agent

ByteByteGo Newsletter
