Discussion about this post

User's avatar
Mitchell Kosowski's avatar

The caching finding surprised me: 80%+ reuse and the bill still stays high, when most of us treat caching as the main lever. The point that lands hardest is "route on a signal you already have": most teams know whether a call is planning vs. a simple edit but they just never pass that down to the router.

Curious how you handle the quality hit when a tier swaps model families mid-task and has to drop the intermediate reasoning... do you tend to re-plan or just eat the context loss?

Clint Cain's avatar

Very detailed and to identify how to go about this chaotic landscape of choice for your build.

3 more comments...

No posts

Ready for more?