We’re dealing with GPU workloads, so the problem is quite different from CPU workers. Instead of relying on warm instances, we snapshot the full CPU and GPU execution state, so restore behaves like a resume rather than re-initialization.
Strong point, and it matches my own tests. I tested Sonnet 4.6 on real coding and long context workflows, and the pattern was consistent: results improved when I enforced agent handoff boundaries and explicit success checks. When context got noisy, reliability dropped even if raw speed looked good. I documented both experiments, including a personal context test, here: https://thoughts.jock.pl/p/sonnet-46-two-experiments-one-got-personal
Have you found a reliable way to catch silent failures early? I would love to compare notes on your exact setup.
We’re dealing with GPU workloads, so the problem is quite different from CPU workers. Instead of relying on warm instances, we snapshot the full CPU and GPU execution state, so restore behaves like a resume rather than re-initialization.
https://github.com/inferx-net/inferx
Very Interesting take. I have written this article about Cloudlfare and would be happy if you can check it out :)
https://marketsminds.substack.com/p/deep-dive-cloudflare-the-internets?utm_campaign=post-expanded-share&utm_medium=post%20viewer
Strong point, and it matches my own tests. I tested Sonnet 4.6 on real coding and long context workflows, and the pattern was consistent: results improved when I enforced agent handoff boundaries and explicit success checks. When context got noisy, reliability dropped even if raw speed looked good. I documented both experiments, including a personal context test, here: https://thoughts.jock.pl/p/sonnet-46-two-experiments-one-got-personal
Have you found a reliable way to catch silent failures early? I would love to compare notes on your exact setup.