<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[ByteByteGo Newsletter]]></title><description><![CDATA[Explain complex systems with simple terms, from the authors of the best-selling system design book series. Join over 1,000,000 friendly readers.]]></description><link>https://blog.bytebytego.com</link><image><url>https://substackcdn.com/image/fetch/$s_!1eXV!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F8a5609ae-1239-4400-9491-6010a15c4d60_504x504.png</url><title>ByteByteGo Newsletter</title><link>https://blog.bytebytego.com</link></image><generator>Substack</generator><lastBuildDate>Sat, 30 May 2026 22:22:02 GMT</lastBuildDate><atom:link href="https://blog.bytebytego.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[ByteByteGo]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[alex@bytebytego.com]]></webMaster><itunes:owner><itunes:email><![CDATA[alex@bytebytego.com]]></itunes:email><itunes:name><![CDATA[Alex Xu]]></itunes:name></itunes:owner><itunes:author><![CDATA[Alex Xu]]></itunes:author><googleplay:owner><![CDATA[alex@bytebytego.com]]></googleplay:owner><googleplay:email><![CDATA[alex@bytebytego.com]]></googleplay:email><googleplay:author><![CDATA[Alex Xu]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[How DoorDash Built a Testing System to Evaluate LLMs]]></title><description><![CDATA[In this article, we will learn how they built this flywheel and the key takeaways.]]></description><link>https://blog.bytebytego.com/p/how-doordash-built-a-testing-system</link><guid 
isPermaLink="false">https://blog.bytebytego.com/p/how-doordash-built-a-testing-system</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Sat, 30 May 2026 15:30:52 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!L2Ta!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/Datadog_060126">How to Track AI ROI in Real Time (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/Datadog_060126" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eGG5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 424w, https://substackcdn.com/image/fetch/$s_!eGG5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 848w, https://substackcdn.com/image/fetch/$s_!eGG5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 1272w, https://substackcdn.com/image/fetch/$s_!eGG5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eGG5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png" 
width="1456" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1988902,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/Datadog_060126&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/199798126?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eGG5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 424w, https://substackcdn.com/image/fetch/$s_!eGG5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 848w, https://substackcdn.com/image/fetch/$s_!eGG5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 1272w, https://substackcdn.com/image/fetch/$s_!eGG5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e514377-9065-4d16-b57c-160afec4e714_2800x1422.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Datadog&#8217;s guide shows you how to connect AI spend, infrastructure, and model performance into a single view, so you can catch cost spikes the moment they happen. 
See how Kevel cut AWS costs by up to $100,000/month after replacing reactive cost reviews with real-time visibility.<br><br>You&#8217;ll learn how to:</p><ul><li><p>Break down AI costs by token, model, provider, and team</p></li><li><p>Get alerted the instant inference volume spikes or API spend exceeds budget</p></li><li><p>Correlate cost increases directly to architecture changes so root-cause analysis takes minutes</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/Datadog_060126&quot;,&quot;text&quot;:&quot;Get the guide&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/Datadog_060126"><span>Get the guide</span></a></p><div><hr></div><p style="text-align: justify;">DoorDash&#8217;s customer support chatbot had a hallucination problem. Not the dramatic kind where it invents entire conversations, but the subtle, harder-to-catch kind.</p><p style="text-align: justify;">For example, the chatbot would look at a customer&#8217;s order history, see a delivery status field, misread it, and then confidently suggest a refund policy that didn&#8217;t actually exist. 
The raw data was right there in the chatbot&#8217;s context window, the working memory where an LLM holds everything it needs to generate a response, but having too much information was making things worse.</p><p style="text-align: justify;">For reference, DoorDash is one of the largest food delivery and local commerce platforms in the United States, connecting customers with restaurants and stores through a network of independent delivery drivers called Dashers.</p><p style="text-align: justify;">At that scale, the company handles hundreds of thousands of support contacts every day from customers, merchants, and Dashers, making automated support not just a nice-to-have but a necessity.</p><p style="text-align: justify;">The team could see the problem clearly, but fixing it was a different story. Every change they made to reduce hallucinations in one scenario risked creating new ones in another. They were stuck between two bad options. They could deploy changes to production and hope for the best, which meant risking real customer experiences. Or they could manually test dozens of conversation scenarios for every prompt change, which would take weeks and still might miss things.</p><p style="text-align: justify;">This tension isn&#8217;t unique to DoorDash. It&#8217;s the fundamental challenge anyone faces when they move from traditional deterministic software to LLM-based systems. DoorDash used to run customer support on hand-built decision trees, where every change had a predictable, traceable impact. LLMs replaced that predictability with flexibility and more natural conversations, but they also introduced non-determinism, meaning the same input can produce different outputs each time.</p><p style="text-align: justify;">DoorDash&#8217;s answer to this problem wasn&#8217;t a better chatbot. It was a better system for improving the chatbot, something they call the simulation and evaluation flywheel. 
In this article, we will learn how they built this flywheel and the key takeaways.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the DoorDash Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2 style="text-align: justify;">What the Flywheel Actually Does</h2><p style="text-align: justify;">The flywheel has two interconnected pieces:</p><ul><li><p style="text-align: justify;">The first is an offline simulator that generates realistic multi-turn customer conversations without involving any real customers.</p></li><li><p style="text-align: justify;">The second is an evaluation framework that automatically grades how the chatbot performed in those conversations.</p></li></ul><p style="text-align: justify;">Together, they create a tight iteration loop.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4eth!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4eth!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 424w, https://substackcdn.com/image/fetch/$s_!4eth!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 848w, https://substackcdn.com/image/fetch/$s_!4eth!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4eth!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4eth!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png" width="1456" height="1028" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1028,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:108732,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/199798126?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4eth!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 424w, https://substackcdn.com/image/fetch/$s_!4eth!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 848w, 
https://substackcdn.com/image/fetch/$s_!4eth!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 1272w, https://substackcdn.com/image/fetch/$s_!4eth!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f68eb05-7dc8-448a-b5ab-ad59f8da6c84_1688x1192.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a 
href="https://careersatdoordash.com/blog/doordash-simulation-evaluation-flywheel-to-develop-llm-chatbots-at-scale/">DoorDash Engineering Blog</a></figcaption></figure></div><p style="text-align: justify;">When the team notices a problem, they write an evaluation that captures the specific failure mode they want to fix. A single job trigger then orchestrates the entire pipeline end-to-end, automatically generating test scenarios from historical transcripts, running multi-turn conversations between the simulator and the chatbot, and evaluating the results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!USZ5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!USZ5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 424w, https://substackcdn.com/image/fetch/$s_!USZ5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 848w, https://substackcdn.com/image/fetch/$s_!USZ5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 1272w, https://substackcdn.com/image/fetch/$s_!USZ5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!USZ5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png" width="1456" height="1039" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1039,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:101184,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/199798126?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!USZ5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 424w, https://substackcdn.com/image/fetch/$s_!USZ5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 848w, https://substackcdn.com/image/fetch/$s_!USZ5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 1272w, https://substackcdn.com/image/fetch/$s_!USZ5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F735b862b-b37e-4817-99a1-1c498fefd6c2_2054x1466.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Then they modify the prompt or the system architecture, run the simulator again, and check whether the pass rate climbed. If it did, they keep going. If it didn&#8217;t, they try something else. 
They repeat this cycle until the pass rate hits their exit criteria, and then they deploy with confidence that the change actually works.</p><p style="text-align: justify;">The graph below shows the pass rate for the no-hallucination evaluation over time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cKsW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cKsW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 424w, https://substackcdn.com/image/fetch/$s_!cKsW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 848w, https://substackcdn.com/image/fetch/$s_!cKsW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 1272w, https://substackcdn.com/image/fetch/$s_!cKsW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cKsW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png" width="1456" height="810" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:810,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:235835,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/199798126?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cKsW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 424w, https://substackcdn.com/image/fetch/$s_!cKsW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 848w, https://substackcdn.com/image/fetch/$s_!cKsW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 1272w, https://substackcdn.com/image/fetch/$s_!cKsW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb6dd70f-ae10-40a8-84d1-cf5046bbbf48_2514x1398.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://careersatdoordash.com/blog/doordash-simulation-evaluation-flywheel-to-develop-llm-chatbots-at-scale/">DoorDash Engineering Blog</a></figcaption></figure></div><p style="text-align: justify;">The speed of this loop makes this a powerful approach. DoorDash can run more than 200 simulated conversations in under five minutes and get automated evaluation results immediately.</p><p style="text-align: justify;">In other words, what used to take days of manual testing and review now takes hours. 
And because everything happens offline, they never risk degrading the experience for real customers while they iterate.</p><p style="text-align: justify;">Their evaluation suite has grown to more than 50 evaluations covering hallucination detection, tone assessment, issue classification, and other quality dimensions. Before any change goes to production, it must pass the full suite, which serves as both a quality check and a regression test.</p><p style="text-align: justify;">The flywheel sounds straightforward, but both the simulator and the evaluator required solving genuinely hard problems.</p><div><hr></div><h2><a href="https://go.bytebytego.com/Unleashed_060126">FeatureOps Summit 2026 - Feature management in the AI Era (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/Unleashed_060126" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xQ3q!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!xQ3q!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!xQ3q!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!xQ3q!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xQ3q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png" width="1200" height="1200" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1200,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:256380,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/Unleashed_060126&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198890332?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!xQ3q!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!xQ3q!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!xQ3q!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xQ3q!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7fe5105-3674-489a-ba00-e9e871fe1b21_1200x1200.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Speed without control is a false economy. 
As AI code-generation accelerates software delivery, the <strong>FeatureOps Summit 2026</strong> is here to ensure that when we ship more, we break less. This premier virtual event brings together engineers, architects, and product leaders to explore the infrastructure of fearless delivery.</p><p><strong>Key Themes:</strong></p><ul><li><p><strong>AI Safety Nets:</strong> Guardrails for the flood of automated code.</p></li><li><p><strong>Edge Resilience:</strong> Sub-millisecond evaluation at scale.</p></li><li><p><strong>Continuous Flow:</strong> Moving past the &#8220;fixed-release&#8221; mindset.</p></li></ul><p>Register today to master the tools and patterns required for a fail-safe release environment.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/Unleashed_060126&quot;,&quot;text&quot;:&quot;Register Today&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/Unleashed_060126"><span>Register Today</span></a></p><div><hr></div><h2 style="text-align: justify;">Simulating Customers That Push Back</h2><p style="text-align: justify;">A static test case can check whether the chatbot gives a reasonable answer to a single message, but it can&#8217;t capture what happens when a frustrated customer pushes back three times, provides additional information mid-conversation, or threatens to escalate.</p><p style="text-align: justify;">DoorDash&#8217;s simulator doesn&#8217;t use scripted messages at all.</p><p style="text-align: justify;">Instead, it uses an LLM to play the customer role, generating dynamic responses based on detailed test scenarios. 
At each turn, the simulator runs through a structured analysis, asking questions such as:</p><ul><li><p style="text-align: justify;">Was the issue addressed?</p></li><li><p style="text-align: justify;">Is the conversation making progress?</p></li><li><p style="text-align: justify;">Does the customer need to provide more information?</p></li><li><p style="text-align: justify;">Is the conversation going in circles?</p></li></ul><p style="text-align: justify;">Based on this analysis, it decides what a realistic customer would say next.</p><p style="text-align: justify;">The test scenarios themselves come from real historical support transcripts, not from engineers imagining what customers might say.</p><p style="text-align: justify;">LLMs analyze past conversations from DoorDash&#8217;s database and extract structured behavioral profiles, including the customer&#8217;s personality traits (frustrated and demanding versus confused and patient), a detailed narrative of the situation, and the specific outcome the customer is seeking. 
This grounds the simulator in actual customer behavior rather than idealized test cases.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!L2Ta!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!L2Ta!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 424w, https://substackcdn.com/image/fetch/$s_!L2Ta!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 848w, https://substackcdn.com/image/fetch/$s_!L2Ta!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 1272w, https://substackcdn.com/image/fetch/$s_!L2Ta!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!L2Ta!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png" width="1456" height="1313" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1313,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138924,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/199798126?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!L2Ta!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 424w, https://substackcdn.com/image/fetch/$s_!L2Ta!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 848w, https://substackcdn.com/image/fetch/$s_!L2Ta!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 1272w, https://substackcdn.com/image/fetch/$s_!L2Ta!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5832df44-5f71-4dcf-b4e9-6f38f771758d_2054x1852.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The simulator also exhibits realistic escalation patterns. It doesn&#8217;t immediately ask for a manager. Instead, it gives the chatbot multiple chances to resolve the issue, only escalating after repeated unhelpfulness or circular exchanges, and re-engaging when progress becomes clear again. This mirrors how real customers behave.</p><p style="text-align: justify;">For a simulated conversation to be meaningful, the chatbot also needs realistic backend data. It needs to look up delivery status, check refund eligibility, and pull order details. DoorDash handles this through mock data that blends real production data with scenario-specific test data, preserving timestamps and relationships to keep interactions realistic. 
This allows them to test complex edge cases, including fraud scenarios and high-value refunds, that their previous testing infrastructure couldn&#8217;t handle.</p><h2 style="text-align: justify;">Using an LLM to Judge Another LLM</h2><p style="text-align: justify;">Running hundreds of realistic conversations is only useful if you can tell whether the chatbot actually handled them well. However, manually reading through every simulated conversation would defeat the entire purpose of automation. So DoorDash uses an LLM to evaluate the chatbot&#8217;s performance automatically.</p><p style="text-align: justify;">Each evaluation is structured as a function that takes the full conversation transcript (including tool calls and backend responses) along with the relevant company policy, applies a prompt asking whether the chatbot correctly followed that policy, and returns a binary pass or fail with the reasoning behind the judgment.</p><p style="text-align: justify;">The obvious objection here is that this sounds circular. If an LLM caused the problem by hallucinating, why would you trust another LLM to catch the hallucination?</p><p style="text-align: justify;">DoorDash addresses this directly with a concept they call the generator-verifier gap. Acting as a full customer support agent involves complex, multi-step decision-making across a huge range of possible scenarios. That&#8217;s genuinely hard. But verifying a single, narrowly defined behavior is a much simpler task.</p><p style="text-align: justify;">For example, &#8220;Did the chatbot claim the customer was eligible for a refund when the policy says otherwise?&#8221; is a straightforward binary question. The evaluator isn&#8217;t trying to be a better support agent. 
It&#8217;s checking one specific thing at a time, and LLMs are much more reliable at these focused binary judgments than they are at open-ended generation.</p><p style="text-align: justify;">But DoorDash doesn&#8217;t just trust the LLM judge out of the box. They calibrate it against human judgment through a structured process. They collect a sample of conversations, have human experts label each one as pass or fail, run the LLM judge on the same samples, and then measure how often the judge agrees with the humans, how often it misses real problems, and how often it flags problems that aren&#8217;t there. They analyze the reasoning behind any mismatches, revise the evaluation prompt to fix systematic errors, and repeat until the judge reliably matches human expert judgment. This calibration step creates trust in the system.</p><p style="text-align: justify;">The binary nature of the evaluations is important here. DoorDash isn&#8217;t asking the LLM to rate the chatbot&#8217;s performance on a subjective scale of 1 to 10. They&#8217;re asking whether the chatbot followed a specific policy. A pass-or-fail verdict makes calibration faster, makes disagreements easier to diagnose, and produces more reliable judgments.</p><h2>Fixing Hallucinations by Giving the Chatbot Less Information</h2><p style="text-align: justify;">With the simulator generating conversations and the evaluator grading them, DoorDash had a working flywheel.</p><p style="text-align: justify;">During early launches, human reviewers noticed the chatbot was getting overwhelmed by the sheer volume of data in its context window. Order histories, delivery status updates, refund decisions, and tool call results were all being fed directly to the model as raw event logs. The chatbot would misinterpret a field or suggest a policy that didn&#8217;t exist, not because the information was wrong, but because there was too much of it. 
This runs directly counter to the intuition that giving a model more information should produce better results.</p><p style="text-align: justify;">DoorDash hypothesized that the same data that was vital for the chatbot&#8217;s reasoning was becoming noise when it came time to generate a response to the customer. Their solution was an architectural layer they called the &#8220;case state,&#8221; which synthesizes the raw tool history into a structured, intermediate representation. Instead of dumping everything into the context window, the case state distills the relevant facts into a clean format that the chatbot can actually use.</p><p style="text-align: justify;">Getting the case state right required the flywheel. Their first attempts at extraction logic didn&#8217;t work well at all. Some versions left out critical information, causing the chatbot to miss details that were essential for driving resolutions. Other versions remained too noisy or poorly structured, confusing the model in different ways. Since the simulator could generate numerous realistic conversations in minutes, the team experimented with dozens of different context shapes and prompt strategies in a rapid feedback loop. Each iteration took hours instead of the weeks it would have required through manual testing.</p><p style="text-align: justify;">Over 11 iterations, the hallucination evaluation pass rate climbed steadily upward, with a notable dip at iteration 3, where a change actually made things temporarily worse. That dip shows that improvement isn&#8217;t linear, even with a flywheel, and that part of the flywheel&#8217;s value is catching regressions before they reach real customers.</p><p style="text-align: justify;">The final result was a 90% reduction in hallucinations in simulation, and that improvement carried over into production. 
The strong correlation between their offline metrics and live traffic performance gave the team confidence that the flywheel is a reliable development tool, not just an internal sandbox disconnected from reality.</p><h2 style="text-align: justify;">Conclusion</h2><p style="text-align: justify;">The simulation and evaluation flywheel has fundamentally changed how DoorDash develops and deploys chatbot improvements, compressing iteration cycles from days to hours and giving them a way to validate changes across hundreds of scenarios before any real customer is affected.</p><p style="text-align: justify;">However, the flywheel does come with real tradeoffs worth understanding.</p><p style="text-align: justify;">The main limitation is that it can only catch problems for which you&#8217;ve written evaluations. If a failure mode isn&#8217;t captured by an evaluation, the flywheel is blind to it. DoorDash mitigates this by running a full evaluation suite before every deployment, covering hallucination, tone, and issue classification, but new failure modes can always emerge that existing evaluations don&#8217;t cover. This is why human review remains the starting point for every improvement cycle. Despite all the automation, someone still has to look at real conversations and notice what&#8217;s going wrong.</p><p style="text-align: justify;">Simulation fidelity is another inherent limitation. Even with transcript-derived scenarios and hybrid mock data, synthetic conversations are approximations of real user behavior. DoorDash reports a strong correlation between its offline metrics and production results, which validates the approach, but that correlation isn&#8217;t guaranteed to hold for every type of scenario or every kind of system change.</p><p style="text-align: justify;">There&#8217;s also the question of cost. Running hundreds of LLM-to-LLM conversations per test cycle, plus LLM-as-judge evaluations on each one, requires significant compute. 
For smaller teams or less critical applications, a lighter-weight version with fewer scenarios and more targeted evaluations might be the pragmatic starting point.</p><p style="text-align: justify;">The broader takeaway is that LLM systems require a completely different testing paradigm than traditional software. Since we can&#8217;t trace the branch anymore, we need a feedback loop that lets us simulate, evaluate, and iterate fast enough to build confidence before shipping.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p style="text-align: justify;"><a href="https://careersatdoordash.com/blog/doordash-simulation-evaluation-flywheel-to-develop-llm-chatbots-at-scale/">A Simulation and Evaluation Flywheel to build LLM Chatbots at Scale</a></p></li><li><p style="text-align: justify;"><a href="https://en.wikipedia.org/wiki/LLM-as-a-Judge">LLM as a Judge Pattern</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[Must-Know Failure Modes in Distributed Systems]]></title><description><![CDATA[In this article, we will look at the most significant failure mode patterns in distributed systems and the standard approaches to deal with each of them.]]></description><link>https://blog.bytebytego.com/p/must-know-failure-modes-in-distributed</link><guid isPermaLink="false">https://blog.bytebytego.com/p/must-know-failure-modes-in-distributed</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Thu, 28 May 2026 16:31:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!VDoG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p style="text-align: justify;">What does it mean for a distributed system to be up?</p><p style="text-align: justify;">On a single machine, the answer is straightforward, since a program is either running or it has crashed, 
and the line between the two is usually obvious from a stack trace.</p><p style="text-align: justify;">Distributed systems are not so simple. Every server can report healthy while users are seeing errors, the whole system can be technically working but stuck in a state it cannot recover from on its own, and it can quietly serve wrong data while every dashboard glows green.</p><p style="text-align: justify;">None of these is necessarily a bug in the conventional sense. They are recurring failure patterns that have been showing up across systems for decades, with names, mechanisms, and standard ways of defending against them.</p><p style="text-align: justify;">In this article, we will look at the most significant failure mode patterns in distributed systems and the standard approaches to deal with each of them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VDoG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VDoG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 424w, https://substackcdn.com/image/fetch/$s_!VDoG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 848w, https://substackcdn.com/image/fetch/$s_!VDoG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 1272w, 
https://substackcdn.com/image/fetch/$s_!VDoG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VDoG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png" width="1456" height="1698" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1698,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:431563,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675526?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VDoG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 424w, https://substackcdn.com/image/fetch/$s_!VDoG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 848w, 
https://substackcdn.com/image/fetch/$s_!VDoG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 1272w, https://substackcdn.com/image/fetch/$s_!VDoG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cc1176a-e45f-4b31-b860-38cb99c198bd_2250x2624.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Why Failures in Distributed Systems Are Different</h2>
      <p>
          <a href="https://blog.bytebytego.com/p/must-know-failure-modes-in-distributed">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[How Airtable Built the Search Layer Behind Their AI Features]]></title><description><![CDATA[In this article, we will look at how Airtable&#8217;s data infrastructure team built its architecture, the challenges they faced, the tradeoffs they accepted, and why the choices they made only make sense once their data is properly understood.]]></description><link>https://blog.bytebytego.com/p/how-airtable-built-the-search-layer</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-airtable-built-the-search-layer</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Wed, 27 May 2026 15:30:43 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HKve!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/WorkOS_052726">WorkOS launches auth.md - an open protocol for agent registration (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/WorkOS_052726" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h5lV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 424w, https://substackcdn.com/image/fetch/$s_!h5lV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 848w, 
https://substackcdn.com/image/fetch/$s_!h5lV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 1272w, https://substackcdn.com/image/fetch/$s_!h5lV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h5lV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png" width="1456" height="799" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:799,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:448246,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/WorkOS_052726&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h5lV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 424w, 
https://substackcdn.com/image/fetch/$s_!h5lV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 848w, https://substackcdn.com/image/fetch/$s_!h5lV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 1272w, https://substackcdn.com/image/fetch/$s_!h5lV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbb614ba2-3051-4017-81df-cf51c4fe6e26_2386x1310.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Sign-up forms were built for humans in browsers, so how do AI agents programmatically register with services?</p><p>Enter <strong>auth.md.</strong> By exposing a single, machine-readable Markdown file at your service root, AI agents can dynamically discover your OAuth Protected Resource Metadata, parse required scopes, and authenticate seamlessly.</p><p>With native support in WorkOS AuthKit, you can now implement this protocol out of the box, giving AI tools a standardized, secure way to log into your application.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/WorkOS_052726&quot;,&quot;text&quot;:&quot;Read the auth.md docs&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/WorkOS_052726"><span>Read the auth.md docs</span></a></p><div><hr></div><p style="text-align: justify;">Airtable holds embeddings for hundreds of thousands of customer databases, and on any given week, roughly three-quarters of them sit completely idle. This fact, more than any algorithm or vendor choice, decided the architecture behind their semantic search system. The interesting story is not which vector database they picked. It is how one peculiar property of their data forced a specific chain of engineering decisions, each one logical only in light of the one before it.</p><p style="text-align: justify;">Airtable is a platform where customers build their own database-like applications, organized into &#8220;bases&#8221; that often hold hundreds of thousands of rows. Their AI feature, called Omni, lets users ask natural-language questions of their data and get answers back in plain English. A separate feature, linked record recommendations, suggests relationships between rows based on meaning rather than exact text matches. 
Both features depend on the same underlying capability, which is finding the rows in a base that are semantically relevant to a user&#8217;s intent.</p><p style="text-align: justify;">This might sound simple until scale enters the picture. When a base has half a million rows, fitting all of them into a single LLM prompt becomes infeasible. The model has limits on how much context it can absorb, and even if those limits did not exist, sending that much data on every query would be slow and expensive. The system has to find the most relevant rows fast, then hand those rows to the LLM as context.</p><p style="text-align: justify;">In this article, we will look at how Airtable&#8217;s data infrastructure team built its architecture, the challenges they faced, the tradeoffs they accepted, and why the choices they made only make sense once their data is properly understood.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Airtable Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2 style="text-align: justify;">The Data and the Constraints</h2><p style="text-align: justify;">The Airtable team anchored their work around four design priorities:</p><ul><li><p style="text-align: justify;">Queries had to return within 500 milliseconds at the 99th percentile, which means 99 percent of queries had to come back within that window, with only the slowest 1 percent allowed to take longer. 
Anything slower would make the AI features feel sluggish.</p></li><li><p style="text-align: justify;">Writes had to be high-throughput since customer data changes constantly, and embeddings had to keep pace.</p></li><li><p style="text-align: justify;">The system had to scale horizontally to support millions of independent bases.</p></li><li><p style="text-align: justify;">Everything had to be self-hosted because customer data privacy required keeping it all inside Airtable-controlled infrastructure.</p></li></ul><p style="text-align: justify;">Beyond those priorities, Airtable&#8217;s data has three properties worth flagging early:</p><ul><li><p style="text-align: justify;">Customer bases vary enormously in size, with some holding a handful of rows and others holding hundreds of thousands.</p></li><li><p style="text-align: justify;">Each base is isolated, meaning one customer&#8217;s data must never leak into another customer&#8217;s results.</p></li><li><p style="text-align: justify;">Most bases are idle most of the time, a fact that becomes important in a later section.</p></li></ul><p style="text-align: justify;">Before going further, we need to understand what an embedding is.</p><p style="text-align: justify;">An embedding is a list of numbers, typically several hundred to a few thousand of them, generated by a neural network. The network is trained so that two pieces of text with similar meanings produce numerically close vectors. An embedding can be thought of as a fingerprint of meaning, where similarity in the numbers reflects similarity in what the text says.</p><p style="text-align: justify;">One important practical fact is that embeddings are typically about ten times the size of the original data they represent, which is why Airtable cannot just store them alongside the source rows in their primary database. 
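</p><p style="text-align: justify;">As a rough illustration of that size gap, the sketch below compares the bytes in one row of text with the bytes in its embedding. The 1,024-dimension vector, the 32-bit floats, and the 400-byte row are assumptions chosen for illustration, not figures published by Airtable.</p><pre><code>
```python
BYTES_PER_FLOAT32 = 4  # assumption: each dimension is stored as a 32-bit float


def embedding_size_bytes(dimensions):
    """Storage for one raw embedding, ignoring any index overhead."""
    return dimensions * BYTES_PER_FLOAT32


# Assumed values for illustration only.
dimensions = 1024
row_text = "x" * 400  # stand-in for a row holding 400 bytes of text

row_bytes = len(row_text.encode("utf-8"))
vector_bytes = embedding_size_bytes(dimensions)
ratio = vector_bytes / row_bytes

print(vector_bytes, "bytes of embedding for", row_bytes, "bytes of text")
print("ratio:", round(ratio, 1))
```
</code></pre><p style="text-align: justify;">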
A separate system is needed, one designed specifically for storing and searching across these large numerical vectors.</p><p style="text-align: justify;">The asynchronous embedding pipeline that generates and updates these vectors as customer data changes is a separate system. The focus here is the database that stores the embeddings and serves queries against them. After evaluating the landscape in late 2024, Airtable selected Milvus as that database because it supported self-hosting, handled multi-tenancy through its partition model, and let them scale ingestion, indexing, and query execution as separate components. Picking Milvus, though, was the easy part. The hard part was figuring out how to organize Airtable&#8217;s data inside it.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AB2i!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AB2i!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 424w, https://substackcdn.com/image/fetch/$s_!AB2i!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 848w, https://substackcdn.com/image/fetch/$s_!AB2i!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 1272w, 
https://substackcdn.com/image/fetch/$s_!AB2i!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AB2i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png" width="1456" height="782" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:782,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138669,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AB2i!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 424w, https://substackcdn.com/image/fetch/$s_!AB2i!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 848w, 
https://substackcdn.com/image/fetch/$s_!AB2i!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 1272w, https://substackcdn.com/image/fetch/$s_!AB2i!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0e2d47c3-a61a-4cb5-9668-512a3eccc8fc_2284x1226.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2 style="text-align: justify;">Partitioning Strategy</h2><p style="text-align: justify;">The first real architectural question was 
how to slice up customer data so that millions of bases can coexist in one system without leaking into each other.</p><p style="text-align: justify;">Two options were on the table.</p><p style="text-align: justify;">The first option, shared partitions, would put many bases together in the same physical slice and rely on a customer ID filter at query time to keep results separate. This approach uses resources efficiently because no partition has to exist for every customer, and small bases do not sit around taking up dedicated storage. The cost is that every query carries the overhead of filtering by customer ID, and deleting a customer&#8217;s data becomes complicated because the rows are scattered across shared partitions.</p><p style="text-align: justify;">The second option, one partition per base, gives each base its own physical slice. Queries are naturally isolated because they only ever touch one partition. Deletion is trivial since dropping the partition is enough. The cost is operational. With millions of bases, the database ends up managing millions of partitions, which puts pressure on its internal bookkeeping.</p><p style="text-align: justify;">Airtable picked the second option. The reasoning was that strong physical isolation made permission boundaries obvious, deletion stayed simple, and queries avoided the latency cost of filtering by customer ID.</p><p style="text-align: justify;">Then the team ran into a problem.</p><p style="text-align: justify;">At around 100,000 partitions inside a single Milvus collection, performance fell off a cliff. Partition creation latency went from about 20 milliseconds to roughly 250 milliseconds. Loading a partition started taking more than 30 seconds. Adding hardware would not have fixed any of this, because the issue was not a shortage of capacity. 
The issue was that too many partitions in one collection overwhelmed the bookkeeping that the database needed to keep them organized.</p><p style="text-align: justify;">The fix was hierarchical capping.</p><p style="text-align: justify;">Each Milvus cluster now holds 400 collections, and each collection holds at most 1,000 partitions, which limits any single cluster to 400,000 bases. As the customer base grows, Airtable provisions new clusters rather than packing more partitions into existing ones.</p><p style="text-align: justify;">The structure trades some operational complexity for predictable performance at every layer. See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HKve!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HKve!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 424w, https://substackcdn.com/image/fetch/$s_!HKve!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 848w, https://substackcdn.com/image/fetch/$s_!HKve!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 1272w, https://substackcdn.com/image/fetch/$s_!HKve!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HKve!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png" width="1456" height="1310" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1310,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:195717,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HKve!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 424w, https://substackcdn.com/image/fetch/$s_!HKve!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 848w, https://substackcdn.com/image/fetch/$s_!HKve!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 1272w, 
https://substackcdn.com/image/fetch/$s_!HKve!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2a5a8e4b-1a40-4586-98a0-201558e6bc18_2112x1900.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">Permissions deserve a brief discussion before we move further. Milvus does not know anything about who is allowed to see what data. It just stores embeddings and returns matches. 
Permission checks happen later, when the application layer takes the row IDs returned by Milvus and fetches the actual rows from Airtable&#8217;s primary database. This split keeps the vector search system focused on a single job, which is similarity search, and authorization stays where authorization always has lived.</p><p style="text-align: justify;">The pattern of hierarchical capping shows up across distributed systems, from sharded relational databases to message broker topics. Any flat namespace eventually hits a wall, and the fix is almost always to introduce another level of grouping above it. Recognizing this principle is more transferable than memorizing the specific numbers.</p><p style="text-align: justify;">See the diagram below that shows the query flow:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jnv3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jnv3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 424w, https://substackcdn.com/image/fetch/$s_!jnv3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 848w, https://substackcdn.com/image/fetch/$s_!jnv3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 1272w, 
https://substackcdn.com/image/fetch/$s_!jnv3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jnv3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png" width="1456" height="806" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:806,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:115189,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jnv3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 424w, https://substackcdn.com/image/fetch/$s_!jnv3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 848w, 
https://substackcdn.com/image/fetch/$s_!jnv3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 1272w, https://substackcdn.com/image/fetch/$s_!jnv3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecc858c6-5b49-41ab-8ac5-60b0d7f94a93_2216x1226.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">Once the data has been sliced up, the next question is how to actually search inside each slice.</p><h2 
style="text-align: justify;">Index Selection</h2><p style="text-align: justify;">Vector search at scale involves an unavoidable tradeoff among three currencies: memory, latency, and recall.</p><p style="text-align: justify;">Recall is the percentage of truly relevant results that show up in a query response. Every vector index pays for performance in at least one of these three currencies, and no option gets all three for free.</p><p style="text-align: justify;">Airtable benchmarked three index types, and the results map cleanly onto this triangle.</p><p style="text-align: justify;">HNSW, which stands for Hierarchical Navigable Small World, builds a graph where similar vectors are connected to each other. A query starts at a small set of entry points near the top of the graph and follows the connections downward, hopping from one vector to its nearest neighbors until it converges on the closest matches. HNSW is fast at lookup time, achieves recall in the 99 to 100 percent range, and behaves predictably under load. The cost is that the entire graph has to live in memory, which makes HNSW the most memory-hungry of the three options.</p><p style="text-align: justify;">IVF-SQ8 takes a different approach. The IVF part clusters vectors into groups, so a query only has to search inside the most relevant groups rather than the full dataset. The SQ8 part compresses each number in the vector from four bytes to one byte, cutting vector storage to roughly a quarter. The footprint becomes much smaller, but the compression introduces approximation error that lowers recall.</p><p style="text-align: justify;">DiskANN keeps most of the index on solid-state storage rather than in memory. It scales to enormous datasets per node because holding everything in RAM is not required. 
The cost is that every query touches disk, and disk is slower than memory, so query latency rises.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_yr5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_yr5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 424w, https://substackcdn.com/image/fetch/$s_!_yr5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 848w, https://substackcdn.com/image/fetch/$s_!_yr5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 1272w, https://substackcdn.com/image/fetch/$s_!_yr5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_yr5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png" width="1456" height="1157" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1157,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:179910,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_yr5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 424w, https://substackcdn.com/image/fetch/$s_!_yr5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 848w, https://substackcdn.com/image/fetch/$s_!_yr5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 1272w, https://substackcdn.com/image/fetch/$s_!_yr5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2ba5d15c-70f3-49bd-8b57-129c63a6f510_2230x1772.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">Airtable chose HNSW. Given the priorities from earlier in the design, this was almost the only available answer. A 500-millisecond latency target ruled out DiskANN&#8217;s higher per-query cost. Recall directly determines how good Omni&#8217;s responses feel to users, which makes the high recall of HNSW worth paying for. The memory cost remained a real concern, but Airtable had a separate way to handle it.</p><p>The right index does not exist in the abstract. It exists relative to the priorities and constraints of a specific system. If Airtable&#8217;s latency budget had been looser, DiskANN would have been an interesting candidate. If they could have accepted lower recall, IVF-SQ8 would have saved them money. 
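</p><p style="text-align: justify;">To see where both the saving and the recall loss of SQ8 come from, here is a minimal sketch of 8-bit scalar quantization. It is a simplification under stated assumptions: the function names are hypothetical, and a single fixed value range of -1.0 to 1.0 is used for every dimension, whereas real index implementations typically learn the range from the data.</p><pre><code>
```python
def quantize(vector, lo, hi):
    """Map each float in [lo, hi] to an integer code in 0..255 (one byte)."""
    scale = (hi - lo) / 255
    # Clamp first so out-of-range values still fit in a single byte.
    return bytes(round((min(max(x, lo), hi) - lo) / scale) for x in vector)


def dequantize(codes, lo, hi):
    """Recover approximate floats from the one-byte codes."""
    scale = (hi - lo) / 255
    return [lo + code * scale for code in codes]


# Assumed toy vector; real embeddings have hundreds of dimensions.
vector = [0.12, -0.57, 0.93, 0.004]
codes = quantize(vector, -1.0, 1.0)
restored = dequantize(codes, -1.0, 1.0)
max_error = max(abs(a - b) for a, b in zip(vector, restored))

print(len(vector) * 4, "bytes as float32 vs", len(codes), "bytes as SQ8")
print("largest rounding error:", max_error)
```
</code></pre><p style="text-align: justify;">Each value now costs one byte instead of four, and the rounding error per value is bounded by half a quantization step. Those small errors shift computed distances slightly, which is how a compressed index can miss some true nearest neighbors.</p><p>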
None of the three options is universally better than the others.</p><p style="text-align: justify;">This same triangular pattern repeats across systems engineering. Caching works the same way, where hit rate trades against memory and consistency. Database indexes work the same way, where read speed trades against write speed and storage. The technologies stop feeling intimidating once the underlying tradeoff becomes recognizable.</p><h2 style="text-align: justify;">Hot and Cold Data</h2><p style="text-align: justify;">Picking HNSW solved the latency and recall problem, but pushed the entire cost onto memory. Across hundreds of thousands of bases, that memory bill adds up quickly. The team needed a way to shrink it without giving up the index they had just chosen.</p><p style="text-align: justify;">The solution came from looking at how customers actually use Airtable. When the team analyzed access patterns, they found that only about 25 percent of bases were read from or written to in any given week. The other 75 percent sat completely idle. This was not an anomaly. It reflected something real about how people work. Users tend to focus intensively on one base for a stretch of time, set it aside for weeks or months, and then come back when the project requires their attention again.</p><p style="text-align: justify;">Milvus supports offloading partitions from memory to storage and reloading them within seconds. With that capability, Airtable could keep only the hot partitions in memory and push the cold ones out. 
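</p><p style="text-align: justify;">A minimal sketch of how such a hot and cold policy could be driven is shown below. The class name, the one-week idle threshold, and the <code>load</code> and <code>release</code> callbacks are all hypothetical stand-ins for whatever the vector database client exposes, and nothing here is taken from Airtable&#8217;s implementation.</p><pre><code>
```python
import time

WEEK_SECONDS = 7 * 24 * 3600  # assumed idle threshold, not a published figure


class PartitionResidency:
    """Tracks which partitions are in memory and releases idle ones.

    `load` and `release` are hypothetical callbacks that take a partition
    name and move it into or out of memory.
    """

    def __init__(self, load, release, idle_after=WEEK_SECONDS, clock=time.time):
        self._load = load
        self._release = release
        self._idle_after = idle_after
        self._clock = clock
        self._last_access = {}  # partition name -> timestamp of last use

    def touch(self, partition):
        """Call before every query; loads the partition if it is cold."""
        if partition not in self._last_access:
            self._load(partition)
        self._last_access[partition] = self._clock()

    def evict_idle(self):
        """Release every partition idle for longer than the threshold."""
        now = self._clock()
        idle = [
            name
            for name, last_used in self._last_access.items()
            if now - last_used > self._idle_after
        ]
        for name in idle:
            self._release(name)
            del self._last_access[name]
        return idle
```
</code></pre><p style="text-align: justify;">In this sketch, a background job would call <code>evict_idle()</code> periodically, and the query path would call <code>touch()</code> first, so a cold partition is reloaded on its first access.</p><p style="text-align: justify;">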
When a user opens a base that has not been touched in weeks, the partition reloads quickly enough that the user notices a brief warm-up rather than a failure.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ivTm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ivTm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 424w, https://substackcdn.com/image/fetch/$s_!ivTm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 848w, https://substackcdn.com/image/fetch/$s_!ivTm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 1272w, https://substackcdn.com/image/fetch/$s_!ivTm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ivTm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png" width="1456" height="1087" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1087,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:136461,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675818?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ivTm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 424w, https://substackcdn.com/image/fetch/$s_!ivTm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 848w, https://substackcdn.com/image/fetch/$s_!ivTm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 1272w, https://substackcdn.com/image/fetch/$s_!ivTm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbbd37d3a-ccc6-408d-823a-3c79ec9574b2_1980x1478.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">This approach works for Airtable specifically because their access pattern is bursty and bimodal. If usage were spread evenly across all bases, with every customer constantly touching their data at the same low rate, cold offloading would not save much. The hot set would be the entire dataset. Airtable&#8217;s pattern is the opposite. A small fraction of bases is active at any moment, and the active set rotates over time.</p><p style="text-align: justify;">What made this work was measurement.</p><p style="text-align: justify;">The Airtable engineering team did not guess about access patterns and did not reach for a generic optimization. They looked at the data, found a property of their actual usage, and built around it. 
The HNSW choice became economically viable because of this measurement, and the decisions in this system reinforce each other in a way that would not be obvious from evaluating any one of them in isolation.</p><h2 style="text-align: justify;">Recovery</h2><p style="text-align: justify;">The traditional approach to disaster recovery in databases is backup and restore. Snapshots get taken regularly, stored somewhere safe, and used to rebuild the system if something catastrophic happens. Airtable went a different direction.</p><p style="text-align: justify;">Their recovery path is to spin up a fresh Milvus cluster and re-embed customer data from the source. The most-used bases get re-embedded first so that most users see normal service quickly. The remaining bases get rebuilt lazily as customers access them. There is some compute cost during recovery and some delay before every base is fully back, but the path is conceptually simple and works across many failure modes at once. Corruption, model migrations, and certain data residency changes all reduce to the same procedure.</p><p style="text-align: justify;">This option is only available because Airtable has already built an asynchronous embedding pipeline as part of earlier work. That pipeline normally generates new embeddings whenever customer data changes, processing them in the background rather than blocking writes. Recovery is not a separate system created for emergencies. It is just the existing pipeline running against an empty cluster.</p><h2 style="text-align: justify;">Conclusion</h2><p style="text-align: justify;">The system built by Airtable involves four major tradeoffs: how to partition the data, which index to use, when to keep data in memory, and how to recover from failure.</p><p style="text-align: justify;">Every one of those decisions traces back to the same upstream fact about Airtable&#8217;s tenants. Their customers run small, isolated bases that are mostly cold most of the time. 
Changing any one of those properties can cause the design to fall apart.</p><p style="text-align: justify;">For example, a workload where every base is hot all the time would make cold offloading useless. A workload requiring strict consistency would not tolerate the asynchronous embedding pipeline. A workload with very small per-customer datasets might benefit more from shared partitions than from one-per-base.</p><p style="text-align: justify;">The technologies Airtable uses, including Milvus, HNSW, and the rest, are interchangeable in principle. The same system could be rebuilt on a different infrastructure, and the architectural reasoning would still hold. What is harder to replicate is the discipline of letting the data drive the architecture rather than the other way around.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p><a href="https://medium.com/airtable-eng/productionizing-semantic-search-how-we-built-and-scaled-vector-infrastructure-at-airtable-180fff11a136">Productionizing Semantic Search: How We Built and Scaled Vector Infrastructure at Airtable</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Airtable">What is Airtable?</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[How Vercel Cut Build Wait Times From 90 Seconds To 5]]></title><description><![CDATA[In this article, we examine the constraints Vercel faced, the choices they made in response, and the optimizations that produced the speedup.]]></description><link>https://blog.bytebytego.com/p/how-vercel-cut-build-wait-times-from</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-vercel-cut-build-wait-times-from</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Tue, 26 May 2026 15:31:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!4rvh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png" 
length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/GitLab_052626">GitLab Transcend is back, June 10, streaming live from London. (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/GitLab_052626" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Dw9P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Dw9P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Dw9P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Dw9P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Dw9P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg" width="1100" height="577" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:577,&quot;width&quot;:1100,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:null,&quot;href&quot;:&quot;https://go.bytebytego.com/GitLab_052626&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Dw9P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Dw9P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Dw9P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Dw9P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F51fe6edb-8130-47ab-8fbc-debf97c06e3e_1100x577.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On June 10, GitLab Transcend streams live from London with an agenda built for practitioners like you. You can expect an agenda that&#8217;s full of keyboard moments with live demos of Duo Agent Platform, agentic AI use cases from your peers, and The Developer Show hosted live by Senior Developer Advocate, Colleen Lake. Register today.</p><p>GitLab Transcend streams live from London on June 10 (with regional replays for APAC and AMER on June 11). 
Register for free today.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/GitLab_052626&quot;,&quot;text&quot;:&quot;Stream the event&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/GitLab_052626"><span>Stream the event</span></a></p><div><hr></div><p style="text-align: justify;">In November 2023, Vercel quietly shipped an internal platform that cut its build provisioning time from 90 seconds to 5. That sounds like a story about making things faster. It is, but only on the surface. The real story is that Vercel got faster by accepting a harder constraint, building a more complicated foundation, and then layering three separate optimizations on top of it. The 18x improvement is the result.</p><p style="text-align: justify;">Vercel is a deployment platform for web applications. When a developer pushes code to a connected repository, Vercel pulls that code, runs the build process (compiling, bundling assets, packaging the output) on its own servers, and then deploys the result to a global edge network of geographically distributed servers that deliver the site to end users. The build step happens on Vercel&#8217;s infrastructure, which means thousands of customers run their build scripts on machines that Vercel manages. Every push has to feel instant to the developer, has to run safely on shared hardware, and has to scale through traffic spikes without degrading.</p><p style="text-align: justify;">The platform that handles all of this is internally codenamed Hive, and it has been powering Vercel&#8217;s builds since late 2023.</p><p style="text-align: justify;">Hive is the reason behind the 90-to-5 transformation. 
In this article, we examine the constraints Vercel faced, the choices they made in response, and the optimizations that produced the speedup.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Vercel Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2>The Trust Problem</h2><p style="text-align: justify;">The architecture rests on a single foundational assumption. Hive operates as if every piece of code it executes might be malicious and is running on machines shared by many tenants at once. That assumption shapes everything that follows.</p><p style="text-align: justify;">It matters because the trust calculation flips entirely between two situations. When a team runs its own code on its own server, the goal is performance and convenience. The code trusts the machine, and the machine trusts the code. When the code comes from someone else and runs on shared hardware, the calculation changes. The platform has to assume the code might try to break out of its sandbox, read another customer&#8217;s secrets, or interfere with builds running on the same machine. This is hostile multi-tenancy, and it is a different infrastructure problem from running cooperative workloads.</p><p style="text-align: justify;">Vercel sits squarely in this harder category.</p><p style="text-align: justify;">Every customer push is, from Vercel&#8217;s perspective, code written by someone the team has never met, running on a machine that is also running other customers&#8217; code at the same time. The build script could be a normal Next.js compilation, or it could be a deliberately crafted exploit designed to escape the sandbox. 
Vercel has to handle both cases identically, since the platform cannot tell the difference in advance.</p><p style="text-align: justify;">The obvious answer is to run each build inside a Docker container.</p><p style="text-align: justify;">Containers are how modern infrastructure runs isolated workloads, and most engineers reach for them by reflex. The problem is that containers were designed primarily as a packaging tool, with isolation as a useful side effect. Multiple containers on the same machine all share the same Linux kernel, which is the part of the operating system with direct access to the hardware. Code that breaks out of a container and compromises the kernel can reach everything else on the machine, including the other containers.</p><p style="text-align: justify;">For most workloads, this risk is acceptable, since most workloads are cooperative. A team&#8217;s own microservices have no incentive to attack each other. However, for running strangers&#8217; build scripts at scale, the risk profile is different. A single kernel exploit in one customer&#8217;s build could reach every other customer&#8217;s build on the same machine, and the blast radius would be enormous.</p><p style="text-align: justify;">This is why standard container orchestration was a poor fit.</p><p style="text-align: justify;">Tools like Kubernetes assume cooperative tenants. Their default isolation is good enough for that case, but it is not designed to contain adversarial code. Adding hardening on top of Kubernetes was an option, but for a constraint as foundational as tenant isolation, building from primitives gave Vercel more leverage. Containers leave a gap that Vercel could not afford to leave open. 
The question was how to close that gap without giving up the speed that containers provide.</p><p style="text-align: justify;">See the diagram below that offers some insight:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PY5G!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PY5G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 424w, https://substackcdn.com/image/fetch/$s_!PY5G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 848w, https://substackcdn.com/image/fetch/$s_!PY5G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 1272w, https://substackcdn.com/image/fetch/$s_!PY5G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PY5G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png" width="1456" height="927" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:927,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:144844,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675721?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PY5G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 424w, https://substackcdn.com/image/fetch/$s_!PY5G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 848w, https://substackcdn.com/image/fetch/$s_!PY5G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 1272w, https://substackcdn.com/image/fetch/$s_!PY5G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa832b83e-9c73-416c-87d4-a6b5607507d4_2010x1280.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>MicroVMs and Firecracker</h2><p style="text-align: justify;">The traditional alternative to containers is the virtual machine.</p><p style="text-align: justify;">A virtual machine runs a complete operating system on top of a virtualization layer, which means two VMs on the same physical machine each have their own kernel. A kernel exploit in one VM cannot reach the other, since the kernels are genuinely separate. The downside is weight. A traditional VM might take 30 to 60 seconds to boot and consume hundreds of megabytes of memory just to exist. For a workload like web hosting, where a single VM runs for months, that overhead is fine. 
For a workload like running a 2-minute build and then throwing the environment away, it becomes wasteful.</p><p style="text-align: justify;">In November 2018, AWS released Firecracker, an open-source virtualization tool that strips a VM down to the minimum needed to run one short-lived workload.</p><p style="text-align: justify;">Firecracker microVMs boot in around 125 milliseconds and add only a few megabytes of memory overhead each. They provide VM-level isolation, with separate kernels and a hardware-enforced boundary that the CPU itself maintains, at something close to container-level speed. This combination occupies a corner of the isolation tradeoff space that was previously empty.</p><p style="text-align: justify;">AWS originally built Firecracker to power Lambda, where it now runs at production scale across millions of concurrent functions. That track record gave Vercel a battle-tested foundation rather than an experimental one.</p><p style="text-align: justify;">Vercel adopted Firecracker as the core of Hive. Each customer build runs in a microVM that Vercel calls a cell, and the relationship between cells and Firecracker processes is strictly one-to-one. Each Firecracker process manages exactly one cell, and each cell handles exactly one build. Inside the cell sits a container that runs the actual build script. The container handles packaging, since it carries all the build tools and dependencies the customer&#8217;s project needs. The microVM handles isolation, since it provides the kernel-level boundary that containers alone cannot. Each layer does what it is good at.</p><p style="text-align: justify;">This setup is the architectural answer to the trust problem.</p><p style="text-align: justify;">Vercel can now run a stranger&#8217;s code with confidence that, even if the code attempts something hostile, it cannot reach beyond the cell it is running in. 
The microVM is the wall, and the wall is enforced by the CPU&#8217;s virtualization features rather than by software alone. Firecracker provides the isolation primitive, while the rest of Hive is the machinery that turns one isolated cell into a system capable of running thousands of builds across the world.</p><h2>Inside Hive</h2><p style="text-align: justify;">Hive has a small vocabulary, since the names map directly to physical and logical pieces of the system.</p><ul><li><p style="text-align: justify;">A Hive is a regional cluster, and multiple Hives can exist in a single region.</p></li><li><p style="text-align: justify;">A Box is a physical machine inside a Hive.</p></li><li><p style="text-align: justify;">A Cell is a microVM running on a Box.</p></li><li><p style="text-align: justify;">The Control Plane is the brain of the cluster, and the API is the entry point that the rest of Vercel&#8217;s systems talk to.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4rvh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4rvh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 424w, https://substackcdn.com/image/fetch/$s_!4rvh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 848w, 
https://substackcdn.com/image/fetch/$s_!4rvh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 1272w, https://substackcdn.com/image/fetch/$s_!4rvh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4rvh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png" width="1456" height="974" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:974,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:193443,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675721?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4rvh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 424w, 
https://substackcdn.com/image/fetch/$s_!4rvh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 848w, https://substackcdn.com/image/fetch/$s_!4rvh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 1272w, https://substackcdn.com/image/fetch/$s_!4rvh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5280985-afc9-4b10-96b3-91807a497ad7_2010x1344.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Here is a different view of the same diagram:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Mmu8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Mmu8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 424w, https://substackcdn.com/image/fetch/$s_!Mmu8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 848w, https://substackcdn.com/image/fetch/$s_!Mmu8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!Mmu8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Mmu8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png" width="1456" height="902" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:902,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:106763,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675721?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Mmu8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 424w, https://substackcdn.com/image/fetch/$s_!Mmu8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 848w, https://substackcdn.com/image/fetch/$s_!Mmu8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!Mmu8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2c578de-cf87-4731-b4be-2c7bc887cf55_2070x1282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">When a customer pushes code, Vercel&#8217;s build pipeline (which is a separate system from Hive) decides which Hive to use based on the customer and the build configuration, and then calls that Hive&#8217;s API to request a cell. The Control Plane finds an available cell on one of the Boxes, hands it to the build pipeline, and the build runs inside the cell&#8217;s container. Once the build finishes, the cell is destroyed, and the resources return to the pool.</p><p style="text-align: justify;">Running multiple Hives per region is a deliberate failure isolation choice. If one Hive has a bad day, the others in the same region keep running. 
This is finer-grained reliability than running in multiple cloud regions alone, and it means a single bad deploy or infrastructure incident will not take out an entire customer base.</p><p style="text-align: justify;">Two background programs handle the orchestration inside each Box.</p><ul><li><p style="text-align: justify;">A box daemon runs on the physical machine and handles provisioning, spawning new Firecracker processes, and managing the lifecycle of cells.</p></li><li><p style="text-align: justify;">A cell daemon runs inside each microVM and manages the build container that does the actual work.</p></li></ul><p style="text-align: justify;">The two daemons communicate over a socket connection, which is how the orchestration layer scales without becoming a bottleneck. The pattern matters more than the implementation. Responsibility is split between the host machine and the virtual machine, and each side has a clear job.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Aue7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Aue7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 424w, https://substackcdn.com/image/fetch/$s_!Aue7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 848w, 
https://substackcdn.com/image/fetch/$s_!Aue7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!Aue7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Aue7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png" width="1456" height="802" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:802,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:160990,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675721?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Aue7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 424w, 
https://substackcdn.com/image/fetch/$s_!Aue7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 848w, https://substackcdn.com/image/fetch/$s_!Aue7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 1272w, https://substackcdn.com/image/fetch/$s_!Aue7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb234f5d3-0519-47ec-aabf-08824902cb8c_2328x1282.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Each cell receives dedicated CPU and memory, which means those resources are partitioned cleanly between cells on the same Box. Disk and network throughput, however, are rate-limited based on the Box&#8217;s overall capacity rather than dedicated.</p><p style="text-align: justify;">This reflects where multi-tenant isolation hits its practical limits. Some resources are easy to slice up, while others must be shared with quotas because slicing them cleanly would waste too much capacity.</p><p style="text-align: justify;">The other architectural choice is that these cells are ephemeral. Once a build completes, the cell is destroyed rather than reused, even though reusing it would be faster. This is a security choice rather than a performance one. A reused cell would create a path for one customer&#8217;s leftover state, whether that means files on disk, processes still running, or memory contents, to leak into another customer&#8217;s environment. Destroying the cell after every build closes that path entirely.</p><h2>The 90-to-5 Breakdown</h2><p style="text-align: justify;">All of this architecture would still be slow if every build had to spin up a fresh cell from scratch. Even with Firecracker, the cold path takes about 5 seconds per cell, since the system has to boot the microVM, mount the disk, load the build container image, and start the container.</p><p style="text-align: justify;">The 18x improvement comes from three places.</p><p style="text-align: justify;">The first layer is faster boots. Inside each Box, Vercel optimized the cell startup path itself. The build container image is large, so pulling it fresh on every cold start used to add significant time. 
Vercel now caches the build container image so it loads from a local copy rather than from a remote registry, and that change alone shaved around 45 seconds off VM startup times compared to their previous solution. They also use block device snapshotting, where the disk image of a freshly prepared cell is saved at a known-good moment, and new cells start from that saved copy instead of building up from scratch. These optimizations make the cold path itself dramatically faster.</p><p style="text-align: justify;">The second layer is the warm pool, and this is where most of the speedup happens. Vercel keeps a pool of cells already booted, with the build container image loaded, sitting idle and waiting. When a build comes in, it uses a warm cell and starts running immediately. The 5-second provisioning time only applies when the warm pool is empty, which happens during traffic spikes or for specialized builds like Secure Compute (an enterprise feature with stricter isolation requirements). For the common case, the wait is essentially zero. 
The warm pool means that most builds skip cold-start provisioning entirely.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NmhJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NmhJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 424w, https://substackcdn.com/image/fetch/$s_!NmhJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 848w, https://substackcdn.com/image/fetch/$s_!NmhJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 1272w, https://substackcdn.com/image/fetch/$s_!NmhJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NmhJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png" width="1456" height="898" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:898,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:180331,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675721?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NmhJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 424w, https://substackcdn.com/image/fetch/$s_!NmhJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 848w, https://substackcdn.com/image/fetch/$s_!NmhJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 1272w, https://substackcdn.com/image/fetch/$s_!NmhJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65cf9302-1666-41b6-971d-48220a5269f8_2216x1366.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The third layer is Firecracker&#8217;s baseline speed. None of the other features would matter if the underlying virtualization were heavy. Traditional VMs take tens of seconds to boot, which makes warm pools impractical at scale, since the pool would have to be enormous to keep up with demand. Firecracker boots in milliseconds, which is what makes warm pools and snapshotting work in the first place. The whole speed story rests on the foundation Vercel chose at the very beginning.</p><p style="text-align: justify;">Across all builds, Vercel saw a 30% improvement in build performance after switching to Hive. Builds that hit the cold path, where a fresh cell has to be spawned, saw build times drop by about 40%. The provisioning portion of those cold-path builds dropped from 90 seconds to 5. 
None of these numbers came from a single breakthrough. They came from compounding wins on a foundation that allowed them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZNfb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZNfb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 424w, https://substackcdn.com/image/fetch/$s_!ZNfb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 848w, https://substackcdn.com/image/fetch/$s_!ZNfb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 1272w, https://substackcdn.com/image/fetch/$s_!ZNfb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZNfb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png" width="1456" height="809" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:809,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:164959,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675721?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZNfb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 424w, https://substackcdn.com/image/fetch/$s_!ZNfb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 848w, https://substackcdn.com/image/fetch/$s_!ZNfb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 1272w, https://substackcdn.com/image/fetch/$s_!ZNfb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0759ce90-ef5f-4c89-85d5-754e43428684_2346x1304.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Costs and Tradeoffs</h2><p style="text-align: justify;">Warm pools come with a real cost.</p><p style="text-align: justify;">Keeping cells pre-warmed means Vercel pays for compute that does no useful work most of the time. The amount is substantial, and choosing the pool size is a constant balance between waste, where too many idle cells burn money, and tail latency, where too few warm cells leave customers waiting during traffic spikes. For a service with bursty traffic, this is an active operations problem rather than a one-time tuning exercise.</p><p style="text-align: justify;">Building Hive at all was the larger cost. Vercel could have used Kubernetes or ECS and added isolation on top, and they would have shipped something serviceable in a fraction of the time. 
Building from primitives required enormous engineering investment and carries an ongoing maintenance burden that off-the-shelf tools would have absorbed for them. The reason it was worth doing is that owning the substrate gave Vercel leverage that an opinionated platform would have made difficult to get. The team can make decisions like destroying every cell after every build, or tuning the warm pool based on customer build patterns, without working around someone else&#8217;s design.</p><p style="text-align: justify;">The payoff for that leverage is visible in product features. The same architecture that drove the 90-to-5 number also enabled Vercel to ship enhanced build machines for customers who need extra memory or disk, and to support Secure Compute for enterprise customers with stricter isolation requirements. These features would have been much harder to build on top of a third-party platform, since they would have required either accepting the platform&#8217;s constraints or fighting them at every step. Building the foundation paid off in product capabilities, not just performance numbers.</p><p style="text-align: justify;">This architecture makes sense for Vercel because they have hostile multi-tenancy at scale, and because builds are core to their business rather than incidental to it. For a team running its own builds for its own code on its own machines, microVMs would be wildly over-engineered, and containers are perfectly appropriate. The lesson is the connection between threat model and architecture, rather than a generic recommendation to always use microVMs.</p><h2>Conclusion</h2><p style="text-align: justify;">Vercel&#8217;s story illustrates a pattern that repeats across the industry. Threat model drives architecture. Cooperative tenants can run on containers and standard orchestration. Adversarial tenants require microVMs, isolated runtimes, or sandboxed execution environments. 
The shape follows from the assumption.</p><p style="text-align: justify;">The broader takeaway from Hive is that Vercel got faster by accepting a harder problem rather than working around it.</p><p style="text-align: justify;">Containers were inadequate for the trust model, microVMs were the right shape, and the speed came from stacking three separate optimizations on top of a deliberately harder foundation. Starting with the constraint and then optimizing for the speed proved to be a stronger bet than starting with the simple thing and trying to make it run quicker.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p style="text-align: justify;"><a href="https://vercel.com/blog/a-deep-dive-into-hive-vercels-builds-infrastructure">A deep dive into Vercel&#8217;s build infrastructure</a></p></li><li><p style="text-align: justify;"><a href="https://firecracker-microvm.github.io/">What is Firecracker</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[How CockroachDB Built Vector Indexing at Scale]]></title><description><![CDATA[In this article, we will look at how the CockroachDB engineering team built this index and the challenges they faced.]]></description><link>https://blog.bytebytego.com/p/how-cockroachdb-built-vector-indexing</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-cockroachdb-built-vector-indexing</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Mon, 25 May 2026 15:30:38 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!RPtz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/Redis_052526">Models are no longer the bottleneck. Agent context is. 
(Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/Redis_052526" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ap-L!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 424w, https://substackcdn.com/image/fetch/$s_!Ap-L!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 848w, https://substackcdn.com/image/fetch/$s_!Ap-L!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!Ap-L!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ap-L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2515752,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/Redis_052526&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675655?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ap-L!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 424w, https://substackcdn.com/image/fetch/$s_!Ap-L!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 848w, https://substackcdn.com/image/fetch/$s_!Ap-L!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 1272w, https://substackcdn.com/image/fetch/$s_!Ap-L!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c282789-365f-42f1-91d8-848bed3b0e07_2160x2160.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Most AI agents don&#8217;t fail because of the model. 
They fail because the context is broken&#8212;stale data, fragmented systems, slow retrieval.</p><p>Join Simba Khadder, Head of AI Product &amp; Director of Software Engineering at Redis, on June 10 to see how to turn scattered enterprise data into live, agent-ready context with Redis Iris.</p><p>You&#8217;ll learn:</p><ul><li><p>The four failure modes of how context breaks in production</p></li><li><p>How to make your enterprise data navigable for runtime</p></li><li><p>How Redis Context Retriever, Search, Data Integration, and Agent Memory work together</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/Redis_052526&quot;,&quot;text&quot;:&quot;Reserve your spot &#8594;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/Redis_052526"><span>Reserve your spot &#8594;</span></a></p><div><hr></div><p style="text-align: justify;">The CockroachDB team wanted to add vector search to their distributed database, and dozens of well-known algorithms already existed.</p><p style="text-align: justify;">To facilitate the decision-making process, they wrote down a list of architectural requirements, including a refusal to depend on any central coordinator, a refusal to allocate large in-memory caches, a need for real-time updates, an intolerance for hot spots, and a requirement to support sharding. Then they checked the list against the popular options.</p><p style="text-align: justify;">Most failed at least one requirement, and some failed several. 
The team&#8217;s response was to build something new, called C-SPANN, that satisfied every constraint by treating the index as ordinary table data inside CockroachDB rather than as a separate system.</p><p style="text-align: justify;">In this article, we will look at how the CockroachDB engineering team built this index and the challenges they faced.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the CockroachDB Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2>Vectors and Approximate Nearest Neighbor Search</h2><p style="text-align: justify;">A vector is a long list of numbers that captures the meaning of something.</p><p style="text-align: justify;">Modern neural networks like the ones behind ChatGPT can take an image, a document, or a snippet of audio and convert it into a vector of floating-point numbers, typically a few hundred to a few thousand dimensions long.</p><p style="text-align: justify;">The useful property of these vectors, often called embeddings, is that similar things produce similar vectors. 
For example, two photos of beaches end up close to each other in this multi-dimensional space, and a photo of a beach and the word &#8220;beach&#8221; end up in roughly the same neighborhood, which is what makes semantic search possible.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RPtz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RPtz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 424w, https://substackcdn.com/image/fetch/$s_!RPtz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 848w, https://substackcdn.com/image/fetch/$s_!RPtz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 1272w, https://substackcdn.com/image/fetch/$s_!RPtz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RPtz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png" width="1456" height="1008" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1008,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:171626,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675655?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RPtz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 424w, https://substackcdn.com/image/fetch/$s_!RPtz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 848w, https://substackcdn.com/image/fetch/$s_!RPtz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 1272w, https://substackcdn.com/image/fetch/$s_!RPtz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58352b2e-8406-420a-a97d-c11734237c85_2322x1608.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">The trick is finding those neighbors quickly when you have billions of vectors to search through.</p><p style="text-align: justify;">Traditional database indexes work because numbers and strings have a natural ordering. You can sort them, store them in a B-tree, and walk that tree to find what you want.</p><p style="text-align: justify;">Vectors do not have that property. Should beach photos come before or after food photos? What about photos of food at the beach? There is no answer, because the data has no inherent sequence, which means a B-tree cannot help you.</p><p style="text-align: justify;">The brute-force alternative is to compare your query vector against every stored vector and return the closest matches. 
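</p><p style="text-align: justify;">A minimal Python sketch of that brute-force scan, for illustration only. The function name <code>brute_force_knn</code>, the Euclidean metric, and the toy data are assumptions made here, not anything taken from CockroachDB. Every query touches every stored vector, so the cost grows linearly with the size of the collection.</p><pre>
```python
import heapq
import math

def brute_force_knn(query, vectors, k=2):
    """Exact search: measure the distance from the query to every stored vector."""
    # nsmallest keeps the k vectors with the smallest Euclidean distance, closest first.
    return heapq.nsmallest(k, vectors, key=lambda v: math.dist(query, v))

# Hypothetical 2-D "embeddings"; real ones have hundreds or thousands of dimensions.
stored = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.9, 1.2)]
nearest = brute_force_knn((1.0, 1.0), stored, k=2)
print(nearest)  # [(1.0, 1.0), (0.9, 1.2)]
```
</pre><p style="text-align: justify;">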
This works fine for a few thousand vectors, but falls apart somewhere in the tens of thousands, and becomes hopeless once you reach the millions.</p><p style="text-align: justify;">Vector indexes solve this by giving up on exact answers. They find approximate nearest neighbors, accepting a small loss of accuracy in exchange for orders of magnitude better performance. The results are usually close enough that real users cannot tell the difference, and the search runs fast enough to feel instant. That tradeoff between accuracy and speed is the foundation of every vector index, and the interesting engineering question is how you make the rest of the system work around it.</p><p style="text-align: justify;">Even with a good algorithm for finding nearest neighbors, plugging it into a distributed transactional database is its own problem. That is where the CockroachDB story actually begins.</p><h2>Architectural Constraints in a Distributed SQL Database</h2><p style="text-align: justify;">CockroachDB is a distributed SQL database. This means that the data lives across multiple machines, often across regions, and the system is designed to scale linearly. It guarantees transactional consistency and supports real-time updates, and all of this has to keep working when machines die, disks fail, or networks partition.</p><p style="text-align: justify;">These properties impose a set of architectural constraints on any new feature, and a vector index is no exception. The CockroachDB team wrote down six requirements that any candidate algorithm had to satisfy.</p><ul><li><p style="text-align: justify;">The first requirement is that no single node can act as a central coordinator. Any node in the cluster should be able to serve reads and writes, because relying on a single leader to direct traffic creates a bottleneck and a single point of failure.</p></li><li><p style="text-align: justify;">The second requirement is that the index cannot rely on large in-memory structures. 
Index state has to live in persistent storage, since the team could not assume every node has gigabytes of RAM available for caching vectors. They also wanted to avoid the long warm-up times that come with rebuilding in-memory caches after a restart, which matters especially for serverless deployments where nodes spin up and down on demand.</p></li><li><p style="text-align: justify;">The third requirement is that network hops have to stay minimal. Round-trips between nodes are expensive, and any algorithm that requires sequential traversal across the cluster will accumulate latency unpredictably.</p></li><li><p style="text-align: justify;">The fourth requirement is that the index data layout has to be sharding-compatible. Index data has to map naturally to CockroachDB&#8217;s key-value storage so that it can be split, merged, and rebalanced like any other table.</p></li><li><p style="text-align: justify;">The fifth requirement is that the index must avoid creating hot spots. As inserts and queries scale up, the load has to spread across the cluster, because concentrating traffic on a single node defeats the point of running a distributed system in the first place.</p></li><li><p style="text-align: justify;">The sixth requirement is that the index has to support incremental updates. Inserts and deletes need to be applied in real time without blocking queries, requiring batch rebuilds, or degrading search quality over time.</p></li></ul><p style="text-align: justify;">This list rules out the most popular vector indexes.</p><p style="text-align: justify;">HNSW, the graph-based algorithm that powers pgvector, Weaviate, and many other systems, is excellent on accuracy benchmarks but builds its graph in memory and resists sharding. Classic IVF is closer in spirit but assumes a single-node deployment and struggles with dynamic updates. 
Specialized vector databases like Pinecone solve these problems by being separate systems entirely, which works fine if you are willing to keep your vectors in one database and your transactional data in another.</p><p style="text-align: justify;">CockroachDB needed something that handled both inside the same system, with the same guarantees.</p><p style="text-align: justify;">Faced with this list, the team built something new. They called it C-SPANN, and the design choices that make it work are mostly about what it does not try to do.</p><h2>The C-SPANN Architecture</h2><p style="text-align: justify;">C-SPANN borrows ideas from three places.</p><p style="text-align: justify;">Microsoft&#8217;s SPANN paper contributed the tree structure for partitioning vectors, the follow-up SPFresh paper contributed techniques for incremental updates, and Google&#8217;s ScaNN project contributed ideas around quantization.</p><p style="text-align: justify;">The CockroachDB team combined these with their distributed SQL architecture to produce something none of the source papers describe directly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mNGz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mNGz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 424w, https://substackcdn.com/image/fetch/$s_!mNGz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 848w, 
https://substackcdn.com/image/fetch/$s_!mNGz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 1272w, https://substackcdn.com/image/fetch/$s_!mNGz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mNGz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png" width="1456" height="1079" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1079,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:137726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675655?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mNGz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 424w, 
https://substackcdn.com/image/fetch/$s_!mNGz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 848w, https://substackcdn.com/image/fetch/$s_!mNGz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 1272w, https://substackcdn.com/image/fetch/$s_!mNGz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8f7a20d-1d32-4e69-ab4a-caca3c50151b_2140x1586.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">At the core is a hierarchical K-means tree. Vectors are grouped into partitions based on similarity, where each partition typically contains dozens to hundreds of vectors and has a centroid that represents the average of the vectors it contains. Think of the centroid as the partition&#8217;s center of mass. Those centroids are themselves grouped into higher-level partitions with their own centroids, and that process repeats until you reach a single root partition at the top.</p><p style="text-align: justify;">The result is a wide, shallow tree. With a fanout of around 100, an index of one million vectors needs only three levels, and an index of ten billion vectors needs only five. Searching the tree means starting at the root, comparing the query vector to the centroids at that level, descending into the closest partition, and repeating until you reach the leaves. At each level, partitions can be processed in parallel, which keeps latency low and predictable. 
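</p><p style="text-align: justify;">The level counts and the descent can be sketched in a few lines of Python. This is an illustrative toy, not CockroachDB&#8217;s code: the dictionary layout, the names <code>levels_needed</code> and <code>search</code>, the <code>beam</code> parameter, and the Euclidean metric are all assumptions, and the loop runs sequentially where the real index can process partitions in parallel.</p><pre>
```python
import math

def levels_needed(num_vectors, fanout=100):
    """Count the tree levels required so that fanout**levels covers num_vectors."""
    levels, capacity = 0, 1
    while num_vectors > capacity:
        capacity *= fanout
        levels += 1
    return levels

def search(root, query, beam=1):
    """Descend from the root, keeping the `beam` closest partitions at each level."""
    frontier = [root]
    # Assumes a balanced tree: every node on a level is either internal or a leaf.
    while "children" in frontier[0]:
        children = [c for n in frontier for c in n["children"]]
        children.sort(key=lambda c: math.dist(c["centroid"], query))
        frontier = children[:beam]
    # At the leaves, scan the few remaining candidate vectors exactly.
    candidates = [v for leaf in frontier for v in leaf["vectors"]]
    return min(candidates, key=lambda v: math.dist(v, query))

# Hypothetical two-partition tree with 2-D vectors.
leaf_a = {"centroid": (0.0, 0.0), "vectors": [(0.1, 0.0), (-0.2, 0.1)]}
leaf_b = {"centroid": (5.0, 5.0), "vectors": [(5.1, 4.9), (4.8, 5.2)]}
root = {"children": [leaf_a, leaf_b]}

print(levels_needed(1_000_000), levels_needed(10_000_000_000))  # 3 5
print(search(root, (5.0, 5.0)))  # (5.1, 4.9)
```
</pre><p style="text-align: justify;">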
At the leaves, the system scans a few hundred candidate vectors using SIMD CPU instructions for speed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xwlf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xwlf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 424w, https://substackcdn.com/image/fetch/$s_!Xwlf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 848w, https://substackcdn.com/image/fetch/$s_!Xwlf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 1272w, https://substackcdn.com/image/fetch/$s_!Xwlf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xwlf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png" width="1456" height="891" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:891,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:170349,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675655?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xwlf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 424w, https://substackcdn.com/image/fetch/$s_!Xwlf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 848w, https://substackcdn.com/image/fetch/$s_!Xwlf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 1272w, https://substackcdn.com/image/fetch/$s_!Xwlf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc676be9-e09e-4f9d-991b-615876cf14d9_2606x1594.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">That much describes the algorithm. The interesting part is what happens to the data structure once it is built.</p><p style="text-align: justify;">Each partition is stored as a self-contained set of key-value rows inside CockroachDB. Partition data lives in CockroachDB ranges, which are the same units of storage that hold every other table in the database. Therefore, the index is not a parallel structure attached to the database. 
It is table data with extra meaning.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jEGM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jEGM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 424w, https://substackcdn.com/image/fetch/$s_!jEGM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 848w, https://substackcdn.com/image/fetch/$s_!jEGM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 1272w, https://substackcdn.com/image/fetch/$s_!jEGM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jEGM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png" width="1456" height="1110" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1110,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:227290,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675655?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jEGM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 424w, https://substackcdn.com/image/fetch/$s_!jEGM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 848w, https://substackcdn.com/image/fetch/$s_!jEGM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 1272w, https://substackcdn.com/image/fetch/$s_!jEGM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F221a2032-c5c8-4fa9-88dc-d6c57c7399e1_2382x1816.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">This decision pays dividends. CockroachDB already knows how to split a range when it grows too large, how to rebalance ranges across nodes when load shifts, and how to cache frequently accessed rows in its block cache.</p><p style="text-align: justify;">All of this machinery applies to vector index data automatically, without the team having to write a single line of new infrastructure code. When a new node joins the cluster, ranges containing index partitions get distributed to it the same way ranges containing user tables do. 
When a node restarts, the index is immediately ready to serve queries because it lives on disk, rather than in some warm-up cache that has to be rebuilt.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sqIR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sqIR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 424w, https://substackcdn.com/image/fetch/$s_!sqIR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 848w, https://substackcdn.com/image/fetch/$s_!sqIR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 1272w, https://substackcdn.com/image/fetch/$s_!sqIR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sqIR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png" width="1456" height="674" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:674,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102332,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675655?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sqIR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 424w, https://substackcdn.com/image/fetch/$s_!sqIR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 848w, https://substackcdn.com/image/fetch/$s_!sqIR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 1272w, https://substackcdn.com/image/fetch/$s_!sqIR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F53dfd128-605c-4837-a5b0-19c289b883d8_2732x1264.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">However, building the index is one thing. Keeping it healthy as data flows in and out is harder, especially when the index needs to be compressed aggressively to stay affordable.</p><h2>Index Maintenance, Quantization, and Multi-Tenant Partitioning</h2><p style="text-align: justify;">A K-means tree is not static. Partitions grow as vectors are inserted and shrink as vectors are deleted, so the system needs background machinery to keep partitions at a reasonable size and to keep vectors grouped with their nearest centroids.</p><p style="text-align: justify;">When a partition grows too large, C-SPANN splits it. 
A balanced variant of the K-means algorithm divides the vectors into two roughly equal groups, each with its own new centroid, and the tree is updated so that future inserts route to whichever new partition is closer. When a partition shrinks too small, the system merges it away and reassigns its vectors to neighboring partitions. Both operations happen in the background to avoid interfering with foreground transactions.</p><p style="text-align: justify;">There is one point worth noting.</p><p style="text-align: justify;">After a split, some vectors in the original partition might actually be closer to a neighboring partition&#8217;s centroid than to either of the two new centroids, and they get reassigned. Likewise, a vector in a neighboring partition might now be closer to one of the new centroids and migrate in. This idea, called nearest partition assignment, comes from the SPFresh paper and is what keeps the index accurate over time.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IXN9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IXN9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 424w, https://substackcdn.com/image/fetch/$s_!IXN9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 848w, 
https://substackcdn.com/image/fetch/$s_!IXN9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 1272w, https://substackcdn.com/image/fetch/$s_!IXN9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IXN9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png" width="1456" height="789" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/be62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:789,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:184162,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198675655?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IXN9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 424w, 
https://substackcdn.com/image/fetch/$s_!IXN9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 848w, https://substackcdn.com/image/fetch/$s_!IXN9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 1272w, https://substackcdn.com/image/fetch/$s_!IXN9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbe62192a-18ff-4780-b7a0-c571b7d62e03_2546x1380.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The consequence is that you can start with an empty table, insert millions of vectors, and end up with an accurate, well-balanced index without ever rebuilding it. The background maintenance handles everything incrementally.</p><p style="text-align: justify;">The second operational factor is size. An OpenAI embedding has 1,536 dimensions stored as 2-byte floats, which works out to about 3 KB per vector. A billion vectors at full precision is about 3 TB just for the embeddings, before any indexing overhead is counted. Storing and scanning that much data is expensive both in disk space and in the CPU and memory used during search.</p><p style="text-align: justify;">C-SPANN compresses vectors using a technique called RaBitQ, which reduces each dimension to a single bit. The compressed representation is roughly 200 bytes per vector (1,536 bits alone come to 192 bytes), a 94 percent size reduction. The math behind the compression involves a random orthogonal transform that preserves distances while spreading data evenly across dimensions.</p><p style="text-align: justify;">What matters for the system is that quantization is lossy, so distance estimates from compressed vectors are only approximate. C-SPANN compensates with a reranking step, where the system scans quantized vectors to assemble a candidate set, then fetches the original full-precision vectors for those candidates to compute exact distances. By reranking a candidate set larger than the final result size, the system can absorb quantization error and still return accurate results. The pattern of cheap approximate filtering followed by precise refinement on a small candidate set shows up in many other systems too, and recognizing it here makes it easier to spot elsewhere.</p><p style="text-align: justify;">The third operational reality is multi-tenancy. 
In most real applications, vectors belong to someone, whether a user, a customer, or an organization, and most queries are scoped to a single owner. Mixing one user&#8217;s vectors with another&#8217;s during search is wasteful, and it is also a security problem.</p><p style="text-align: justify;">CockroachDB handles this through prefix columns on the vector index. Here is what the schema looks like.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;plaintext&quot;,&quot;nodeId&quot;:&quot;bf631e44-30a9-48b3-93cf-c71703f7491c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-plaintext">CREATE TABLE photos (
  id UUID PRIMARY KEY,
  user_id UUID,
  embedding VECTOR(1536),
  VECTOR INDEX (user_id, embedding)
);</code></pre></div><p>A query for one user&#8217;s nearest matches uses pgvector-compatible syntax, where &lt;-&gt; is the Euclidean (L2) distance operator.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sql&quot;,&quot;nodeId&quot;:&quot;ea4ff9ee-c7fc-48c5-adbe-360ca87e232a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sql">SELECT id
FROM photos
WHERE user_id = $1
ORDER BY embedding &lt;-&gt; $2
LIMIT 10;</code></pre></div><p style="text-align: justify;">Behind the scenes, the index maintains a separate K-means tree for each distinct user. Performance scales with how many vectors the user owns, rather than with the total size of the index, so an index of a billion vectors split evenly across a million users behaves, from any one user&#8217;s perspective, like an index of about a thousand vectors.</p><p style="text-align: justify;">Combined with CockroachDB&#8217;s REGIONAL BY ROW tables, prefix columns can also partition the index by geography. For example, a user in Europe gets their data and their index entries stored in a European region, with fast local access and compliance with data domiciling requirements, while the same table serves a US user with equally low latency from US infrastructure. Placing region and ownership ahead of the embedding as prefix columns produces an index that is efficient, secure, and locality-aware by default.</p><h2>Conclusion</h2><p style="text-align: justify;">C-SPANN refuses several compromises that most vector databases quietly accept.</p><p style="text-align: justify;">Freshness in CockroachDB is real-time and transactional rather than batched or eventually consistent, which means a vector inserted in a transaction becomes searchable as soon as that transaction commits, with the same consistency guarantees as any other write. Scaling is native to the distributed architecture rather than a feature retrofitted onto a single-node system, and vectors live alongside transactional data in the same database, inside the same queries, under the same operational umbrella. Since the index lives on disk, nodes serve queries immediately after a restart without any warm-up phase.</p><p style="text-align: justify;">In return, the team accepted some real limitations. The 25.2 release is a preview, and several optimizations are still being built, including fuller SIMD usage, root partition caching, and complete merge support. 
The current implementation supports only Euclidean distance, with cosine and inner product on the roadmap. Filtering on non-prefix columns is limited today, though that scope is expanding. Also, on raw vector search benchmarks against specialized in-memory systems, C-SPANN does not win on pure latency.</p><p style="text-align: justify;">The tradeoff suggests where this design fits and where it does not. CockroachDB&#8217;s vector index is a strong choice for applications where vectors and transactional data need to coexist, where multi-tenant isolation matters, and where multi-region deployment with data domiciling is a requirement. Specialized vector databases remain a better fit for pure vector workloads with no transactional component, for read-heavy batch-updated datasets where freshness is not a concern, and for cases where every microsecond of search latency is critical.</p><p style="text-align: justify;">The architectural logic underneath all of this is worth keeping in mind. The CockroachDB team treated the vector index as ordinary table data and inherited their existing distributed machinery for free, so splits, caching, sharding, replication, and multi-region behavior all worked from day one because they already worked for everything else in the database. The algorithm is the part that gets the headlines, but the integration is what makes the system possible.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p><a href="https://www.cockroachlabs.com/blog/cspann-real-time-indexing-billions-vectors/">Real-Time Indexing for Billions of Vectors: How we built fast, fresh vector indexing at scale in CockroachDB</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Vector_database">Vector databases</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[EP216: RAGs vs Agents]]></title><description><![CDATA[Ask an LLM about your company's data and it will guess. 
The two patterns that fix this are RAG and agents, and they solve different problems.]]></description><link>https://blog.bytebytego.com/p/ep216-rags-vs-agents</link><guid isPermaLink="false">https://blog.bytebytego.com/p/ep216-rags-vs-agents</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Sat, 23 May 2026 15:31:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!eLEi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/QAWolf_052326Headline">Map workflows, automate E2E tests, and ship faster with QA Wolf (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/QAWolf_052326CTA" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jXWJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 424w, https://substackcdn.com/image/fetch/$s_!jXWJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 848w, https://substackcdn.com/image/fetch/$s_!jXWJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 1272w, https://substackcdn.com/image/fetch/$s_!jXWJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!jXWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png" width="1200" height="628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6039af28-3c33-437b-8459-b826572b02cf_1200x628.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:90284,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/QAWolf_052326CTA&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198874402?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jXWJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 424w, https://substackcdn.com/image/fetch/$s_!jXWJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 848w, https://substackcdn.com/image/fetch/$s_!jXWJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 1272w, 
https://substackcdn.com/image/fetch/$s_!jXWJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6039af28-3c33-437b-8459-b826572b02cf_1200x628.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><a href="https://go.bytebytego.com/QAWolf_052326QAWolf">QA Wolf&#8217;s</a> AI agent maps and tests your app&#8217;s most complex user flows. 
It turns your prompts into real Playwright and Appium code that runs 12x faster and more reliably than other computer-use agents.</p><p>What sets our AI apart:</p><ul><li><p>Maps <strong>200+ test cases in minutes</strong> instead of weeks of manual planning.</p></li><li><p>Executes tests <strong>12x faster</strong> than computer-use agents.</p></li><li><p>Runs entire suites <strong>100% parallel</strong> with consistent results.</p></li><li><p>Produces open-source tests your team owns, with <strong>zero vendor lock-in</strong>.</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/QAWolf_052326CTA&quot;,&quot;text&quot;:&quot;Get started today&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/QAWolf_052326CTA"><span>Get started today</span></a></p><div><hr></div><p>This week&#8217;s system design refresher:</p><ul><li><p>RAGs vs Agents</p></li><li><p>Build with Claude Code: New Cohort Launch</p></li><li><p>Forward Proxy, Reverse Proxy, and API Gateway Explained</p></li><li><p>How does a request actually travel through Claude Code?</p></li><li><p>How does Claude Code keep long sessions from running out of context?</p></li></ul><div><hr></div><h2>RAGs vs Agents</h2><p>Ask an LLM about your company's data and it will guess. 
The two patterns that fix this are RAG and agents, and they solve different problems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eLEi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eLEi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!eLEi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!eLEi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!eLEi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eLEi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png" width="1456" height="1760" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!eLEi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!eLEi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!eLEi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!eLEi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52ccd4c2-dc53-4b9a-95b9-d87d6c353f88_2484x3002.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>RAGs: RAG systems combine LLMs with retrieval to ground answers in 4 steps.</p><ul><li><p>Step 1: The user query is embedded and sent to a retrieval step.</p></li><li><p>Step 2: Retrieval pulls the most relevant chunks from a knowledge base (PDFs, wikis, etc.).</p></li><li><p>Step 3: Those chunks are pasted into the prompt as context.</p></li><li><p>Step 4: The LLM writes the answer, grounded in the retrieved text.</p></li></ul><p>One retrieval. One generation. Cheap, predictable, and easy to debug.</p><p>Agents: Agents wrap LLMs in a reasoning loop with tools to take action.</p><ul><li><p>Step 1: The user query goes into the agent runtime, which is a reasoning loop wrapped around an LLM.</p></li><li><p>Step 2: The LLM reads the goal and picks a tool (Read, Write, Edit, Bash, etc.).</p></li><li><p>Step 3: The runtime executes the tool and feeds the result back to the LLM. 
</p></li><li><p>Step 4: The LLM reasons again, picks the next tool, and loops until the task is done.</p></li></ul><p>More flexible. More tokens. Harder to debug because errors drift across steps.</p><p>The rule of thumb: Use RAG when the answer lives in your documents. Use an agent when the answer requires action on other systems.</p><p>Over to you: When do you prefer RAG over an agent?</p><div><hr></div><h2>Build with Claude Code: New Cohort Launch</h2><p>We&#8217;re launching a new 2-day intensive, cohort-based course called Build with Claude Code, taught by John Kim, who has trained hundreds of engineers at Meta to use Claude Code in real production workflows.</p><p><strong>The course starts soon on May 28.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/claude-c1-substack&quot;,&quot;text&quot;:&quot;Check it out now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/claude-c1-substack"><span>Check it out now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/claude-c1-substack" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Ym7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 848w, 
https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png" width="1456" height="1801" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1801,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:854119,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/claude-c1-substack&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198752484?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!0Ym7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 424w, 
https://substackcdn.com/image/fetch/$s_!0Ym7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"></figcaption></figure></div><p>A few things you&#8217;ll learn:</p><ul><li><p>The agentic loop, context engineering, and memory layers that make Claude Code useful for real projects</p></li><li><p>How to build with Claude Code Skills, MCPs, and hooks to give Claude the tools and feedback loops it needs to self-correct</p></li><li><p>Parallel development with Git worktrees, subagents, and agent teams</p></li><li><p>A capstone project where you ship something real on your own stack</p></li></ul><p>The course includes live sessions, assignments, and office hours, so there&#8217;s plenty of room to ask questions and get unstuck.</p><p>The first cohort starts in just a few days: May 28 to 29, 2026. If you want to learn everything from the fundamentals of Claude Code to advanced production workflows, including working with large codebases, this could be a great way to level up.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/claude-c1-substack&quot;,&quot;text&quot;:&quot;Check it out now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/claude-c1-substack"><span>Check it out now</span></a></p><div><hr></div><h2>Forward Proxy, Reverse Proxy, and API Gateway Explained</h2><p>People mix these up all the time, since they all sit between a client and a server. 
The real difference is which side they represent and what problem they solve.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cqW0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cqW0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!cqW0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!cqW0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!cqW0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cqW0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png" width="1456" height="1760" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!cqW0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!cqW0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!cqW0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!cqW0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2e0de719-d0ac-497e-8f9c-b9c841422db8_2484x3002.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>A forward proxy sits next to the client. Your laptop sends a request, the proxy forwards it out, and the destination never sees your real IP. Corporate networks use this to enforce policy, block sites, and cache traffic.</p><p>A reverse proxy sits next to the server. The client has no idea how many machines are behind it. The proxy decides who handles the request, terminates TLS, and keeps your backend off the public internet. NGINX and HAProxy are commonly used here, typically paired with a load balancer in front.</p><p>An API gateway is a reverse proxy that does more than route traffic. It also handles auth, rate limits, API keys, versioning, and request shaping. 
Without it, each microservice has to implement its own version of validation, throttling logic, and request logging.</p><p>A forward proxy represents the client, a reverse proxy represents the server, and an API gateway is what you add when ten services need the same authentication and rate limiting rules applied consistently.</p><p>In most real systems, all three are running at different layers. The forward proxy filters outbound traffic, the reverse proxy fronts the application servers, and the API gateway sits in front of your APIs to enforce policies before requests reach them.</p><p>Over to you: What's your proxy + gateway combo? Always interesting to see what teams pair together.</p><div><hr></div><h2>How does a request actually travel through Claude Code?</h2><p>Most of us type a prompt and watch the magic happen. The diagram below shows what's really going on behind the curtain, based on the Claude Code source code.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VfjL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VfjL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VfjL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!VfjL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VfjL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VfjL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg" width="1280" height="1546" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1546,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;graphical user interface, application&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="graphical user interface, application" title="graphical user interface, application" srcset="https://substackcdn.com/image/fetch/$s_!VfjL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 424w, https://substackcdn.com/image/fetch/$s_!VfjL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!VfjL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!VfjL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38f19060-e838-440a-971d-bdc1f2d3a167_1280x1546.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Let's trace one real request: "Fix the failing test in auth.test.ts."</p><ul><li><p>Step 1: The user sends a prompt to Claude Code 
through their interface.</p></li><li><p>Step 2: The interface (CLI, IDE, or SDK) wraps the prompt with repo and file context and hands it to the agent loop as a request.</p></li><li><p>Step 3: The agent loop plans the next move and proposes an action: Edit(auth.ts, lines 42&#8211;58).</p></li><li><p>Step 4: The permission system checks the proposed action against the rules.</p></li><li><p>Step 5: The approved action becomes a tool call: Edit(auth.ts, patch), dispatched to the matching tool.</p></li><li><p>Step 6: The tool runs in the execution environment (shell, cloud, or sandbox) as a real syscall.</p></li><li><p>Step 7: The execution returns a tool result back to the agent loop.</p></li><li><p>Step 8: The agent persists the turn to state and streams the final message to the user.</p></li></ul><p>The whole system is just this loop, repeated until the model stops asking for tools.</p><p>Over to you: which step in this loop do you think is the hardest one to get right when building your own coding agent?</p><div><hr></div><h2>How does Claude Code keep long sessions from running out of context?</h2><p>It uses 5 strategies, run in sequence before every model call. 
Each one only runs if the previous doesn&#8217;t free enough room.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fap_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fap_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 424w, https://substackcdn.com/image/fetch/$s_!fap_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 848w, https://substackcdn.com/image/fetch/$s_!fap_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 1272w, https://substackcdn.com/image/fetch/$s_!fap_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fap_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png" width="1378" height="1390" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1390,&quot;width&quot;:1378,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fap_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 424w, https://substackcdn.com/image/fetch/$s_!fap_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 848w, https://substackcdn.com/image/fetch/$s_!fap_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 1272w, https://substackcdn.com/image/fetch/$s_!fap_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc111a2c6-71bd-45a7-8938-e8bb4f772cbc_1378x1390.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><ol><li><p>Budget Reduction: caps individual tool results. Oversized outputs are swapped for a content reference.</p></li><li><p>Snip: trims the oldest history segments and emits a boundary marker.</p></li><li><p>Microcompact: prunes tool turns by tool_use_id so the prompt cache stays warm.</p></li><li><p>Context Collapse: a read-time projection over the full history.</p></li><li><p>Auto-compact: the last resort. 
It calls the model to produce a full summary of prior turns.</p></li></ol><p>The pattern is lazy degradation: apply the least disruptive shaper first, and escalate only when cheaper layers prove insufficient.</p><p>Over to you: how often do you run out of context?</p>]]></content:encoded></item><item><title><![CDATA[Build with Claude Code: New Cohort Launch]]></title><description><![CDATA[The first cohort starts in about a week: May 28-29, 2026.]]></description><link>https://blog.bytebytego.com/p/build-with-claude-code-new-cohort</link><guid isPermaLink="false">https://blog.bytebytego.com/p/build-with-claude-code-new-cohort</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Fri, 22 May 2026 15:31:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!0Ym7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We&#8217;re launching a new 2-day intensive, cohort-based course called Build with Claude Code, taught by John Kim, who has trained hundreds of engineers at Meta to use Claude Code in real production workflows.</p><p><strong>The course starts soon on May 28.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/claude-c1-substack&quot;,&quot;text&quot;:&quot;Check it out now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/claude-c1-substack"><span>Check it out now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/claude-c1-substack" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!0Ym7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png" width="1456" height="1801" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1801,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:854119,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/claude-c1-substack&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/198752484?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Ym7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!0Ym7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60dca5aa-afd4-4f29-9233-3b5d82c84154_2360x2920.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a><figcaption class="image-caption"></figcaption></figure></div><p>A few things you&#8217;ll learn:</p><ul><li><p>The agentic loop, context engineering, and memory layers that make Claude Code useful for real projects</p></li><li><p>How to build with Claude Code Skills, MCPs, and hooks to give Claude the tools and feedback loops it needs to self correct</p></li><li><p>Parallel development with Git worktrees, subagents, and agent teams</p></li><li><p>A capstone project where you ship something real on your own stack</p></li></ul><p>The course includes live sessions, assignments, and office hours, so there&#8217;s plenty of room to ask questions and get unstuck.</p><p>The first cohort starts in just a few days: May 28 to 29, 2026. 
If you want to learn everything from the fundamentals of Claude Code to advanced production workflows, including working with large codebases, this could be a great way to level up.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/claude-c1-substack&quot;,&quot;text&quot;:&quot;Check it out now&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/claude-c1-substack"><span>Check it out now</span></a></p>]]></content:encoded></item><item><title><![CDATA[A Guide to Async Patterns in API Design]]></title><description><![CDATA[In this article, we will look at each of these patterns in detail, along with their advantages.]]></description><link>https://blog.bytebytego.com/p/a-guide-to-async-patterns-in-api</link><guid isPermaLink="false">https://blog.bytebytego.com/p/a-guide-to-async-patterns-in-api</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Thu, 21 May 2026 15:30:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!LYCw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p style="text-align: justify;">The default model for client-server communication is request-response. The client sends a request, the server returns a response, and the exchange is complete. This covers most of what software needs to do, including an enormous range of practical scenarios in modern web applications.</p><p style="text-align: justify;">It doesn&#8217;t handle everything. Some work takes too long to complete inside a single request. Some events happen on the server&#8217;s schedule, not the client&#8217;s. Some interactions are continuous rather than one-shot. 
And some messages need to outlive the moment they were sent.</p><p style="text-align: justify;">Async API patterns are the techniques engineers use for these cases, and the list has grown over the years. It now includes short polling, long polling, server-sent events, WebSockets, webhooks, async APIs with status polling, message queues, and GraphQL subscriptions. Each has its own design and preferred use case. What they share is one purpose, which is to extend what&#8217;s possible beyond a single HTTP request and response.</p><p style="text-align: justify;">In this article, we will look at each of these patterns in detail, along with their advantages. We&#8217;ll start by looking at where request-response stops fitting and then walk through each pattern in turn.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LYCw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LYCw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 424w, https://substackcdn.com/image/fetch/$s_!LYCw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 848w, https://substackcdn.com/image/fetch/$s_!LYCw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 1272w, 
https://substackcdn.com/image/fetch/$s_!LYCw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LYCw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png" width="1456" height="1698" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1698,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:336210,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197567426?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LYCw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 424w, https://substackcdn.com/image/fetch/$s_!LYCw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 848w, 
https://substackcdn.com/image/fetch/$s_!LYCw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 1272w, https://substackcdn.com/image/fetch/$s_!LYCw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76f47049-9f51-40f2-a453-794a63b7d322_2250x2624.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2 style="text-align: justify;">Request-Response and Its Limits</h2>
      <p>
          <a href="https://blog.bytebytego.com/p/a-guide-to-async-patterns-in-api">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[How Netflix is Using Multimodal AI to Power Video Search]]></title><description><![CDATA[In this article, we will understand how Netflix built this system and the challenges it faced.]]></description><link>https://blog.bytebytego.com/p/how-netflix-is-using-multimodal-ai</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-netflix-is-using-multimodal-ai</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Wed, 20 May 2026 15:31:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!HhTb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><h2><a href="https://go.bytebytego.com/Orkes_052026">Build Durable Agents With Open Source Frameworks (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/Orkes_052026" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6PTJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 424w, https://substackcdn.com/image/fetch/$s_!6PTJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 848w, https://substackcdn.com/image/fetch/$s_!6PTJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 1272w, 
https://substackcdn.com/image/fetch/$s_!6PTJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6PTJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png" width="1456" height="2060" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2060,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1382409,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/Orkes_052026&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814603?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6PTJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 424w, https://substackcdn.com/image/fetch/$s_!6PTJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 848w, 
https://substackcdn.com/image/fetch/$s_!6PTJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 1272w, https://substackcdn.com/image/fetch/$s_!6PTJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6b4884c-57e8-43b2-8f42-49acf41e45ea_2480x3508.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Most AI agents work in demos &#8212; but fail in production. 
Learn how to build durable, enterprise-ready AI agents with open-source frameworks using Orkes Agentspan and Conductor. This whitepaper explores how to orchestrate long-running, fault-tolerant agent workflows with built-in governance, observability, retries, and human approvals. See how Agentspan compares to LangGraph, CrewAI, and AutoGen for real-world enterprise AI systems. If you&#8217;re building AI workflows that need reliability, scale, and control, this guide shows the architecture patterns that make production-grade agents possible.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/Orkes_052026&quot;,&quot;text&quot;:&quot;Download the Whitepaper&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/Orkes_052026"><span>Download the Whitepaper</span></a></p><div><hr></div><p style="text-align: justify;">A single season of a Netflix show can generate over 2,000 hours of raw footage. At 30 frames per second, that works out to 216 million frames.</p><p style="text-align: justify;">When a film editor needs to find the exact moment where a specific character says a specific line in a specific location, they&#8217;re facing one of the hardest search problems in all of software engineering. And the solution has surprisingly little to do with building a better AI model. The real challenge, it turns out, is plumbing.</p><p style="text-align: justify;">Netflix editorial teams used to lose days searching for specific moments buried in raw production footage. For example, a director might need every shot of a character in a particular setting. A marketing team might want the five most visually striking action sequences across an entire franchise. Finding these moments meant manually scrubbing through thousands of hours of material. 
Creative momentum would stall as a result.</p><p style="text-align: justify;">The team that solved this problem built something that looks simple from the outside: just a search bar. But underneath it sits a three-stage pipeline that orchestrates an ensemble of AI models, fuses their outputs across a shared timeline, and serves hybrid text-and-vector queries at sub-second latency.</p><p style="text-align: justify;">When those multiple AI models run over the same footage, the baseline of 216 million frames explodes into billions of multi-layered data points. Storing, aligning, and intersecting that volume while maintaining sub-second query performance goes well beyond what any traditional database can handle alone.</p><p style="text-align: justify;">In this article, we will look at how Netflix built this system and the challenges it faced.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Netflix Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2 style="text-align: justify;">Why Multiple Models</h2><p style="text-align: justify;">Why would Netflix run multiple AI models over the same footage instead of relying on one powerful model that does everything?</p><p style="text-align: justify;">The reason is that specialized models tend to outperform generalists at their particular tasks. A model trained specifically for face recognition will identify characters more accurately than a general-purpose vision model. A model tuned for scene classification will map environments more precisely. A dialogue transcription model will capture speech more reliably.</p><p style="text-align: justify;">Netflix therefore runs an ensemble of specialists. One model recognizes characters. Another classifies scenes and environments. A third transcribes dialogue. A fourth detects objects. 
Each model is excellent at its job, but each one also produces a fundamentally different kind of output.</p><p style="text-align: justify;">For example, a character recognition model might output a text label like &#8220;Joey.&#8221; A scene classification model, in contrast, produces a 512-dimensional vector embedding: a list of numbers that represents the mathematical &#8220;meaning&#8221; of a scene in a way that machines can compare. A dialogue model outputs timestamped transcript text. These are entirely different data types, and they require different search strategies to query.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QNla!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QNla!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 424w, https://substackcdn.com/image/fetch/$s_!QNla!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 848w, https://substackcdn.com/image/fetch/$s_!QNla!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 1272w, 
https://substackcdn.com/image/fetch/$s_!QNla!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QNla!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png" width="1456" height="471" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:471,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:237198,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814603?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QNla!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 424w, https://substackcdn.com/image/fetch/$s_!QNla!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 848w, 
https://substackcdn.com/image/fetch/$s_!QNla!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 1272w, https://substackcdn.com/image/fetch/$s_!QNla!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F41dd6c5a-08b2-40d5-aec8-816dd7e40ef4_3540x1146.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: <a 
href="https://netflixtechblog.com/powering-multimodal-intelligence-for-video-search-3e0020cf1202">Netflix Engineering Blog</a></figcaption></figure></div><p style="text-align: justify;">The format problem is only half the challenge. Each model also slices the video into different, overlapping time intervals. The character model might detect &#8220;Joey&#8221; from seconds 2 through 8. The scene model might detect &#8220;kitchen&#8221; from seconds 4 through 9. There is no shared timeline across models. The intervals overlap, but they don&#8217;t align.</p><p style="text-align: justify;">That left the engineering team with one core challenge.</p><p style="text-align: justify;">How do you take all these different outputs, produced at different time resolutions, in different formats, and merge them into one searchable index?</p><p style="text-align: justify;">Netflix is also exploring a fundamentally different approach to this problem: a single unified foundation model called MediaFM, which handles audio, video, and text together. Whether the future favors specialized ensembles or unified generalists remains an open question in multimodal AI. But for now, the production system relies on a three-stage pipeline that treats each concern separately.</p><h2 style="text-align: justify;">The Three-Stage Pipeline</h2><p style="text-align: justify;">The transition from raw model output to searchable intelligence follows a decoupled, three-stage process.</p><p style="text-align: justify;">Each stage handles one concern and one concern only. 
This separation is the architectural backbone of the entire system, and it exists because coupling any two stages would create bottlenecks at Netflix&#8217;s scale.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LgqA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LgqA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 424w, https://substackcdn.com/image/fetch/$s_!LgqA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 848w, https://substackcdn.com/image/fetch/$s_!LgqA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 1272w, https://substackcdn.com/image/fetch/$s_!LgqA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LgqA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png" width="1456" height="1353" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1353,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:171992,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814603?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LgqA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 424w, https://substackcdn.com/image/fetch/$s_!LgqA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 848w, https://substackcdn.com/image/fetch/$s_!LgqA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 1272w, https://substackcdn.com/image/fetch/$s_!LgqA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6e50ac1-703b-4327-89b4-bc1ce8eac38e_2086x1938.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<h3 style="text-align: justify;">Stage 1: Transactional Persistence</h3><p style="text-align: justify;">Raw annotations from all models are ingested and stored in Apache Cassandra, a distributed database optimized for high-speed write throughput. This stage strictly prioritizes data integrity. Every piece of model output is captured with zero transformation and stored exactly as the model produced it.</p><p style="text-align: justify;">Why keep this stage separate from everything that follows?</p><p style="text-align: justify;">If the system tried to process or fuse data during ingestion, the heavy computation would slow down real-time intake. 
Decoupling ensures that no matter how many models are running or how much data they produce, the ingestion layer keeps up.</p><h3 style="text-align: justify;">Stage 2: Offline Data Fusion</h3><p style="text-align: justify;">Once raw data is safely persisted, an event triggers an asynchronous processing job. This offline fusion layer is the architectural heart of the system. It handles the heavy computational work outside the real-time path, so complex data intersections never interfere with ongoing ingestion.</p><p style="text-align: justify;">The key technique here is temporal bucketing.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HhTb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HhTb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 424w, https://substackcdn.com/image/fetch/$s_!HhTb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 848w, https://substackcdn.com/image/fetch/$s_!HhTb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 1272w, https://substackcdn.com/image/fetch/$s_!HhTb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HhTb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png" width="1456" height="996" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:996,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:163782,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814603?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HhTb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 424w, https://substackcdn.com/image/fetch/$s_!HhTb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 848w, https://substackcdn.com/image/fetch/$s_!HhTb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 1272w, 
https://substackcdn.com/image/fetch/$s_!HhTb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb1d6d38-6031-4c49-aab9-5b176e76830f_2534x1734.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">The pipeline normalizes all model outputs by mapping them into fixed one-second intervals. This unfolds in three steps:</p><ul><li><p style="text-align: justify;">First is bucket mapping. Continuous detections get segmented into discrete one-second intervals. 
If the character model detects &#8220;Joey&#8221; from seconds 2 through 8, the pipeline maps that continuous span into seven distinct one-second buckets, one for each second from 2 through 8 inclusive.</p></li><li><p style="text-align: justify;">Second is annotation intersection. When multiple models produce annotations for the same one-second bucket, the system fuses them into a single comprehensive record. If &#8220;Joey&#8221; and &#8220;kitchen&#8221; both appear in the bucket covering second 4 to second 5, they get merged into one record that says &#8220;Joey is in a kitchen during this specific second of footage.&#8221;</p></li><li><p style="text-align: justify;">Third is optimized persistence. These enriched, fused records are written back to Cassandra as distinct entities. The result is a second-by-second index of multi-modal intersections, precisely associating every fused annotation with its source video asset.</p></li></ul><p style="text-align: justify;">One important detail makes this process incremental.</p><p style="text-align: justify;">The pipeline uses upsert operations: it updates an existing record if one is found and inserts a new one otherwise, using a composite key that combines the asset ID and the time bucket as the unique identifier.</p><p style="text-align: justify;">If a temporal bucket already exists for a specific second of video, perhaps populated by an earlier model run, the system updates the existing record rather than creating a duplicate. This establishes a single source of truth for every second of footage, and it means the system gracefully handles new models being added over time.</p><p style="text-align: justify;">The one-second bucket size is itself a meaningful design decision.</p><p style="text-align: justify;">Smaller buckets mean finer temporal precision but dramatically more records. At one-second resolution, a 2,000-hour archive produces 7.2 million buckets (2,000 hours &#215; 3,600 seconds per hour), each potentially containing multiple annotations from multiple models. 
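</p><p style="text-align: justify;">As a rough sketch of the bucket mapping, intersection, and upsert steps described above, the Python snippet below fuses two model outputs in memory. It is an illustration under stated assumptions, not Netflix&#8217;s implementation: <code>to_buckets</code>, <code>upsert</code>, and the <code>asset-1</code> ID are hypothetical names, a plain dictionary stands in for Cassandra, and spans are treated as inclusive whole seconds so that seconds 2 through 8 yield seven buckets.</p><pre>
```python
# Hypothetical in-memory sketch; a dict stands in for the Cassandra table.


def to_buckets(start_s: int, end_s: int) -> list:
    """Return the one-second bucket indices for an inclusive span of seconds."""
    return list(range(start_s, end_s + 1))


def upsert(index: dict, asset_id: str, bucket: int, model: str, value: str) -> None:
    """Update the record keyed by (asset_id, bucket), creating it if absent."""
    record = index.setdefault((asset_id, bucket), {})
    record[model] = value


index = {}

# Character model: "Joey" detected from seconds 2 through 8.
for bucket in to_buckets(2, 8):
    upsert(index, "asset-1", bucket, "character", "Joey")

# Scene model: "kitchen" detected from seconds 4 through 9.
for bucket in to_buckets(4, 9):
    upsert(index, "asset-1", bucket, "scene", "kitchen")

print(len(to_buckets(2, 8)))   # 7
print(index[("asset-1", 4)])   # {'character': 'Joey', 'scene': 'kitchen'}
print(2000 * 3600)             # 7200000
```
</pre><p style="text-align: justify;">Because every write goes through the same <code>(asset_id, bucket)</code> key, a later model run adds to an existing record instead of creating a duplicate. The last line reproduces the bucket count for a 2,000-hour archive. 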
Netflix chose one second as the balance point between precision and manageability.</p><h3 style="text-align: justify;">Stage 3: Indexing for Real-Time Search</h3><p style="text-align: justify;">Once the enriched temporal buckets are persisted, a subsequent event triggers their ingestion into Elasticsearch, the system&#8217;s query engine.</p><p style="text-align: justify;">Each temporal bucket is structured as a nested document:</p><ul><li><p style="text-align: justify;">The parent level captures the overarching asset context, including the asset ID, the movie ID, and the time interval.</p></li><li><p style="text-align: justify;">Child documents within it house the specific multi-modal annotations like character data, scene embeddings, and dialogue text. This hierarchical structure is what makes cross-annotation queries possible.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n9Ki!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n9Ki!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 424w, https://substackcdn.com/image/fetch/$s_!n9Ki!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 848w, https://substackcdn.com/image/fetch/$s_!n9Ki!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 1272w, 
https://substackcdn.com/image/fetch/$s_!n9Ki!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n9Ki!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png" width="1456" height="793" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:793,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:73337,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814603?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n9Ki!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 424w, https://substackcdn.com/image/fetch/$s_!n9Ki!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 848w, 
https://substackcdn.com/image/fetch/$s_!n9Ki!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 1272w, https://substackcdn.com/image/fetch/$s_!n9Ki!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f535115-2834-43c2-8fef-f48c6e907c0e_1950x1062.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: <a 
href="https://netflixtechblog.com/powering-multimodal-intelligence-for-video-search-3e0020cf1202">Netflix Engineering Blog</a></figcaption></figure></div><p style="text-align: justify;">When a user searches for &#8220;Joey in the kitchen,&#8221; Elasticsearch can match the character annotation and the scene annotation within the same parent bucket in a single query.</p><h2 style="text-align: justify;">Two Kinds of Search with One Result</h2><p style="text-align: justify;">With a fused, indexed timeline in place, the system is ready for the part users actually see.</p><p style="text-align: justify;">When a query arrives, the system runs a three-step preprocessing phase before touching the index.</p><ul><li><p style="text-align: justify;">Query type detection dynamically categorizes the request to route it down the most efficient retrieval path.</p></li><li><p style="text-align: justify;">Filter extraction isolates semantic constraints like character names or environmental contexts to narrow the candidate pool before expensive computation begins.</p></li><li><p style="text-align: justify;">Lastly, vector transformation converts the raw text query into high-dimensional, model-specific embeddings for semantic matching.</p></li></ul><p style="text-align: justify;">The system then compiles this structured plan into an optimized Elasticsearch query and executes it against the pre-fused temporal buckets.</p><h3 style="text-align: justify;">Hybrid Search</h3><p style="text-align: justify;">A query like &#8220;Joey in the kitchen&#8221; requires two fundamentally different kinds of matching.</p><p style="text-align: justify;">&#8220;Joey&#8221; is a proper noun that demands exact keyword matching. 
&#8220;Kitchen&#8221; is a semantic concept that benefits from vector similarity search, where the system measures the distance between the query embedding and the scene embeddings stored in the index.</p><p style="text-align: justify;">Keyword search alone would miss scenes labeled with related terms, while vector search alone would struggle with proper nouns and exact phrases. Combining the two is called hybrid search, and it typically outperforms either approach in isolation.</p><p style="text-align: justify;">Netflix gives users fine-grained control over this hybrid engine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZLYy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZLYy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 424w, https://substackcdn.com/image/fetch/$s_!ZLYy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 848w, https://substackcdn.com/image/fetch/$s_!ZLYy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 1272w, https://substackcdn.com/image/fetch/$s_!ZLYy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ZLYy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png" width="1456" height="1353" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1353,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:150318,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814603?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZLYy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 424w, https://substackcdn.com/image/fetch/$s_!ZLYy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 848w, https://substackcdn.com/image/fetch/$s_!ZLYy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 1272w, https://substackcdn.com/image/fetch/$s_!ZLYy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c5e6f3e-99b8-44a6-a647-996f81767045_2086x1938.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">They can toggle between exact k-Nearest Neighbor search, which guarantees the mathematically closest matches but is computationally expensive, and Approximate Nearest Neighbor algorithms that trade a small amount of accuracy for significantly faster results on massive datasets.</p><p style="text-align: justify;">They can choose between distance metrics like cosine similarity and Euclidean distance, because different models shape their vector spaces differently, and what counts as &#8220;close&#8221; depends on how the model was trained. 
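</p><p style="text-align: justify;">To make the difference between the two metrics concrete, here is a minimal Python sketch. It is not Netflix&#8217;s code: the two-dimensional vectors and the labels &#8220;scene_a&#8221; and &#8220;scene_b&#8221; are made up for illustration, and real embeddings have far more dimensions. It only shows that the same query can have a different nearest neighbor depending on the metric.</p>

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (higher means closer)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (lower means closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy embeddings, chosen so the two metrics disagree.
query = [1.0, 0.0]
scenes = {
    "scene_a": [10.0, 0.0],  # same direction as the query, but much longer
    "scene_b": [0.8, 0.6],   # similar length to the query, but rotated
}

closest_by_cosine = max(
    scenes, key=lambda name: cosine_similarity(query, scenes[name])
)
closest_by_euclidean = min(
    scenes, key=lambda name: euclidean_distance(query, scenes[name])
)

print("cosine picks:", closest_by_cosine)
print("euclidean picks:", closest_by_euclidean)
```

<p style="text-align: justify;">Cosine similarity looks only at direction and ignores vector length, so it picks &#8220;scene_a&#8221;. Euclidean distance is sensitive to length, so it picks &#8220;scene_b&#8221;. This is one reason the metric is usually chosen to match how the embedding model was trained.</p><p style="text-align: justify;">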
Users can also set confidence thresholds, meaning minimum score boundaries that filter out low-probability matches so that creative teams only review results meeting a high standard of relevance.</p><h3 style="text-align: justify;">Dialogue Search and Text Analysis</h3><p style="text-align: justify;">For searches involving specific lines of speech, the system applies a layered text analysis strategy.</p><p style="text-align: justify;">Phrase matching handles imperfect human memory through a configurable &#8220;slop&#8221; parameter, which controls how far apart the search terms can appear and still count as a match. For example, if a user searches for a line from Stranger Things but remembers the wording slightly wrong, the system can still find the right scene.</p><p style="text-align: justify;">Search-as-you-type functionality, powered by indexing partial word fragments at ingest time, surfaces frame-accurate results the moment an editor begins typing.</p><p style="text-align: justify;">Linguistic stemming across multiple languages ensures that &#8220;running&#8221; matches scenes tagged with &#8220;run&#8221; or &#8220;ran,&#8221; collapsing grammatical variations into a single search intent. Fuzzy matching tolerates character-level typos and misspellings, which accounts for transcription errors so that high-value shots are not lost to minor data imperfections.</p><h3 style="text-align: justify;">Curating the Results</h3><p style="text-align: justify;">Raw search results need post-processing before they&#8217;re useful.</p><p style="text-align: justify;">The system uses custom aggregations to cluster outputs, such as isolating the top five most relevant clips of an actor per episode. This prevents any single asset from dominating results and combats the fatigue that comes with reviewing hundreds of near-identical frames. 
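</p><p style="text-align: justify;">The per-episode cap described above can be sketched in plain Python. This illustrates the grouping idea only and is not the Elasticsearch aggregation Netflix runs; the field names &#8220;episode&#8221;, &#8220;clip&#8221;, and &#8220;score&#8221; and the function name top_clips_per_episode are hypothetical.</p>

```python
from collections import defaultdict


def top_clips_per_episode(hits, limit=5):
    """Group hits by episode and keep the `limit` highest-scoring per group."""
    by_episode = defaultdict(list)
    for hit in hits:
        by_episode[hit["episode"]].append(hit)
    return {
        episode: sorted(group, key=lambda h: h["score"], reverse=True)[:limit]
        for episode, group in by_episode.items()
    }


# Hypothetical raw results: one episode has many matches, another has few.
hits = [
    {"episode": "S1E1", "clip": "c1", "score": 0.91},
    {"episode": "S1E1", "clip": "c2", "score": 0.97},
    {"episode": "S1E1", "clip": "c3", "score": 0.42},
    {"episode": "S1E2", "clip": "c4", "score": 0.88},
]

top = top_clips_per_episode(hits, limit=2)
for episode, clips in top.items():
    print(episode, [clip["clip"] for clip in clips])
```

<p style="text-align: justify;">With a limit of two, the lowest-scoring clip of the crowded episode is dropped, while the episode with a single match still appears in the output.</p><p style="text-align: justify;">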
The temporal reconstruction layer converts internal bucket boundaries back into natural scene boundaries, so editors see coherent scene-level results rather than arbitrary one-second slices.</p><p style="text-align: justify;">The system also provides two result modes depending on the query intent. Union mode returns the full span of all matching annotations, prioritizing breadth and capturing any instance where a specified feature appears. Intersection mode returns only the exact overlapping duration where all criteria co-occur, prioritizing precision. The choice between them lies with the user.</p><h2 style="text-align: justify;">What This Architecture Costs</h2><p style="text-align: justify;">Every architectural choice Netflix made carries a tradeoff, and the team was deliberate about which prices they were willing to pay.</p><p style="text-align: justify;">Offline fusion means new content goes through a delay before it becomes fully searchable across all modalities. Netflix chose throughput over real-time freshness because the alternative, fusing data during ingestion, would bottleneck the entire pipeline.</p><p style="text-align: justify;">For a production archive that grows by thousands of hours, that tradeoff makes sense. However, for a system that needs instant searchability, it would be the wrong call.</p><p style="text-align: justify;">The toggle between exact and approximate nearest neighbor search is a direct precision-versus-speed tradeoff. Exact k-NN guarantees the mathematically best matches but becomes computationally expensive at scale. Approximate methods are faster but accept the possibility of missing some relevant results. 
Netflix surfaces this tradeoff to users rather than choosing for them.</p><p style="text-align: justify;">And the ensemble approach itself is a bet.</p><p style="text-align: justify;">Running multiple specialized models and fusing their outputs through a three-stage pipeline produces excellent per-task accuracy, but it adds significant infrastructure complexity.</p><p style="text-align: justify;">A single unified model would simplify the architecture at the potential cost of accuracy on specialized tasks. The fact that Netflix is simultaneously investing in both approaches, the ensemble pipeline and the MediaFM foundation model, suggests this tradeoff remains genuinely unsettled.</p><h2 style="text-align: justify;">Conclusion</h2><p style="text-align: justify;">The current system implemented by the Netflix engineering team is the first phase of a larger vision. Three planned evolutions stand out:</p><ul><li><p style="text-align: justify;">First is natural language discovery. The system currently accepts structured query payloads, but the goal is to move toward fluid, conversational interfaces where an editor could type something like &#8220;Find the best tracking shots of Tom Holland running on a roof&#8221; and get results without needing to understand the underlying query structure.</p></li><li><p style="text-align: justify;">Second is adaptive ranking. By building machine learning feedback loops that analyze how editorial teams interact with and select clips, the system would gradually self-tune its mathematical definition of relevance. The search engine would get better over time, learning from actual usage patterns rather than relying on static scoring algorithms.</p></li><li><p style="text-align: justify;">Third is domain-specific personalization. A team cutting high-action marketing trailers has different relevance criteria than a team editing narrative scenes or conducting deep archival research. 
The system would dynamically calibrate search weights and retrieval behaviors to match the user&#8217;s context.</p></li></ul><p style="text-align: justify;">These extensions point toward a broader ambition of evolving from a search engine into an intelligent creative partner.</p><p style="text-align: justify;">However, the current system already teaches a valuable architectural lesson.</p><p style="text-align: justify;">When multiple AI models produce different kinds of data about the same underlying entity, the hardest engineering lies in the fusion layer. The models themselves are important, but the pipeline that persists, aligns, and indexes their outputs is what makes the whole system work.</p><p style="text-align: justify;"><strong>Reference:</strong></p><ul><li><p style="text-align: justify;"><a href="https://netflixtechblog.com/powering-multimodal-intelligence-for-video-search-3e0020cf1202">Synchronizing the Senses: Powering Multimodal Intelligence for Video Search</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[How Snapchat Serves a Billion Predictions Per Second]]></title><description><![CDATA[For Snap, machine learning is closer to the product itself than a feature on top of it.]]></description><link>https://blog.bytebytego.com/p/how-snapchat-serves-a-billion-predictions</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-snapchat-serves-a-billion-predictions</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Tue, 19 May 2026 15:31:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!hoM-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong><a href="https://go.bytebytego.com/You_051926">New Year, New Metrics: Evaluating AI Search in the Agentic Era (Sponsored)</a></strong></h2><div 
class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/You_051926" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8ZPR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 424w, https://substackcdn.com/image/fetch/$s_!8ZPR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 848w, https://substackcdn.com/image/fetch/$s_!8ZPR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 1272w, https://substackcdn.com/image/fetch/$s_!8ZPR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8ZPR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1432342,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/You_051926&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/183299050?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8ZPR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 424w, https://substackcdn.com/image/fetch/$s_!8ZPR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 848w, https://substackcdn.com/image/fetch/$s_!8ZPR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 1272w, https://substackcdn.com/image/fetch/$s_!8ZPR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e07ebdc-da60-480c-874b-162a215a186b_1600x840.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Most teams pick a search provider by running a few test queries and hoping for the best &#8211; a recipe for hallucinations and unpredictable failures. 
<a href="https://go.bytebytego.com/You_050526">This technical guide</a> from <a href="https://go.bytebytego.com/You_050526">You.com</a> gives you access to an exact framework to evaluate AI search and retrieval.</p><p><strong>What you&#8217;ll get:</strong></p><ul><li><p>A four-phase framework for evaluating AI search</p></li><li><p>How to build a golden set of queries that predicts real-world performance</p></li><li><p>Metrics and code for measuring accuracy</p></li></ul><p>Go from &#8220;looks good&#8221; to proven quality.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/You_051926&quot;,&quot;text&quot;:&quot;Learn how to run an eval&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/You_051926"><span>Learn how to run an eval</span></a></p><div><hr></div><p style="text-align: justify;">Snapchat decides what to show you in roughly 100 milliseconds. In that window, the system has to retrieve a few hundred candidate videos from a corpus of millions, fetch dozens of features about the user and dozens about each candidate, run a deep learning model over every pair, and rank the results. Now consider how this scales when more than 470 million people open the app every day.</p><p style="text-align: justify;">The interesting question is how the system stays fast at this scale. The deeper question is what shape the system has to have to be fast at all.</p><p style="text-align: justify;">Snapchat started in 2011 as an ephemeral messaging app where photos disappeared after a few seconds, and it has since grown into a full social platform with Discover, Spotlight, augmented reality lenses, friend suggestions, and an ad business that funds most of the company. Snap reported 946 million monthly active users in late 2025, with about 474 million of them opening the app every day. 
India is the largest market with over 214 million users, followed by the United States with around 104 million, while France, the Gulf countries, and other regions make up the rest of a global footprint that spans every major social media market.</p><p style="text-align: justify;">For Snap, machine learning is closer to the product itself than a feature on top of it. Every session forces the system to make four kinds of decisions.</p><ul><li><p style="text-align: justify;">The first is which content should appear in your Discover and Spotlight feeds.</p></li><li><p style="text-align: justify;">The second is which ads should win the auction for your attention.</p></li><li><p style="text-align: justify;">The third is which people should appear in your friend suggestions.</p></li><li><p style="text-align: justify;">The fourth is which AR lenses and effects should surface for you.</p></li></ul><p style="text-align: justify;">Each decision is shaped by an ML model, each one happens in milliseconds, and each one can be wrong in expensive ways. A bad ad ranking costs revenue directly, while a bad recommendation costs engagement, which costs future revenue.</p><p style="text-align: justify;">All of this runs on a single platform called Bento. In this article, we&#8217;ll look at how the Snap engineering team built Bento and the challenges they faced along the way.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Snap Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2 style="text-align: justify;">The Shape of a Ranking Workload</h2><p style="text-align: justify;">A typical web request is roughly symmetric. One request arrives, the server queries a database, builds a response, and sends it back. The shape is one-to-one.</p><p style="text-align: justify;">A ranking request is asymmetric. One user opens the app, and the platform has to decide what to show them out of millions of options. 
Internally, this single request expands into hundreds or thousands of (user, candidate) pairs, each of which needs a score from a model. After scoring, the system ranks the candidates and returns the top few to the user. A single request comes in, hundreds or thousands of model evaluations happen, and a short ranked list goes back out.</p><p style="text-align: justify;">This expansion is what shapes Bento. Almost every architectural decision in the platform is an answer to the question of how to absorb this fanout.</p><p style="text-align: justify;">In practice, the work is split across two stages.</p><ul><li><p style="text-align: justify;">The first stage is retrieval, where cheap models filter the full corpus down to a few hundred to a few thousand candidates worth scoring.</p></li><li><p style="text-align: justify;">The second stage is ranking, where expensive models score those candidates carefully and produce the final order.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VjbY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VjbY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 424w, https://substackcdn.com/image/fetch/$s_!VjbY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 848w, 
https://substackcdn.com/image/fetch/$s_!VjbY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 1272w, https://substackcdn.com/image/fetch/$s_!VjbY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VjbY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png" width="1456" height="628" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:628,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:129429,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814482?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VjbY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 424w, 
https://substackcdn.com/image/fetch/$s_!VjbY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 848w, https://substackcdn.com/image/fetch/$s_!VjbY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 1272w, https://substackcdn.com/image/fetch/$s_!VjbY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9dbcf485-afa8-48c2-a72f-8233c72e0a89_3486x1504.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">Snap&#8217;s ad ranking system follows this pattern explicitly. Light models filter the eligible ad inventory, heavy models predict the probability of conversion and engagement, an auction picks the winner, and the winning ad is served. The user&#8217;s response to that ad, whether they click, dismiss, or watch, then flows back into the training data.</p><p style="text-align: justify;">The math gets large quickly. If hundreds of millions of users each trigger a few ranking requests per session, and each request scores hundreds of candidates, the total prediction volume crosses a billion per second. Snap reports that exact number, along with 1 TB per second of feature reads and 10 trillion events per day flowing through the feature pipelines.</p><p style="text-align: justify;">This design creates four distinct kinds of pressure on the platform:</p><ul><li><p style="text-align: justify;">Latency pressure comes from the simple fact that users will abandon the app if a feed takes too long to load.</p></li><li><p style="text-align: justify;">Scale pressure comes from the sheer prediction volume itself.</p></li><li><p style="text-align: justify;">Freshness pressure comes from the requirement that a user who just liked a video should immediately see the system respond to that signal.</p></li><li><p style="text-align: justify;">Iteration pressure comes from the need for ML engineers to ship hundreds of experiments per month to keep the models competitive.</p></li></ul><p style="text-align: justify;">These pressures pull in different directions, since latency wants smaller models, scale wants cheaper compute, freshness wants real-time pipelines, and iteration wants flexible tooling. The point of Bento is to make all four tractable at the same time.</p><p style="text-align: justify;">The platform splits cleanly into two halves. 
One half produces models, while the other half serves them. Almost all of the unusual engineering lives in the second half.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hoM-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hoM-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 424w, https://substackcdn.com/image/fetch/$s_!hoM-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 848w, https://substackcdn.com/image/fetch/$s_!hoM-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 1272w, https://substackcdn.com/image/fetch/$s_!hoM-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!hoM-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png" width="1456" height="905" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:905,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:456939,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814482?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hoM-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 424w, https://substackcdn.com/image/fetch/$s_!hoM-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 848w, https://substackcdn.com/image/fetch/$s_!hoM-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 1272w, https://substackcdn.com/image/fetch/$s_!hoM-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe96aafc0-9a55-47ba-824b-701de0e4f318_4348x2704.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<h2 style="text-align: justify;">The Training Pipeline</h2><p style="text-align: justify;">The training half of Bento follows a familiar four-stage workflow.</p><p style="text-align: justify;">Training data is generated from raw events and aggregated features. The model trains on GPU or TPU hosts, the trained model is evaluated against held-out data, and finally, the model is exported into a form ready for serving.
Bento orchestrates these stages using Kubeflow, an open-source workflow engine built for ML pipelines.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9QDm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9QDm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 424w, https://substackcdn.com/image/fetch/$s_!9QDm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 848w, https://substackcdn.com/image/fetch/$s_!9QDm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 1272w, https://substackcdn.com/image/fetch/$s_!9QDm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9QDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png" width="1456" height="552" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:552,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:61182,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814482?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9QDm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 424w, https://substackcdn.com/image/fetch/$s_!9QDm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 848w, https://substackcdn.com/image/fetch/$s_!9QDm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 1272w, https://substackcdn.com/image/fetch/$s_!9QDm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b48ee52-0b0e-429b-94b7-4d44f2148828_2860x1084.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">The interesting design choice in this half is how Snap structures the training code itself. Rather than letting every team write its own model code from scratch, Bento splits training applications into three layers.</p><ul><li><p style="text-align: justify;">The Core framework is a shared internal library built on TensorFlow and Keras that standardizes common patterns for ranking and recommendation models.</p></li><li><p style="text-align: justify;">User model code is what an individual ML engineer writes to express their specific model.</p></li><li><p style="text-align: justify;">Training configuration is a YAML file that specifies the hardware, the input data, and the runtime options.</p></li></ul><p style="text-align: justify;">This layering is what makes a high volume of experiments practical.
An engineer can change one line in the configuration to swap input datasets, or modify a few lines of model code to test a new feature, and trigger a full training run. The shared Core framework means experiments are comparable, because they share the same scaffolding, while the configuration layer means experiments are cheap to launch.</p><p style="text-align: justify;">The model export step is where Bento does something different. Modern recommendation models have an unusual computational shape. They include large embedding tables, where each user ID or video ID maps to a learned vector, and these tables are bound by memory size. They also include dense neural network layers on top of those embeddings, which are bound by compute capacity. Running both on the same hardware wastes one resource or the other.</p><p style="text-align: justify;">Bento&#8217;s export step splits the compute graph, putting dense matrix multiplication on the GPU and embedding lookups along with feature parsing on the CPU. In other words, the same trained model produces different exported versions for different inference hardware.</p><p style="text-align: justify;">Models go through this process repeatedly rather than once. Bento fully automates incremental training, where new events are continuously appended to the training data, models retrain on the updated data, and new versions deploy automatically after passing validation. 
As a result, a model in production can be materially different from the same model a week earlier.</p><p style="text-align: justify;">This whole process produces a trained model that is ready for serving, which is where the harder problem begins.</p><h2 style="text-align: justify;">The Serving Path</h2><p style="text-align: justify;">The serving half of Bento is where the asymmetric workload from earlier becomes a set of concrete engineering problems.</p><p style="text-align: justify;">A request comes in, features have to be fetched, the model has to score hundreds or thousands of candidates, and the result has to come back within the latency budget. Each of these steps presents its own challenges, and Bento&#8217;s design reflects opinionated choices about how to handle them.</p><p style="text-align: justify;">The most consequential of those choices involves the feature store, which sits between the offline world where models are trained and the online world where they are served.</p><h2 style="text-align: justify;">The Feature Store Split</h2><p style="text-align: justify;">A feature is a numerical model input derived from raw data. A simple example is the number of videos a user watched in the last 24 hours. More complex features might involve embeddings learned during training, statistical aggregations over time windows, or counts grouped by various keys. A model takes dozens or hundreds of features as input and produces a prediction.</p><p style="text-align: justify;">The challenge is that features have to exist in two places at once.</p><ul><li><p style="text-align: justify;">Offline, where models are trained, features live in large analytical tables, and Snap uses Apache Iceberg, an open table format, for this purpose.</p></li><li><p style="text-align: justify;">Online, where models are served, features live in a fast key-value store optimized for low-latency reads.</p></li></ul><p style="text-align: justify;">These two stores must agree with each other.
If the same feature is computed differently in the two places, the model will train on one distribution and serve on a different one, which produces a class of bugs called training-serving skew. The symptom is a model that performs well in offline evaluation and poorly in production. This problem is a central operational concern for mature ML teams, and it is rarely covered in tutorials.</p><p style="text-align: justify;">Snap&#8217;s feature platform is called Robusta, and it is built on Apache Spark.</p><p style="text-align: justify;">Robusta is responsible for keeping the two stores in sync. It processes 10 trillion events per day, computes aggregated features over sliding time windows, and writes results to both the offline Iceberg store and the online key-value store. The online feature store alone holds 800 TB of data and serves 1 TB per second of reads.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VwV3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VwV3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 424w, https://substackcdn.com/image/fetch/$s_!VwV3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 848w, https://substackcdn.com/image/fetch/$s_!VwV3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 1272w, 
https://substackcdn.com/image/fetch/$s_!VwV3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VwV3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png" width="1456" height="961" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:961,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:206488,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814482?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VwV3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 424w, https://substackcdn.com/image/fetch/$s_!VwV3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 848w, 
https://substackcdn.com/image/fetch/$s_!VwV3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 1272w, https://substackcdn.com/image/fetch/$s_!VwV3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50a931a8-1d64-41be-b97a-442cd9a6889f_2838x1874.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2>Two Strategies for High Fanout</h2><p style="text-align: justify;">The asymmetric workload from earlier becomes a concrete problem 
at the feature layer. A single ranking request needs features for one user and many candidate documents, and fetching all those features over the network would be too slow.</p><p style="text-align: justify;">Bento uses two different strategies depending on the use case.</p><p style="text-align: justify;">The first strategy is unusual. For many ranking workloads, Snap collocates document features directly on the inference engine instances. When a request arrives, the system performs one user feature lookup from the central online store and forwards the request to inference. The inference engine then reads document features from local memory during scoring, which eliminates network fanout entirely. The tradeoff is that each inference instance has to hold the full document feature corpus in memory, which is expensive. At Snap&#8217;s scale, the math works out, since the latency reduction and cost savings outweigh the memory cost. At smaller scales, this approach would be wasteful, and a remote feature store is the standard answer.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f6kf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f6kf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 424w, https://substackcdn.com/image/fetch/$s_!f6kf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 848w, 
https://substackcdn.com/image/fetch/$s_!f6kf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 1272w, https://substackcdn.com/image/fetch/$s_!f6kf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f6kf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png" width="1456" height="1089" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1089,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:149376,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814482?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f6kf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 424w, 
https://substackcdn.com/image/fetch/$s_!f6kf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 848w, https://substackcdn.com/image/fetch/$s_!f6kf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 1272w, https://substackcdn.com/image/fetch/$s_!f6kf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a35d48a-f438-4555-966d-55dec6a153c4_2372x1774.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">The second strategy handles cases where the document corpus is too large to fit on every inference instance. For these, Snap built a separate Retrieval service that performs Approximate Nearest Neighbor search, which is a fast similarity search over learned embeddings, along with inverted index lookups and forward index lookups in a single pass. The Retrieval service returns a small, pre-hydrated candidate set with features attached, ready to be sent to the inference engine.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a-sQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a-sQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 424w, https://substackcdn.com/image/fetch/$s_!a-sQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 848w, https://substackcdn.com/image/fetch/$s_!a-sQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 1272w, https://substackcdn.com/image/fetch/$s_!a-sQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!a-sQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png" width="1456" height="1370" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1370,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:121920,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814482?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!a-sQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 424w, https://substackcdn.com/image/fetch/$s_!a-sQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 848w, https://substackcdn.com/image/fetch/$s_!a-sQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 1272w, https://substackcdn.com/image/fetch/$s_!a-sQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe399c4ed-78d9-4f43-a6c6-ca1776960495_1928x1814.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">Both strategies are valid, and the choice between them depends on corpus size, access patterns, and how much memory is acceptable on inference instances.</p><h2>What Makes The Inference Engine Cheap</h2><p style="text-align: justify;">Once a request reaches the model, several optimizations make the actual scoring fast and economical. Two of these optimizations are worth understanding in detail.</p><p style="text-align: justify;">The first is the GPU and CPU compute graph split mentioned earlier. 
The model export step produces a hardware-specific version that places dense matrix math on the GPU and embedding lookups, along with feature parsing, on the CPU. This split avoids two waste patterns at once. Putting embeddings on the GPU wastes scarce GPU memory on lookup tables, while putting dense math on the CPU wastes time on operations a GPU could parallelize easily. The split costs more engineering effort during export, but it pays back many times over during serving.</p><p style="text-align: justify;">The second is data plane optimization, and this is the most striking specific result in the entire blog. Bento&#8217;s engineers found that a large fraction of inference latency was being spent on serialization and deserialization of feature data, rather than on the model itself. They redesigned the inference APIs so that features could be fetched and transferred as raw bytes, with deserialization happening only inside the inference engine. Combined with custom Protobuf optimizations, this change resulted in 2x lower latency and 10x cheaper data plane costs. The lesson is that at scale, the boring machinery of the system, including serialization, RPC framing, and network transport, often dominates the cost rather than the model itself.</p><p style="text-align: justify;">Other optimizations exist as well. Request batching, model co-location across inference fleets, and build-level tuning for the underlying hardware each contribute incrementally to performance and cost. The two optimizations described above carry most of the lesson.</p><p style="text-align: justify;">Once the prediction is made and the ranked feed is returned, the system&#8217;s job continues. The most important part of an ML platform happens after the response goes out.</p><h2 style="text-align: justify;">The Feedback Loop</h2><p style="text-align: justify;">Every prediction Bento makes is logged, along with the features used to make it. 
The user actions taken in response are also logged, including whether they watched the video, dismissed the ad, or sent a friend request. These logs flow back into the training data pipeline, where new training records are generated. Incremental training then runs on the updated data, new model versions are exported, and after passing validation, the deployment system rolls out the new versions while older versions retire.</p><p style="text-align: justify;">Two kinds of monitoring run continuously alongside this loop:</p><ul><li><p style="text-align: justify;">The first watches statistical properties of features and predictions over time. If the mean of a feature drifts, or the distribution of predictions shifts, the change is often a signal that something upstream has broken.</p></li><li><p style="text-align: justify;">The second kind of monitoring compares online and offline behavior directly. The same prediction is recomputed offline using the offline feature store, and the result is compared to what the online system produced.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q3ga!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q3ga!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 424w, https://substackcdn.com/image/fetch/$s_!q3ga!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 848w, 
https://substackcdn.com/image/fetch/$s_!q3ga!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 1272w, https://substackcdn.com/image/fetch/$s_!q3ga!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q3ga!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png" width="1456" height="1364" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1364,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:136451,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814482?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q3ga!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 424w, 
https://substackcdn.com/image/fetch/$s_!q3ga!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 848w, https://substackcdn.com/image/fetch/$s_!q3ga!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 1272w, https://substackcdn.com/image/fetch/$s_!q3ga!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef3388e9-bddd-4d14-8c01-41b7f187d0e2_2142x2006.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">Any discrepancy points to the train-serve skew problem from earlier.</p><p style="text-align: justify;">The deployment control plane uses a reconciliation pattern borrowed from Kubernetes. The system stores a desired state describing which models should be deployed, in what configuration, and on which fleets, and it continuously compares this desired state to the actual running state. Any differences are reconciled automatically. This approach is what keeps ML deployments safe at this volume, since manual deployment at this scale would be too error-prone to be viable.</p><p style="text-align: justify;">Snap&#8217;s blog mentions that over a recent two-year period, ranking model size grew 20x and training data grew 40x. The platform absorbed this growth in the course of normal operation. That kind of scaling headroom is what a feedback loop buys you. The platform is less a fixed pipeline that produces a model and more a continuously running system that produces a stream of model versions, each one shaped by the data the previous version generated.</p><h2>Conclusion</h2><p style="text-align: justify;">Bento is built around a single observation about the work it does. Ranking requests are asymmetric, since one user request expands into hundreds or thousands of model evaluations before collapsing back into a short ranked list. 
This asymmetry, multiplied by 474 million daily users, creates four operational pressures around latency, scale, freshness, and iteration, and it drives almost every architectural decision in the platform.</p><p style="text-align: justify;">The platform handles the work in two halves.</p><p style="text-align: justify;">The training half generates models through a four-stage workflow, with a layered code structure that lets engineers run hundreds of experiments per day, and a model export step that splits the compute graph between GPU and CPU to match the unusual computational shape of recommendation models.</p><p style="text-align: justify;">The serving half handles the harder operational problems, including the dual existence of features in offline and online stores, the high-fanout problem solved by either feature collocation or a dedicated Retrieval service, and the inference-time optimizations that produced 2x lower latency and 10x lower data plane costs. Around all of this runs a continuous feedback loop that turns each prediction into the next round of training data, with monitoring that watches for drift and a deployment control plane that reconciles desired and running state automatically.</p><p style="text-align: justify;">Bento operates at a large scale, including hundreds of models trained per day, more than 100,000 training compute hours per day, 800 TB of online feature data, 1 TB per second of feature reads, and over a billion predictions per second. 
These figures are interesting on their own, but they matter most as the conditions that make the architectural choices intelligible across the entire platform.</p><p style="text-align: justify;"><strong>References</strong></p><ul><li><p style="text-align: justify;"><a href="https://eng.snap.com/introducing-bento">Introducing Bento, Snap&#8217;s ML Platform</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[How Grab is Using AI Agents to Boost Team Productivity]]></title><description><![CDATA[Grab&#8217;s data engineering team had a problem that looks familiar to anyone who&#8217;s maintained shared infrastructure.]]></description><link>https://blog.bytebytego.com/p/how-grab-is-using-ai-agents-to-boost</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-grab-is-using-ai-agents-to-boost</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Mon, 18 May 2026 15:31:16 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!w053!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/Datadog_051826">The developer toolkit for shipping AI features with confidence (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/Datadog_051826" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4m86!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 424w, 
https://substackcdn.com/image/fetch/$s_!4m86!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 848w, https://substackcdn.com/image/fetch/$s_!4m86!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 1272w, https://substackcdn.com/image/fetch/$s_!4m86!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4m86!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png" width="1456" height="762" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:762,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:366428,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/Datadog_051826&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197567426?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" 
srcset="https://substackcdn.com/image/fetch/$s_!4m86!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 424w, https://substackcdn.com/image/fetch/$s_!4m86!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 848w, https://substackcdn.com/image/fetch/$s_!4m86!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 1272w, https://substackcdn.com/image/fetch/$s_!4m86!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fccd78ac2-a943-4b21-bc99-8bcffb5ab042_2400x1256.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>With release cycles speeding up in the era of AI, developers need to move fast without losing visibility in production. Get 4 resources covering everything from catching flaky tests and pipeline bottlenecks to instrumenting LLM calls and controlling rollouts before regressions reach users.</p><p>You&#8217;ll learn how to:</p><ul><li><p>Track every CI pipeline run and cut test suite instability slowing your AI delivery cycles.</p></li><li><p>Catch LLM quality, latency, and cost issues before they surface in production.</p></li><li><p>Measure and improve release confidence as AI drives higher commit volume across your team.</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/Datadog_051826&quot;,&quot;text&quot;:&quot;Get the toolkit&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/Datadog_051826"><span>Get the toolkit</span></a></p><div><hr></div><p style="text-align: justify;">Grab&#8217;s data engineering team had a problem that looks familiar to anyone who&#8217;s maintained shared infrastructure. Their best engineers were spending two full days every week answering quick questions from colleagues.</p><p style="text-align: justify;">For reference, Grab is a super-app across Southeast Asia handling rides, food delivery, payments, and more. 
All of that activity generates enormous amounts of data, and the Analytics Data Warehouse (ADW) team is responsible for organizing and serving it to the rest of the company.</p><p style="text-align: justify;">This team manages over 15,000 tables that power roughly half of all queries in Grab&#8217;s data lake, and about 1,000 people across the company query those tables every month. Analysts, product managers, and other engineers all depend on the ADW team&#8217;s tables to do their jobs.</p><p style="text-align: justify;">That made the ADW team the librarians of Grab&#8217;s data, but also the help desk. The questions were quick to ask, such as &#8220;Why does this ID look like gibberish?&#8221; or &#8220;Can you add a column to this table?&#8221;</p><p style="text-align: justify;">However, each answer required a fragmented journey through data catalogs, manual lineage tracing, SQL validation, and log diving. So they built a multi-agent AI system to automate the investigation process. The system worked great in demos. Then they shipped it to production, and six things broke.</p><p style="text-align: justify;">But before we get to what broke and how the team handled things, let us understand what they built.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Grab Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2 style="text-align: justify;">The Pattern Behind the Problem</h2><p style="text-align: justify;">The ADW team tracked the anatomy of these questions and noticed something important. While every question was different, the process of answering them was quite consistent. An engineer would search through data catalogs, trace where the data came from, validate it with SQL queries, and check pipeline logs. The questions varied, but the investigation playbook stayed the same. 
That consistency signaled an opportunity for automation.</p><p style="text-align: justify;">Their design philosophy started with a clean separation, which they describe as decoupling the brain from the hands.</p><p style="text-align: justify;">The brain is the LLM doing the reasoning. The hands are specialized agents and tools that actually fetch information, run queries, and interact with systems. By separating these two concerns, they created a system that was both capable and easy to debug. When something went wrong, they could pinpoint whether the issue was in the reasoning or in a specific tool interaction.</p><p style="text-align: justify;">They also made a deliberate architectural bet.</p><p style="text-align: justify;">Rather than building one massive AI trained to handle every type of question, they built multiple specialized agents, each focused on a narrow domain.</p><p style="text-align: justify;">See the diagram below, which shows how an AI agent works:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w053!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w053!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 424w, https://substackcdn.com/image/fetch/$s_!w053!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 848w, 
https://substackcdn.com/image/fetch/$s_!w053!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 1272w, https://substackcdn.com/image/fetch/$s_!w053!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w053!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png" width="1456" height="946" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:946,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:297393,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814425?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w053!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 424w, 
https://substackcdn.com/image/fetch/$s_!w053!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 848w, https://substackcdn.com/image/fetch/$s_!w053!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 1272w, https://substackcdn.com/image/fetch/$s_!w053!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7a3f60c2-0a22-4f92-a2c3-20162ce0bf14_2114x1374.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">A single monolithic model would have been simpler to deploy, with just one model and one inference call, but it would also have been harder to debug, and any change would have risked affecting everything. Specialized agents, on the other hand, are modular. You can improve one without touching the others, add new ones without rewriting the system, and assign clear responsibilities that make failures traceable. The tradeoff is coordination complexity and some added latency from sequential execution.</p><p style="text-align: justify;">See the comparison below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0Hdw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0Hdw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 424w, https://substackcdn.com/image/fetch/$s_!0Hdw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 848w, https://substackcdn.com/image/fetch/$s_!0Hdw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 1272w, 
https://substackcdn.com/image/fetch/$s_!0Hdw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0Hdw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png" width="1456" height="702" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:702,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:189321,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814425?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0Hdw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 424w, https://substackcdn.com/image/fetch/$s_!0Hdw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 848w, 
https://substackcdn.com/image/fetch/$s_!0Hdw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 1272w, https://substackcdn.com/image/fetch/$s_!0Hdw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F453b9b03-616d-46f2-a0fa-229090a6f691_3164x1526.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">Grab accepted that tradeoff because maintainability and accuracy mattered more than saving a few 
seconds. When the alternative is a multi-hour manual investigation, a few minutes for a precise answer is a massive improvement.</p><p style="text-align: justify;">On the tech stack side, they used FastAPI to handle incoming requests and LangGraph to manage the complex stateful logic that multi-agent collaboration requires. Simple LLM calls follow a straight line from input to output, but Grab&#8217;s agents need to loop back, ask for more information, or hand off tasks to one another, and LangGraph supports that kind of cyclical workflow. Redis handles caching and real-time session needs, while PostgreSQL stores conversation history and agent metadata as persistent memory. The agents themselves pull information from three internal platforms:</p><ul><li><p style="text-align: justify;">Hubble serves as a centralized metadata and data catalog.</p></li><li><p style="text-align: justify;">Genchi is a data quality observability platform that enforces data contracts.</p></li><li><p style="text-align: justify;">Lighthouse tracks pipeline execution status and health.</p></li></ul><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d-Di!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d-Di!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 424w, 
https://substackcdn.com/image/fetch/$s_!d-Di!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 848w, https://substackcdn.com/image/fetch/$s_!d-Di!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 1272w, https://substackcdn.com/image/fetch/$s_!d-Di!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d-Di!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png" width="1456" height="1503" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1503,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:413179,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814425?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!d-Di!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 424w, https://substackcdn.com/image/fetch/$s_!d-Di!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 848w, https://substackcdn.com/image/fetch/$s_!d-Di!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 1272w, https://substackcdn.com/image/fetch/$s_!d-Di!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4b57642c-036d-484c-b36f-ef1aed886767_2232x2304.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><strong>Source: </strong><a href="https://engineering.grab.com/from-firefighting-to-building">Grab Engineering Blog</a></figcaption></figure></div><p style="text-align: justify;">With the architecture in place, the next design decision was how to split the work. This split turned out to be one of the most important choices in the entire system.</p><h2>Two Pathways, Five Agents, One Supervisor</h2><p style="text-align: justify;">When a question arrives through Slack, the system first determines which of two pathways to take. This fork is the architectural backbone of the whole system, and it is based on an important principle. Read-only operations and write operations have fundamentally different risk profiles, so they deserve fundamentally different architectures.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J_Fv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J_Fv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 424w, 
https://substackcdn.com/image/fetch/$s_!J_Fv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 848w, https://substackcdn.com/image/fetch/$s_!J_Fv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 1272w, https://substackcdn.com/image/fetch/$s_!J_Fv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J_Fv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png" width="1456" height="1950" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1950,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:300130,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814425?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!J_Fv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 424w, https://substackcdn.com/image/fetch/$s_!J_Fv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 848w, https://substackcdn.com/image/fetch/$s_!J_Fv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 1272w, https://substackcdn.com/image/fetch/$s_!J_Fv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9b1896cf-0b6b-46d0-9b7b-0654a1464e59_1752x2346.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption"><strong>Source: </strong><a href="https://engineering.grab.com/from-firefighting-to-building">Grab Engineering Blog</a></figcaption></figure></div><p style="text-align: justify;">The investigation pathway handles questions like &#8220;Why does this data look wrong?&#8221; or &#8220;Where does this metric come from?&#8221; These are read-only. The system is gathering information, and the worst case is a wrong answer that gets caught in review. Four agents collaborate here as follows:</p><ul><li><p style="text-align: justify;">The Classifier is the first responder. It parses the question, extracts key entities like table names and column references, detects guardrail violations such as PII requests or out-of-scope queries, and determines which specialist agents are needed and in what sequence. It also provides reasoning for its routing decisions, which helps with debugging later.</p></li><li><p style="text-align: justify;">The Data Agent handles the actual data investigation. 
It enriches prompts with table and column metadata, executes queries with built-in guardrails, validates schemas to avoid unnecessary scans, and retrieves sample data.</p></li><li><p style="text-align: justify;">The Code Search Agent traces column transformations through the codebase, follows table lineage across multiple transformation steps, and generates plain-language explanations of what the code is doing.</p></li><li><p style="text-align: justify;">The On-call Agent monitors production health by searching Slack channels for outage announcements, checking observability platforms for pipeline status, and validating data quality metrics like null counts and duplicate rates.</p></li><li><p style="text-align: justify;">Once the specialist agents finish their work, the Summarizer Agent combines their findings into a coherent answer. This is more than concatenation. It handles conflicting information between agents, ensures consistency, and produces a structured response ready for human review.</p></li></ul><p style="text-align: justify;">The enhancement pathway handles requests that change things, like adding a new column or modifying aggregation logic. These are write operations that touch production pipelines, so the architecture is fundamentally more cautious.</p><p style="text-align: justify;">A single Enhancement Agent handles these requests. It reads the JIRA ticket, discovers relevant code in the repository, runs validation checks, generates schema changes and code modifications, and creates a merge request with full documentation. Users can then trigger test pipeline runs through the bot. But at every stage, a human engineer reviews and approves. 
This pathway is semi-automated by design because code changes to production pipelines require human judgment, and the system was built to respect that boundary.</p><p style="text-align: justify;">To see how the investigation pathway works in practice, consider a real scenario from the blog:</p><ul><li><p style="text-align: justify;">Someone messages the team on Slack and asks why the ID in the vehicles table is unreadable.</p></li><li><p style="text-align: justify;">In the old world, an engineer would spend the next couple of hours searching catalogs, tracing lineage, running SQL, and checking logs.</p></li><li><p style="text-align: justify;">With the multi-agent system, the Classifier routes the question to all three investigation agents.</p></li><li><p style="text-align: justify;">The Data Agent queries the actual data and discovers that the IDs are valid UUIDs in standard hexadecimal format. It also searches Grab&#8217;s data catalog and finds a dimension table that maps these UUIDs to human-readable vehicle names.</p></li><li><p style="text-align: justify;">The Code Search Agent traces the lineage through the codebase and confirms that the UUID format comes directly from the source system, with no Spark transformation applied along the way.</p></li><li><p style="text-align: justify;">The On-call Agent checks Airflow pipeline status, Slack channels for incidents, and data quality metrics, and finds everything healthy.</p></li><li><p style="text-align: justify;">The Summarizer pulls it all together into a clear answer. 
The supposed bug was actually working as designed.</p></li></ul><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!y-i5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!y-i5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 424w, https://substackcdn.com/image/fetch/$s_!y-i5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 848w, https://substackcdn.com/image/fetch/$s_!y-i5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 1272w, https://substackcdn.com/image/fetch/$s_!y-i5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!y-i5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png" width="1456" height="1554" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1554,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:180441,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814425?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!y-i5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 424w, https://substackcdn.com/image/fetch/$s_!y-i5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 848w, https://substackcdn.com/image/fetch/$s_!y-i5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 1272w, https://substackcdn.com/image/fetch/$s_!y-i5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5bc2d863-e593-422b-a450-a6825d7cdfa7_2086x2226.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Each agent asked a different type of question. What does the data look like? How is it transformed? Is the system healthy? The full picture only emerged when their findings were combined.</p><p style="text-align: justify;">This architecture worked well in controlled demos. Then real users started using it, and the team discovered that building agents was only part of the challenge.</p><h2 style="text-align: justify;">Challenges In Production</h2><p style="text-align: justify;">Grab&#8217;s initial prototype performed well in controlled settings, but real-world usage exposed critical gaps. 
Complex questions, long conversations, and edge cases pushed the system in ways that demos never did.</p><p style="text-align: justify;">Here are four of the most instructive challenges they faced, along with the solutions they engineered.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ittf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ittf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 424w, https://substackcdn.com/image/fetch/$s_!ittf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 848w, https://substackcdn.com/image/fetch/$s_!ittf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 1272w, https://substackcdn.com/image/fetch/$s_!ittf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ittf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png" width="1456" height="1732" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1732,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:435047,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197814425?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ittf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 424w, https://substackcdn.com/image/fetch/$s_!ittf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 848w, https://substackcdn.com/image/fetch/$s_!ittf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 1272w, https://substackcdn.com/image/fetch/$s_!ittf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31beb73a-3d93-40dc-81fe-e9fda6ab5bce_2608x3102.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Let&#8217;s look at each in more detail:</p><h3 style="text-align: justify;">Context Overflow Across Agent Handoffs</h3><p style="text-align: justify;">In a multi-agent system, context accumulates fast. Every piece of information passed from one agent to the next adds tokens, and LLM performance degrades when context windows get overloaded.</p><p style="text-align: justify;">Grab built a multi-layered solution.</p><p style="text-align: justify;">They track every message&#8217;s token count in real time using tiktoken, an open-source tokenizer library. When token limits approach, earlier messages are automatically summarized while recent messages and critical context remain untouched to preserve accuracy.</p><p style="text-align: justify;">They also prune tool outputs before handoffs. 
Instead of passing full code files to the Code Search Agent, smaller LLMs extract only the relevant snippets and a short description. The orchestrator sits between agents, cleaning and compressing context at every handoff.</p><h3 style="text-align: justify;">Tool Bloat</h3><p style="text-align: justify;">The initial design gave agents access to over 30 tools, each with verbose descriptions structured like generic API documentation.</p><p style="text-align: justify;">Since tool definitions are part of the agent&#8217;s prompt, every inference call had to process all of that text. This degraded both speed and quality.</p><p style="text-align: justify;">The fix was aggressive simplification: the team kept only the portions of tool descriptions needed for decision-making, truncated verbose outputs, and streamlined everything to be concise and actionable. This sounds simple, but it produced a substantial improvement in system responsiveness. The lesson is that tool design is an important engineering concern, and fewer well-designed tools outperform a large collection of generic ones.</p><h3 style="text-align: justify;">Risky Code Execution</h3><p style="text-align: justify;">AI agents with database access and code generation capabilities pose real risks. Without safeguards, they could access sensitive PII, execute dangerous SQL operations, run expensive queries that scan entire tables, or generate breaking code changes.</p><p style="text-align: justify;">Grab built four layers of defense that work together so that any single layer&#8217;s blind spots are covered by the others.</p><ul><li><p style="text-align: justify;">The first layer is input classification. The Classifier detects PII requests and out-of-scope queries before any agent executes.</p></li><li><p style="text-align: justify;">The second layer is SQL validation. Every query is checked for PII column access, dangerous operations like DELETE or DROP, missing partition filters, and schema validity. 
Without partition filters, a query might scan an entire massive table instead of just the relevant slice, which is both expensive and slow.</p></li><li><p style="text-align: justify;">The third layer is timeout protection, where strict execution limits on all database queries prevent runaway operations.</p></li><li><p style="text-align: justify;">The fourth layer is enhancement controls. The Enhancement Agent cannot commit directly to main branches. All changes require human review, and everything runs in staging before production.</p></li></ul><h3 style="text-align: justify;">Earning User Trust</h3><p style="text-align: justify;">Even with safety layers, AI agents can hallucinate, misinterpret questions, or stumble on edge cases. If users lose confidence in the answers, the system fails regardless of its technical capabilities.</p><p style="text-align: justify;">Grab built a human review system where engineers can take five actions on any AI-generated response. They can approve it as-is with a verified footnote, reject it and log it for improvement, refine it by adding a prompt to regenerate the answer, re-route it to a specific agent with additional context, or annotate it with structured feedback for continuous improvement.</p><p style="text-align: justify;">They also made a key design evolution here.</p><p style="text-align: justify;">Initially, the system withheld all AI-generated responses until an engineer approved them. This was safe but slow, and it created a new bottleneck where questions sat unanswered during peak workload times.</p><p style="text-align: justify;">They redesigned the flow to post responses immediately with a clear &#8220;unreviewed&#8221; label, allowing engineers to review and modify them as needed. Users get fast answers, the transparency of the label sets appropriate expectations, and the review process still catches errors.</p><p style="text-align: justify;">Solving these challenges made the system reliable. 
But the team wanted something more: a system that gets smarter over time.</p><h2 style="text-align: justify;">Closing the Loop</h2><p style="text-align: justify;">The annotations from human review were initially passive records. The team had a wealth of information about what worked and what failed, but they were missing a systematic way to learn from it.</p><p style="text-align: justify;">They transformed annotations into an active improvement engine through multiple mechanisms, which are as follows:</p><ul><li><p style="text-align: justify;">Randomly sampled annotations are pulled to create test cases for offline evaluation, ensuring the system is tested against real-world failures rather than synthetic ones.</p></li><li><p style="text-align: justify;">Pattern analysis identifies systemic issues by asking questions such as:</p><ul><li><p style="text-align: justify;">Is the Classifier consistently routing to the wrong agents?</p></li><li><p style="text-align: justify;">Does a specific agent struggle with certain query types?</p></li><li><p style="text-align: justify;">Are particular table schemas confusing?</p></li></ul></li><li><p style="text-align: justify;">Quality metrics tracked over time detect regression. If the rejection rate suddenly spikes, something has changed that needs investigation.</p></li><li><p style="text-align: justify;">Targeted improvements use these insights to refine agent prompts, enhance guardrails, and add examples for query types that the system struggles with.</p></li></ul><p style="text-align: justify;">The impact was significant. The bots now autonomously handle the majority of standard user inquiries and a significant portion of enhancement requests. Resolution time dropped by an order of magnitude. 
The team reclaimed several full-time equivalents&#8217; worth of engineering bandwidth, shifting hundreds of hours from reactive support to proactive roadmap delivery.</p><h2 style="text-align: justify;">Conclusion</h2><p style="text-align: justify;">Grab&#8217;s journey from overwhelmed data engineers to an AI-augmented team distills into a few key principles:</p><ul><li><p style="text-align: justify;">If the problems vary but the process of solving them stays consistent, it is a good candidate for automation.</p></li><li><p style="text-align: justify;">When building that automation, expect the majority of the effort to go into production hardening rather than the agents themselves.</p></li><li><p style="text-align: justify;">Apply different levels of autonomy based on the risk profile of the operation.</p></li><li><p style="text-align: justify;">Read-only investigations can run with light oversight, but anything that changes production data deserves human gates.</p></li><li><p style="text-align: justify;">Engineer the feedback loop deliberately, because without it, the system is frozen at the quality level of its first deployment. Every rejected response, every annotation, and every pattern in the failure data is an opportunity to make the system smarter.</p></li></ul><p style="text-align: justify;">Grab&#8217;s own principles capture this well. The goal was never to replace engineers. 
It was to give them their time back.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p style="text-align: justify;"><a href="https://engineering.grab.com/from-firefighting-to-building">From firefighting to building: How AI Agents restored our team&#8217;s core productivity</a></p></li><li><p style="text-align: justify;"><a href="https://en.wikipedia.org/wiki/AI_agent">What is an AI Agent</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[EP215: The Anatomy of an AI Agent]]></title><description><![CDATA[An AI agent can be thought of as a simple While-loop.]]></description><link>https://blog.bytebytego.com/p/ep215-the-anatomy-of-an-ai-agent</link><guid isPermaLink="false">https://blog.bytebytego.com/p/ep215-the-anatomy-of-an-ai-agent</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Sat, 16 May 2026 15:31:01 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!lOfS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/GitLab_051626">Software Development is changing. And so is GitLab. Learn how. 
(Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/GitLab_051626" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8-_G!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 424w, https://substackcdn.com/image/fetch/$s_!8-_G!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 848w, https://substackcdn.com/image/fetch/$s_!8-_G!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 1272w, https://substackcdn.com/image/fetch/$s_!8-_G!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8-_G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png" width="1456" height="764" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:&quot;https://go.bytebytego.com/GitLab_051626&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!8-_G!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 424w, https://substackcdn.com/image/fetch/$s_!8-_G!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 848w, https://substackcdn.com/image/fetch/$s_!8-_G!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 1272w, https://substackcdn.com/image/fetch/$s_!8-_G!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa493b7e3-2cb4-49fb-8c79-8ea2a8b41aec_2048x1075.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>On June 10, GitLab Transcend streams live from London with  an agenda built for practitioners like you. You can expect an agenda that&#8217;s full of keyboard moments with live demos of Duo Agent Platform, agentic AI use cases from your peers, and The Developer Show hosted live by Senior Developer Advocate, Colleen Lake.</p><p>GitLab Transcend streams live from London on June 10 with regional replays for APAC and AMER on June 11.</p><p>Register today. It&#8217;s free to register and totally virtual. 
Come see intelligent orchestration, now with context.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/GitLab_051626&quot;,&quot;text&quot;:&quot;Stream the event live on 6/10&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/GitLab_051626"><span>Stream the event live on 6/10</span></a></p><div><hr></div><p>This week&#8217;s system design refresher:</p><ul><li><p>Prompt Injection, Clearly Explained (YouTube video)</p></li><li><p>The Anatomy of an AI Agent</p></li><li><p>REST vs GraphQL vs gRPC</p></li><li><p>If Claude Code is a burger...</p></li><li><p>git fetch vs git pull vs git pull --rebase</p></li></ul><div><hr></div><h2>Prompt Injection, Clearly Explained</h2><div id="youtube2-KDcayRssGbw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;KDcayRssGbw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/KDcayRssGbw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><h2>The Anatomy of an AI Agent</h2><p>An AI agent can be thought of as a simple while loop. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lOfS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lOfS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!lOfS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!lOfS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!lOfS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lOfS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png" width="1456" height="1760" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!lOfS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!lOfS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!lOfS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!lOfS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F20aada1f-cc38-4c94-8778-eeaa7b63aceb_2484x3002.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It uses an LLM to select an action, executes that action, evaluates the result, and repeats the process until the task is complete. Let&#8217;s take a closer look at each of these components:</p><ul><li><p>Brain: The LLM is the core. It reads the situation, thinks, and decides what to do next. The big shift from chatbot to agent: the model isn't just writing text anymore; it's making choices.</p></li><li><p>Planning: Hard tasks need more than one step. Agents break them down using methods like Chain of Thought (think step by step), Tree of Thoughts (try options, pick the best), or Reflexion (learn from mistakes and retry). Planning turns a fuzzy goal into clear actions.</p></li><li><p>Tools: An LLM without tools is a brain in a jar. Tools are functions the model can call, like web search, code execution, APIs, files, or browsers (often using the MCP standard). 
The model requests a tool, the system runs it, and the result comes back.</p></li><li><p>Memory: Without memory, every turn starts from zero. Short-term memory is the context window. Long-term memory lives in vector stores, files, and knowledge bases. When the window fills up, agents summarize old turns and carry the summary forward.</p></li><li><p>Loop: All four pieces work together in a cycle. The agent looks at the current state, decides what to do, uses a tool, sees the result, and repeats. It keeps going until it gives a final answer.</p></li><li><p>Guardrails: Not strictly anatomy, but important. Sandboxing, human checks, token limits, output validation, and scope limits keep autonomy from turning into expensive chaos. The more autonomy you give, the more these matter.</p></li></ul><p>Over to you: when you build an agent, which of these five takes the most work to get right?</p><div><hr></div><h2>REST vs GraphQL vs gRPC</h2><p>REST, GraphQL, and gRPC are three distinct approaches to designing APIs. 
Each offers a different trade-off between simplicity, performance, and flexibility.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eDv8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eDv8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!eDv8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!eDv8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!eDv8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eDv8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png" width="1456" height="1760" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!eDv8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!eDv8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!eDv8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!eDv8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffafb183d-5c2f-4a6e-994e-ecba33663b11_2484x3002.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p>REST: Each URL represents a resource, and you use standard HTTP verbs (GET, POST, PUT, DELETE) to act on it. 
Simple and universal, but it often requires multiple requests to assemble related data.<br><br>Trade-offs: Easy to learn, cache-friendly, and works with any HTTP client, but tends to over-fetch or under-fetch data, leading to chatty clients and version drift as endpoints proliferate.</p></li><li><p>GraphQL: The client sends a query describing exactly the data shape it needs, and the server returns precisely that data through a single endpoint.<br><br>Trade-offs: Eliminates over-fetching and lets frontends evolve independently, but shifts complexity to the server (resolvers, N+1 queries), complicates caching, and makes rate-limiting and query-cost analysis harder.</p></li><li><p>gRPC: Services communicate via strongly-typed method calls over HTTP/2 using compact binary (protobuf) encoding, making it ideal for fast, low-latency service-to-service communication with built-in streaming support.<br><br>Trade-offs: Excellent performance and strict contracts via protobuf schemas, but the binary format isn't human-readable, browser support requires a proxy (gRPC-Web), and debugging is harder than with plain JSON over HTTP.</p></li></ol><p>Rule of thumb: REST for public APIs and broad compatibility, GraphQL when clients need flexible, aggregated views, and gRPC for internal microservices where latency and throughput matter most.</p><div><hr></div><h2>If Claude Code is a burger...</h2><p>Before each model call, Claude Code assembles a context window from 9 distinct sources. 
</p><p>Think of it as a burger, each layer adds something different.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!N0Ju!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!N0Ju!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 424w, https://substackcdn.com/image/fetch/$s_!N0Ju!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 848w, https://substackcdn.com/image/fetch/$s_!N0Ju!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!N0Ju!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!N0Ju!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg" width="1456" height="1760" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!N0Ju!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 424w, https://substackcdn.com/image/fetch/$s_!N0Ju!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 848w, https://substackcdn.com/image/fetch/$s_!N0Ju!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!N0Ju!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae110097-f828-4b76-ba3c-de2a774a4ea7_2484x3002.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p>System Prompt: Defines Claude's role, behavior, and tone. This sets the foundation.</p></li><li><p>Environment Info: Git status, branch info, and current date. Pulled in via getSystemContext().</p></li><li><p>CLAUDE.md: A four-level instruction hierarchy: managed &#8594; user &#8594; project &#8594; local. Plain-text Markdown, so users can read, edit, and version-control everything the model sees.</p></li><li><p>Auto Memory: Contextually relevant memory entries prefetched asynchronously. An LLM scans memory-file headers and surfaces up to 5 relevant files on demand.</p></li><li><p>Path-scoped Rules: Conditional rules that load lazily when the agent reads files.</p></li><li><p>Tool Metadata: Skill descriptions, MCP tool names, and deferred tool definitions.</p></li><li><p>Conversation History: Carried forward across iterations. 
</p></li><li><p>Tool Results: File reads, command outputs, and subagent summaries.</p></li><li><p>Compact Summaries: When history grows too long, older segments are replaced by model-generated summaries.</p></li></ol><p>The whole design treats context as a scarce resource.</p><p>Over to you: Which of these 9 layers do you tune the most when working with Claude Code?</p><div><hr></div><h2>git fetch vs git pull vs git pull --rebase</h2><p>Most Git mistakes do not come from a bad commit. They come from syncing with the remote: your branch is behind, you have local commits, and now you need to bring in upstream changes. That is when the difference between git fetch, git pull, and git pull --rebase matters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!71ii!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!71ii!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!71ii!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!71ii!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 1272w, 
https://substackcdn.com/image/fetch/$s_!71ii!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!71ii!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png" width="1456" height="1760" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!71ii!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!71ii!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!71ii!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 1272w, 
https://substackcdn.com/image/fetch/$s_!71ii!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68416644-193b-4b64-8e50-2e36a950b890_2484x3002.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>git fetch downloads remote changes and updates origin/main. Your local main does not move. Nothing in your working directory changes. That makes fetch the safest option when you want to inspect what changed upstream before integrating anything.</p><p>git pull goes one step further. It fetches first and then merges the upstream branch into your current branch. 
Your local commits stay intact, and when the two histories have diverged, Git adds a merge commit to connect them.</p><p>git pull --rebase is the clean one. It starts with a fetch, but instead of merging, it reapplies your local commits on top of the updated upstream branch. The result is a linear history with no merge commit, although the reapplied commits get new hashes.</p><p>Fetch when you just want to see what's on the remote before deciding anything. Pull when you're on your own branch and don't mind merge commits showing up in the log. Rebase when you're cleaning up a feature branch before opening a PR and want the history to read cleanly.</p><p>Over to you: How do you handle a feature branch that's a few days old while main has moved 10 commits ahead?</p>]]></content:encoded></item><item><title><![CDATA[LAST CALL FOR ENROLLMENT: Become an AI Engineer - Cohort 6]]></title><description><![CDATA[Our 6th cohort of Becoming an AI Engineer starts tomorrow, Saturday, May 16. This is a live, cohort-based course created in collaboration with best-selling author Ali Aminian and published by ByteByteGo.]]></description><link>https://blog.bytebytego.com/p/last-call-for-enrollment-become-an-a88</link><guid isPermaLink="false">https://blog.bytebytego.com/p/last-call-for-enrollment-become-an-a88</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Fri, 15 May 2026 15:02:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kaYA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Our 6th cohort of Becoming an AI Engineer <strong>starts tomorrow, Saturday, May 16</strong>. 
This is a live, cohort-based course created in collaboration with best-selling author Ali Aminian and published by ByteByteGo.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;text&quot;:&quot;Check it out Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/substack-bbai"><span>Check it out Here</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/substack-bbai" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kaYA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png" width="1456" height="1801" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1801,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:741069,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196812138?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!kaYA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>Here&#8217;s what makes this cohort special:</p><ul><li><p>Learn by doing: Build real-world AI applications instead of just watching videos.</p></li><li><p>Structured, systematic learning path: Follow a carefully designed curriculum that takes you step by step, from fundamentals to advanced topics.</p></li><li><p>Live feedback and mentorship: Get direct feedback from instructors and peers.</p></li><li><p>Community-driven: Learning alone is hard. Learning with a community is easy!</p></li></ul><p>We are focused on skill building, not just theory or passive learning. 
Our goal is for every participant to walk away with a strong foundation for building AI systems.</p><p>If you want to start learning AI from scratch, this is the perfect platform for you to begin.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;text&quot;:&quot;Check it out Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/substack-bbai"><span>Check it out Here</span></a></p>]]></content:encoded></item><item><title><![CDATA[A Guide To Event-Driven Architectural Patterns]]></title><description><![CDATA[Distributed systems are built out of services that need to communicate, and the simplest way to do that is for one service to call another directly and wait for a response.]]></description><link>https://blog.bytebytego.com/p/a-guide-to-event-driven-architectural</link><guid isPermaLink="false">https://blog.bytebytego.com/p/a-guide-to-event-driven-architectural</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Thu, 14 May 2026 15:32:28 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!-BJY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p style="text-align: justify;">Distributed systems are built out of services that need to communicate, and the simplest way to do that is for one service to call another directly and wait for a response. 
This pattern works well for small systems and predictable workloads.</p><p style="text-align: justify;">However, as systems grow, it tends to produce tight coupling between services, fragile failure behavior, and bottlenecks at the slowest component in any chain of calls.</p><p style="text-align: justify;">Event-driven architecture is an alternative communication model where services publish events when something meaningful happens, and other services react to those events on their own time. The patterns related to this architecture are the established techniques for handling the new problems that the model introduces.</p><p style="text-align: justify;">In this article, we will start with the basics of how event-driven systems are structured, look at why synchronous communication starts to break down at scale, and then walk through six patterns that solve specific problems EDA introduces.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-BJY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-BJY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 424w, https://substackcdn.com/image/fetch/$s_!-BJY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 848w, 
https://substackcdn.com/image/fetch/$s_!-BJY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 1272w, https://substackcdn.com/image/fetch/$s_!-BJY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-BJY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png" width="1456" height="1698" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1698,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:505502,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/197472195?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-BJY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 424w, 
https://substackcdn.com/image/fetch/$s_!-BJY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 848w, https://substackcdn.com/image/fetch/$s_!-BJY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 1272w, https://substackcdn.com/image/fetch/$s_!-BJY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F57ee764c-c720-46a6-9c66-dbb53a5e646e_2250x2624.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>The Foundations of Event-Driven Architecture</h2>
      <p>
          <a href="https://blog.bytebytego.com/p/a-guide-to-event-driven-architectural">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[High Performance Rate Limiting at Databricks]]></title><description><![CDATA[In this article, we look at how Databricks implemented rate limiting at scale, how they shrank the critical path, and the accuracy tradeoff that shrinking usually requires.]]></description><link>https://blog.bytebytego.com/p/high-performance-rate-limiting-at</link><guid isPermaLink="false">https://blog.bytebytego.com/p/high-performance-rate-limiting-at</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Wed, 13 May 2026 15:30:35 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kYmO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/ScyllaDB_051326">ScyllaDB Founders Share What Real-Time AI Requires from the Database (Sponsored)</a></h2><h4><em>AI is pushing databases to their limits; learn what it takes to stay ahead</em></h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/ScyllaDB_051326" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TZ-R!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 424w, https://substackcdn.com/image/fetch/$s_!TZ-R!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!TZ-R!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TZ-R!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TZ-R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg" width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:376344,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/ScyllaDB_051326&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196936594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TZ-R!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!TZ-R!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 848w, https://substackcdn.com/image/fetch/$s_!TZ-R!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!TZ-R!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd30179ab-1c41-435d-a451-0856e9e1610f_1600x840.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<p>AI workloads are exposing the limits of what most databases were designed to handle. Databases will need to process petabytes of data, millions of writes per second, and data types like vectors &#8211; all while delivering consistent sub-millisecond P99 latency.</p><p>Join ScyllaDB co-founders Dor Laor (CEO) and Avi Kivity (CTO) to explore what real-time AI workloads actually require, and what it takes to stay ahead.</p><p>You will learn:</p><ul><li><p>How AI workloads are shifting in real-world applications</p></li><li><p>The specific pressures these new patterns place on databases</p></li><li><p>What architectural features help teams meet AI&#8217;s demands</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/ScyllaDB_051326&quot;,&quot;text&quot;:&quot;Register for Free&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/ScyllaDB_051326"><span>Register for Free</span></a></p><div><hr></div><p style="text-align: justify;">In early 2023, the Databricks rate limiter ran on a simple architecture. An Envoy ingress gateway made calls to a Ratelimit Service, which in turn queried a single Redis instance. The setup handled the traffic it was designed for, and the per-second nature of rate limiting meant the counts could stay transient without any durability guarantee.</p><p style="text-align: justify;">Then real-time model serving launched. 
A single customer could now generate orders of magnitude more traffic than the service was built for, and three cracks appeared.</p><ul><li><p style="text-align: justify;">Tail latency climbed sharply under load, worsened by the two network hops per check and a P99 network latency of 10 to 20 milliseconds between services in one cloud provider.</p></li><li><p style="text-align: justify;">Adding machines and bolting on caches stopped helping past a certain point.</p></li><li><p style="text-align: justify;">The single Redis instance was also a single point of failure that the team could no longer tolerate.</p></li></ul><p style="text-align: justify;">The team redesigned the service, and the most interesting part of the rebuild is what they chose to give up.</p><p style="text-align: justify;">Strict accuracy is expensive at scale, and Databricks traded it for a faster critical path, a horizontally scalable counter, and a rate limiter that answers as if the decision has already been made by the time the client checks.</p><p style="text-align: justify;">In this article, we look at how Databricks implemented rate limiting at scale, how they shrank the critical path, and the accuracy tradeoff that shrinking usually requires.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Databricks Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2 style="text-align: justify;">A Counting Problem</h2><p style="text-align: justify;">Strip away the framing, and rate limiting reduces to a counting problem. When a request arrives, the system locates the right counter, compares it against a threshold, and either allows or rejects the request. The design question is where that counter lives and how quickly it can be read and updated.</p><p style="text-align: justify;">In the old Databricks architecture, the counter was stored in Redis. 
See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LUAU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LUAU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 424w, https://substackcdn.com/image/fetch/$s_!LUAU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 848w, https://substackcdn.com/image/fetch/$s_!LUAU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 1272w, https://substackcdn.com/image/fetch/$s_!LUAU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LUAU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png" width="1456" height="1039" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1039,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:88353,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196936594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LUAU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 424w, https://substackcdn.com/image/fetch/$s_!LUAU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 848w, https://substackcdn.com/image/fetch/$s_!LUAU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 1272w, https://substackcdn.com/image/fetch/$s_!LUAU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F17acdec1-451f-4eec-92f3-feceee3f686c_1922x1372.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">A request flowed through Envoy, hit the Ratelimit Service, and triggered a call to Redis. That meant two network hops on the critical path of every request. In a cloud provider where P99 network latency sat between 10 and 20 milliseconds, those hops dominated the rate limit decision time. 
A check that should have cost microseconds was costing tens of milliseconds.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kYmO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kYmO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 424w, https://substackcdn.com/image/fetch/$s_!kYmO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 848w, https://substackcdn.com/image/fetch/$s_!kYmO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 1272w, https://substackcdn.com/image/fetch/$s_!kYmO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kYmO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png" width="1456" height="623" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:623,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:117878,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196936594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kYmO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 424w, https://substackcdn.com/image/fetch/$s_!kYmO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 848w, https://substackcdn.com/image/fetch/$s_!kYmO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 1272w, https://substackcdn.com/image/fetch/$s_!kYmO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc5152ea9-bc74-4453-998b-b66c39d84b64_2928x1252.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">The team had already tried to work around this. Envoy can be configured with consistent hashing so that requests with the same key land on the same Ratelimit Service instance, which lets that instance keep a local count. The approach helped, but it hit three walls.</p><ul><li><p style="text-align: justify;">Non-Envoy services could not participate in the scheme, which fragmented the rate limit view.</p></li><li><p style="text-align: justify;">When the service cluster scaled up or restarted, the hash assignments churned, which forced regular syncs back to Redis.</p></li><li><p style="text-align: justify;">Consistent hashing is also prone to hotspotting, where one very popular key saturates a single machine while its neighbours sit idle. 
The only way to push through those hotspots was to over-provision the entire cluster.</p></li></ul><p style="text-align: justify;">This is where scaling stops being additive. Adding machines stopped moving the latency numbers, and adding more caching introduced more inconsistency. The architecture itself was the ceiling, and the team had to change it.</p><h2 style="text-align: justify;">Moving the Count In-Memory</h2><p style="text-align: justify;">Rate limit counts are transient. A per-second count matters only while that second is current, and the moment it rolls over, the old value becomes irrelevant.</p><p style="text-align: justify;">That property opens a door. If a count only needs to live for a second, durable storage is more than the problem requires. The count can live in memory on the server that owns it, and losing that server's counts in a restart costs almost nothing.</p><p style="text-align: justify;">The challenge is that a single server cannot hold counts for every rate limit key across the fleet. The service needs a way to partition keys across servers, and a way for any client to quickly find the server that owns a given key.</p><p style="text-align: justify;">That is the problem Dicer solves at Databricks. For the purposes of this discussion, Dicer can be treated as a black box: a routing layer that lets a service keep state in memory while remaining horizontally scalable and fault-tolerant. 
Clients ask Dicer which server owns a given key, and that server confirms it is the authoritative owner before handling the request.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZqWf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZqWf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 424w, https://substackcdn.com/image/fetch/$s_!ZqWf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 848w, https://substackcdn.com/image/fetch/$s_!ZqWf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 1272w, https://substackcdn.com/image/fetch/$s_!ZqWf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZqWf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png" width="1456" height="1111" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1111,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:130442,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196936594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZqWf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 424w, https://substackcdn.com/image/fetch/$s_!ZqWf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 848w, https://substackcdn.com/image/fetch/$s_!ZqWf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 1272w, https://substackcdn.com/image/fetch/$s_!ZqWf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3511e8e7-397c-49b7-936e-4e6809d0bbe4_2102x1604.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">With Dicer in place, the Ratelimit Service could move every counter in-memory. The network hop to Redis disappeared. Server-side tail latency came down sharply, and the team could scale horizontally by adding replicas to Dicer&#8217;s assignment pool. The single point of failure went away as well, because each replica became the authoritative store for its own slice of keys. Restarts and scale events redistributed ownership without any external coordination, and the churn that had forced Redis syncs in the old consistent hashing setup stopped having an impact.</p><p style="text-align: justify;">This solved one problem and exposed another.</p><p style="text-align: justify;">The server side was now fast, but the client side still made a synchronous call across the network for every single request. 
A rate limit check that had been waiting on Redis was now waiting on the Ratelimit Service. The P99 came down, but the shape of the problem remained. The critical path still had a round trip on it.</p><h2 style="text-align: justify;">Optimistic Rate Limiting</h2><p style="text-align: justify;">This is where the team made its most consequential move.</p><p style="text-align: justify;">Millions of client requests per second were still translating into millions of synchronous calls to the Ratelimit Service. Even with the server answering from memory, that represented significant network traffic, significant server capacity, and significant client-side waiting. The team asked a harder question.</p><p style="text-align: justify;">Does every request truly need to wait for a rate limit decision before proceeding?</p><p style="text-align: justify;">They considered three alternatives:</p><ul><li><p style="text-align: justify;">The first was prefetching tokens on the client, where the client pulls a block of capacity and answers rate limit checks locally.</p></li><li><p style="text-align: justify;">The second was batching requests on the client and waiting for a response before releasing them.</p></li><li><p style="text-align: justify;">The third was sampling, where only a fraction of requests is checked.</p></li></ul><p style="text-align: justify;">Each had problems. Prefetching carries messy edge cases during startup, expiry, and token exhaustion. Batching adds delay and memory pressure. 
Sampling works for high-QPS limits but falls apart when the limit itself is small.</p><p style="text-align: justify;">What the team built is called batch-reporting, and it rests on two ideas:</p><ul><li><p style="text-align: justify;">The first is that clients make no remote calls on the rate limit path.</p></li><li><p style="text-align: justify;">The second is optimistic rate limiting, where the default is to allow the request and reject only when the client already has a reason to reject from an earlier report.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JQZ9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JQZ9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 424w, https://substackcdn.com/image/fetch/$s_!JQZ9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 848w, https://substackcdn.com/image/fetch/$s_!JQZ9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 1272w, https://substackcdn.com/image/fetch/$s_!JQZ9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!JQZ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png" width="1456" height="1111" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1111,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:144889,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196936594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JQZ9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 424w, https://substackcdn.com/image/fetch/$s_!JQZ9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 848w, https://substackcdn.com/image/fetch/$s_!JQZ9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 1272w, https://substackcdn.com/image/fetch/$s_!JQZ9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4c99a901-9c2c-4ab3-9cf1-99985c074c90_2102x1604.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p style="text-align: justify;">The client counts how many requests it let through and how many it rejected, grouped by rate limit key. Every 100 milliseconds or so, a background thread packages those counts and reports them to the Ratelimit Service. 
The server responds with instructions telling the client which keys should be rejected, until which timestamp, and at what rejection rate.</p><p style="text-align: justify;">The diagram below shows the three architectures side by side:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q_uQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q_uQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 424w, https://substackcdn.com/image/fetch/$s_!q_uQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 848w, https://substackcdn.com/image/fetch/$s_!q_uQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 1272w, https://substackcdn.com/image/fetch/$s_!q_uQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q_uQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png" width="1456" height="1225" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1225,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:146815,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196936594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q_uQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 424w, https://substackcdn.com/image/fetch/$s_!q_uQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 848w, https://substackcdn.com/image/fetch/$s_!q_uQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 1272w, https://substackcdn.com/image/fetch/$s_!q_uQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6fa6d2c-6b90-4019-8b28-e476dbc14e4c_2182x1836.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">The impact was substantial:</p><ul><li><p style="text-align: justify;">Tail latency on rate limit calls fell by roughly a factor of ten, because the calls were effectively free for the client.</p></li><li><p style="text-align: justify;">Spiky inbound traffic turned into constant outbound reports because the reporting cadence is fixed regardless of how bursty the underlying traffic becomes.</p></li><li><p style="text-align: justify;">Server-side load became predictable for the first time.</p></li></ul><p style="text-align: justify;">The inversion merits emphasis. The rate limiter used to be asked before each decision. Now it is told after. That inversion sits at the core of the redesign, and the tradeoff is explicit. 
Databricks accepts that some requests over the limit will slip through between reports, and its backends are built to tolerate that overshoot.</p><h2 style="text-align: justify;">Bounding the Overshoot</h2><p style="text-align: justify;">Batch-reporting introduced a problem of its own.</p><p style="text-align: justify;">Between the moment a client starts exceeding a limit and the moment the server tells it to reject, traffic can leak through. A hundred milliseconds of overshoot at high QPS amounts to a lot of requests. The team wanted guarantees that kept overshoot within roughly 5 percent of the policy, and reaching that target required three layered fixes.</p><p style="text-align: justify;">The first was a rejection rate returned by the server. The idea is to use the past to predict the near future. If the last second&#8217;s traffic exceeded the policy, the server computes rejectionRate = (estimatedQps - rateLimitPolicy) / estimatedQps, which tells the client what fraction of upcoming requests to drop. As an illustrative example, an estimated 1,250 QPS against a policy of 1,000 QPS gives a rejection rate of 0.2, so the client drops one request in five. This assumes that the next second&#8217;s traffic resembles the last second&#8217;s, which holds often enough to be useful.</p><p style="text-align: justify;">The second was a client-side local rate limiter as defense in depth. When traffic spikes so obviously that the batch cycle has no chance of catching up, the client starts rejecting locally based on its own counts. 
This catches the extreme cases immediately rather than waiting for a round trip.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!x0wa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!x0wa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 424w, https://substackcdn.com/image/fetch/$s_!x0wa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 848w, https://substackcdn.com/image/fetch/$s_!x0wa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 1272w, https://substackcdn.com/image/fetch/$s_!x0wa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!x0wa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png" width="1456" height="1105" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1105,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:154945,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196936594?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!x0wa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 424w, https://substackcdn.com/image/fetch/$s_!x0wa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 848w, https://substackcdn.com/image/fetch/$s_!x0wa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 1272w, https://substackcdn.com/image/fetch/$s_!x0wa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9be67dbb-4fba-43a5-ad7e-b3eb24a7d94c_2564x1946.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>
<p style="text-align: justify;">The third was the algorithm itself. Once autosharding lets the service hold counts in memory, the token bucket becomes feasible. The token bucket has a useful property that a fixed window lacks. Fixed window resets to zero at the end of every interval, so a customer can blast traffic right at a window boundary and technically stay within the policy while sending double the intended rate during the crossover. The token bucket continuously fills and drains, and it can go negative. When a customer sends too many requests, the bucket remembers and stays empty until the refill catches up. The reset problem disappears. 
A token bucket also approximates a sliding window when configured without extra burst capacity, which produces a stricter shape than a fixed window for most limits.</p><p style="text-align: justify;">This is where the algorithm choice was gated by the storage choice. A token bucket needs compare-and-set style logic on every increment, which was slow in Redis. In memory, the same operation is close to free. Once the token bucket was viable, the earlier rejection rate mechanism became unnecessary, and the team eventually converted every rate limit in the system to a token bucket.</p><h2 style="text-align: justify;">Three Coupled Decisions</h2><p style="text-align: justify;">The Databricks story resolves into three decisions that depend on each other:</p><ul><li><p style="text-align: justify;">The first is the algorithm, which determines how the counter behaves at the boundaries of time intervals. Fixed window, sliding window, and token bucket each produce different behaviour in that regard.</p></li><li><p style="text-align: justify;">The second is where the state lives, whether in a shared external store like Redis, in an in-memory counter on a single server, or in an in-memory counter sharded across a cluster of servers through some routing layer.</p></li><li><p style="text-align: justify;">The third is the sync model, which can either require every request to wait for a synchronous decision or allow clients to make local decisions and reconcile through asynchronous reports.</p></li></ul><p style="text-align: justify;">The old Databricks architecture sat at one corner of that space, combining a fixed window, shared Redis, and synchronous per-request checks. The new architecture sits at a different corner, combining a token bucket, sharded in-memory storage, and asynchronous batch reports.</p><p style="text-align: justify;">The dependency chain is worth understanding. 
A token bucket needs cheap compare-and-set semantics, which rules out Redis at the QPS Databricks sees and therefore forces in-memory state. In-memory state across many counters forces sharding, because one server cannot hold everything. Sharding with authoritative per-key ownership is what enables batch-reporting, because each shard can act as the source of truth for its keys without coordinating with peers.</p><p style="text-align: justify;">These constraints explain the order of the rollout. Sharded in-memory state came first, batch-reporting followed on top of it, and the token bucket replaced the fixed window once the state architecture could support it.</p><h2 style="text-align: justify;">Conclusion</h2><p style="text-align: justify;">The end result is a rate limiter that is faster, more resilient, and more scalable than what came before, at the cost of strict accuracy. That tradeoff is the foundation of the design.</p><p style="text-align: justify;">A system that has to enforce limits exactly, because each request over the limit costs real money or violates a contract, would have to pick a different architecture. Databricks could afford this one because its backends tolerate roughly 5 percent overshoot.</p><p style="text-align: justify;">A few smaller details from the rollout are worth noting. The team built a localhost sidecar next to the Envoy ingress to host the batch-reporting logic, because Envoy is third-party code they could not change directly. Before in-memory counting was ready, a Lua script on Redis batched writes together to keep batch-reporting latency manageable during the migration.</p><p style="text-align: justify;">The rebuild reframes rate limiting as a system problem.</p><p style="text-align: justify;">The algorithm tends to get the attention, but the storage and sync model determine whether the algorithm can run at scale. 
Distributed counting is a single design problem with three coupled aspects rather than three independent ones.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p style="text-align: justify;"><a href="https://www.databricks.com/blog/high-performance-ratelimiting-databricks">High Performance Ratelimiting at Databricks</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[How Figma Upgraded Data Pipeline from Multi-Day Latency to Real-Time]]></title><description><![CDATA[In this article, we will learn what happened as Figma grew and how its engineering team handled the growth in terms of the data pipeline issues.]]></description><link>https://blog.bytebytego.com/p/how-figma-upgraded-data-pipeline</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-figma-upgraded-data-pipeline</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Tue, 12 May 2026 15:31:03 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!AWAB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/Coderabbit_051226">Harness engineering for agentic code review (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/Coderabbit_051226" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!COML!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 424w, 
https://substackcdn.com/image/fetch/$s_!COML!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 848w, https://substackcdn.com/image/fetch/$s_!COML!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 1272w, https://substackcdn.com/image/fetch/$s_!COML!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!COML!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png" width="1456" height="1699" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/093afca2-a742-49e0-a152-22a85a661568_3240x3780.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1699,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:618537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/Coderabbit_051226&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/193057411?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!COML!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 424w, https://substackcdn.com/image/fetch/$s_!COML!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 848w, https://substackcdn.com/image/fetch/$s_!COML!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 1272w, https://substackcdn.com/image/fetch/$s_!COML!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F093afca2-a742-49e0-a152-22a85a661568_3240x3780.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Underneath every review sits a purpose-built, independent context engine. It&#8217;s the layer that decides what the agent actually sees before a single token of generation happens.</p><p>Purpose-built because code review demands a different context than chat or autocomplete: placing the relevant context fragments assembled for each review.</p><p>The engine assembles inputs across four planes:</p><ul><li><p>Sandbox. Cloned repo, dependency analysis, multi-repo context and linters/SAST (ESLint, Semgrep) running on the change.</p></li><li><p>Review instructions. Your coding guidelines, AGENTS.md, path, and AST-scoped rules, tone, and learnings from past reviews.</p></li><li><p>Integrations. MCP tools, issue trackers (Jira, Linear), CI/CD failures, and web search.</p></li><li><p>LLMs. Routing across OpenAI and Anthropic.</p></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/Coderabbit_051226&quot;,&quot;text&quot;:&quot;Try For Free&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/Coderabbit_051226"><span>Try For Free</span></a></p><div><hr></div><p style="text-align: justify;">In 2020, Figma&#8217;s data synchronization architecture was about five lines of logic. 
A cron job ran once a day, queried every row from a database table, dumped it into S3, and loaded it into Snowflake.</p><p style="text-align: justify;">It was straightforward, easy to reason about, and it worked.</p><p style="text-align: justify;">Three years later, that same simplicity was costing Figma millions of dollars a year and leaving its analytics team looking at data that was already days old by the time they could query it.</p><p style="text-align: justify;">For reference, Figma is a collaborative design platform where teams create, prototype, and iterate on user interfaces together in real time. If you&#8217;ve used a modern app or website, there&#8217;s a good chance the screens were designed in Figma or that Figma was part of the workflow.</p><p style="text-align: justify;">Since its early days, the product has expanded well beyond its core design tool. FigJam added collaborative whiteboarding in 2021. Dev Mode launched in 2023 to bridge the gap between designers and developers. Figma Make brought AI-powered app prototyping into the mix. The company also localized for the Brazilian, Japanese, Spanish, and Korean markets.</p><p style="text-align: justify;">All of that growth meant an explosion in the volume and complexity of data flowing through Figma&#8217;s systems every day.</p><p style="text-align: justify;">In this article, we will learn what happened to the data pipeline as Figma grew and how its engineering team rebuilt it to keep up.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Figma Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2>When SELECT * Becomes Your Bottleneck</h2><p style="text-align: justify;">Figma&#8217;s original data pipeline did what&#8217;s called a full sync. Every run copied the entire contents of a database table, regardless of how much had actually changed since the last run. 
If a table had ten million rows and only fifty changed that day, the pipeline still copied all ten million. When tables are small, this is fast and cheap.</p><p style="text-align: justify;">To start with, Figma&#8217;s production databases were hosted on Amazon RDS PostgreSQL and served live user traffic. Every time someone opens a file, saves a change, or loads a project, those databases handle the request. Running heavy analytical queries on these same databases, things like computing company-wide KPIs or analyzing usage trends across millions of users, would compete with live traffic and slow down the product. So like most companies at this scale, Figma maintains a separate analytics warehouse in Snowflake, a database built specifically for these kinds of large, complex queries. The catch is that data has to get from one to the other. That transfer is the synchronization pipeline.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AWAB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AWAB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 424w, https://substackcdn.com/image/fetch/$s_!AWAB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 848w, 
https://substackcdn.com/image/fetch/$s_!AWAB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 1272w, https://substackcdn.com/image/fetch/$s_!AWAB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AWAB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png" width="1456" height="936" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:936,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:118020,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/193057411?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AWAB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 424w, 
https://substackcdn.com/image/fetch/$s_!AWAB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 848w, https://substackcdn.com/image/fetch/$s_!AWAB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 1272w, https://substackcdn.com/image/fetch/$s_!AWAB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F48641814-0011-44b5-81e5-d031311c05d0_1938x1246.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">But Figma&#8217;s tables didn&#8217;t stay small.</p><p style="text-align: justify;">As mentioned, between 2021 and 2025, they launched FigJam, Dev Mode, Figma Make, and expanded localization to serve the Brazilian, Japanese, Spanish, and Korean markets. The user base grew rapidly, and so did the data.</p><p style="text-align: justify;">By 2023, daily synchronization tasks were taking around six hours to complete. The largest tables took several days. To make things worse, the pipeline required dedicated database replicas just to handle the export load without affecting production traffic. Those replicas alone cost millions of dollars annually.</p><p style="text-align: justify;">Figma evaluated three options to handle this:</p><ul><li><p style="text-align: justify;">They could keep the existing system, but sync delays and replica costs made that untenable.</p></li><li><p style="text-align: justify;">They could add parallelism to speed up the full copies, but this was a band-aid that wouldn&#8217;t scale as tables continued to grow.</p></li><li><p style="text-align: justify;">Or they could overhaul the pipeline entirely.</p></li></ul><p style="text-align: justify;">They chose the overhaul, committing to incremental synchronization. Instead of copying entire tables every run, they&#8217;d capture only what changed and apply those changes to the destination. The concept is simple, but the execution is not.</p><h2 style="text-align: justify;">Incremental Synchronization</h2><p style="text-align: justify;">Incremental synchronization flips the model. Rather than asking &#8220;what does the whole table look like right now?&#8221; it asks &#8220;what changed since last time?&#8221; Only the inserts, updates, and deletes since the last sync get transferred and applied. 
For a table with ten million rows where fifty changed, you&#8217;re now moving fifty rows instead of ten million.</p><p style="text-align: justify;">The mechanism that makes this possible is called Change Data Capture, or CDC. PostgreSQL, like most relational databases, already keeps an internal log of every write operation, known as the write-ahead log (WAL), for its own crash-recovery purposes. CDC reads that log and converts it into a stream of change events. Because it piggybacks on bookkeeping the database is already doing, it puts far less load on the database than repeatedly querying the tables.</p><p style="text-align: justify;">The diagram below shows how CDC works at a high level:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aa_c!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aa_c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 424w, https://substackcdn.com/image/fetch/$s_!aa_c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 848w, https://substackcdn.com/image/fetch/$s_!aa_c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 1272w, https://substackcdn.com/image/fetch/$s_!aa_c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!aa_c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png" width="1456" height="1795" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1795,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:231950,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/193057411?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aa_c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 424w, https://substackcdn.com/image/fetch/$s_!aa_c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 848w, https://substackcdn.com/image/fetch/$s_!aa_c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 1272w, https://substackcdn.com/image/fetch/$s_!aa_c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5070bf1f-f0ea-414f-9d35-1276cefc933d_2086x2572.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Those change events need somewhere to go. Figma uses Kafka, a distributed streaming platform that acts as a buffer between the production database and Snowflake.</p><p style="text-align: justify;">As CDC captures changes, it publishes them to Kafka topics, one topic per table. Snowflake then consumes from those topics at its own pace. This decoupling ensures that the production database doesn&#8217;t need to know or care whether Snowflake is online, busy, or behind. It just writes events to Kafka, and Kafka holds onto them until the consumer is ready. 
If Snowflake goes down for maintenance, no data is lost. The events queue up in Kafka and get processed once Snowflake comes back.</p><p style="text-align: justify;">One point to note, however, is that the stream only captures changes from the moment you start listening. It doesn&#8217;t contain the full history of the table. So on day one, the destination database is empty, and the change stream only knows about changes happening right now. The pipeline needs a starting point.</p><p style="text-align: justify;">That starting point is a snapshot. In this approach, we take a full copy of the table at a specific moment in time, then apply changes starting from just before that moment. Here&#8217;s why the timing matters. Say Figma kicks off a snapshot export at 2:00 AM, and the export takes two hours to complete. During those two hours, users are still active. Records are being created, updated, and deleted. The snapshot finishes at 4:00 AM, but it only reflects the state of the table as of 2:00 AM. If the change stream starts capturing events at 4:00 AM, every change between 2:00 and 4:00 AM is lost. The destination table will be missing two hours of data, with no error to flag the gap. To avoid this, Figma ensures the Kafka CDC stream&#8217;s start offset precedes the snapshot timestamp. That overlap means some events will be duplicates of what&#8217;s already in the snapshot, but duplicates can be handled during the merge step. Missing data cannot.</p><p style="text-align: justify;">Figma also had to decide whether to buy an off-the-shelf solution or build its own setup. 
They evaluated vendor options seriously and found three problems:</p><ul><li><p style="text-align: justify;">Generic CDC tools couldn&#8217;t leverage Amazon RDS-specific APIs, like the ability to export snapshots directly to S3 without maintaining a separate database replica.</p></li><li><p style="text-align: justify;">Vendor pricing at Figma&#8217;s scale came out to five to ten times more than an in-house build.</p></li><li><p style="text-align: justify;">The tools they evaluated couldn&#8217;t reliably handle Figma&#8217;s data volume, which was still growing.</p></li></ul><p style="text-align: justify;">Therefore, they assembled their pipeline from lower-level components:</p><ul><li><p style="text-align: justify;">Amazon RDS handles snapshot exports to S3.</p></li><li><p style="text-align: justify;">Kafka streams the CDC events.</p></li><li><p style="text-align: justify;">Snowflake stored procedures perform the incremental merge, in other words, applying the stream of changes to bring the destination tables up to date.</p></li><li><p style="text-align: justify;">Merge jobs run on a configurable schedule, defaulting to every three hours.</p></li></ul><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aeFj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!aeFj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 424w, 
https://substackcdn.com/image/fetch/$s_!aeFj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 848w, https://substackcdn.com/image/fetch/$s_!aeFj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 1272w, https://substackcdn.com/image/fetch/$s_!aeFj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aeFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png" width="1456" height="808" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:808,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:151537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/193057411?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!aeFj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 424w, https://substackcdn.com/image/fetch/$s_!aeFj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 848w, https://substackcdn.com/image/fetch/$s_!aeFj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 1272w, https://substackcdn.com/image/fetch/$s_!aeFj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85e21def-74c8-411a-a7a1-c4b35ed6b178_2404x1334.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">That three-hour default is a deliberate design choice, not a limitation. More frequent merges mean fresher data but higher Snowflake compute costs. Figma lets teams override the default where it matters. Their billing pipeline, for example, runs on 30-minute merge cycles. Each team pays only for the freshness they actually need.</p><h2 style="text-align: justify;">Trust But Verify</h2><p style="text-align: justify;">Building the pipeline is half the job. The other half is knowing whether it&#8217;s actually working correctly.</p><p style="text-align: justify;">Data pipelines can fail in multiple ways that don&#8217;t produce errors. For example:</p><ul><li><p style="text-align: justify;">A partial failure during a snapshot export</p></li><li><p style="text-align: justify;">A misconfigured CDC connector</p></li><li><p style="text-align: justify;">An unexpected data format from the source</p></li></ul><p style="text-align: justify;">These issues don&#8217;t crash the pipeline. They just produce wrong data. And wrong data in an analytics warehouse leads to wrong KPIs, wrong business decisions, and a slow erosion of trust in the entire data platform.</p><p style="text-align: justify;">Figma&#8217;s answer to this problem is rigorous. They built a dedicated validation workflow that clones the live base table, runs the entire bootstrap process independently into a temporary schema, aligns both copies to the same point in time using CDC data, and then compares them cell by cell. This runs weekly for every table in the pipeline. Most teams settle for row-count checks or sampling. 
Figma holds its analytical warehouse to the same correctness standards you&#8217;d expect of a production database.</p><p style="text-align: justify;">The reason this approach works is independence. If a bug exists somewhere in the main pipeline, say a CDC connector silently drops certain types of update events, any validation that reuses the same pipeline path would inherit the same bug. The corrupted data would match the corrupted check, and everything would look fine. By bootstrapping a completely separate copy through an independent process and comparing the two, Figma ensures that an error confined to one path surfaces as a mismatch instead of passing silently.</p><p style="text-align: justify;">Figma also built a zero-downtime re-bootstrap capability by versioning all bootstrap artifacts except the final user-facing view. When schemas evolve, or a full re-bootstrap is needed, the new version is built in parallel and promoted via an atomic view update. Live queries are never disrupted.</p><p style="text-align: justify;">The other piece that holds it all together is automation. Figma structured its automation into two tiers:</p><ul><li><p style="text-align: justify;">First-level automations handle the actual execution. You give them a table name, and they run the bootstrap or validation and alert if something goes wrong.</p></li><li><p style="text-align: justify;">Second-level automations sit above and decide when to trigger the first tier. A controller workflow checks every few hours for new tables that need onboarding. 
A dispatcher workflow kicks off validation for each table weekly.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cubB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cubB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 424w, https://substackcdn.com/image/fetch/$s_!cubB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 848w, https://substackcdn.com/image/fetch/$s_!cubB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 1272w, https://substackcdn.com/image/fetch/$s_!cubB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cubB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png" width="1456" height="999" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:999,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:238200,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/193057411?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cubB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 424w, https://substackcdn.com/image/fetch/$s_!cubB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 848w, https://substackcdn.com/image/fetch/$s_!cubB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 1272w, https://substackcdn.com/image/fetch/$s_!cubB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4cae3d9f-2766-4de2-b45b-397ae875d137_2944x2020.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The system is largely self-operating, and developers get involved only when alerts fire.</p><p style="text-align: justify;">The payoff came early. One week into testing in their staging environment, the automated re-bootstrap routine caught a severe failure mode. Had it reached production, it would have triggered a site-wide outage lasting at least twenty minutes. 
The old system, a daily cron job with no automated validation, could never have caught this.</p><h2 style="text-align: justify;">Conclusion</h2><p style="text-align: justify;">The numbers tell the story:</p><ul><li><p style="text-align: justify;">Data freshness went from 30-plus hours to under three hours, with the option to configure it down to minutes.</p></li><li><p style="text-align: justify;">The pipeline handles tables over ten times larger than the old system could manage, with consistent and predictable performance.</p></li><li><p style="text-align: justify;">Eliminating the dedicated export replicas produced multimillion-dollar annual savings.</p></li><li><p style="text-align: justify;">Operations have seen zero major incidents during and after launch.</p></li></ul><p style="text-align: justify;">Beyond raw performance, the rebuild unlocked new capabilities.</p><p style="text-align: justify;">For example, a sync-on-demand CLI tool lets developers trigger immediate synchronization outside the regular schedule. CDC data is now exposed to end users, so developers can query the full change history of any entity rather than just its current state. During incident response, this means questions like &#8220;show me every change to this user&#8217;s record in the last 48 hours&#8221; get answered in minutes.</p><p style="text-align: justify;">However, this project took a significant investment of time, effort, and resources.</p><p style="text-align: justify;">The new system is much more complex than a daily cron job. Figma compensated with aggressive automation and validation, but that tooling is itself additional complexity layered on top. 
Ultimately, this tradeoff was worth it at Figma&#8217;s scale.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p><a href="https://www.figma.com/blog/figmas-data-pipeline-upgrade/">From multi-day latency to near real-time insights: Figma&#8217;s Data Pipeline Upgrade</a></p></li><li><p><a href="https://en.wikipedia.org/wiki/Change_data_capture">Change Data Capture</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[How Pinterest Built a Production MCP Ecosystem]]></title><description><![CDATA[In this article, we look at how Pinterest designed that ecosystem and what they had to get right beyond the protocol itself.]]></description><link>https://blog.bytebytego.com/p/how-pinterest-built-a-production</link><guid isPermaLink="false">https://blog.bytebytego.com/p/how-pinterest-built-a-production</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Mon, 11 May 2026 15:31:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!qn5M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/WorkOS_051126Headline">Agents need context. Ship the integrations that give it to them. 
(Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/WorkOS_051126CTA" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eP8c!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 424w, https://substackcdn.com/image/fetch/$s_!eP8c!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 848w, https://substackcdn.com/image/fetch/$s_!eP8c!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 1272w, https://substackcdn.com/image/fetch/$s_!eP8c!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!eP8c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png" width="1200" height="620" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:620,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:143695,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/WorkOS_051126CTA&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eP8c!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 424w, https://substackcdn.com/image/fetch/$s_!eP8c!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 848w, https://substackcdn.com/image/fetch/$s_!eP8c!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 1272w, https://substackcdn.com/image/fetch/$s_!eP8c!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F208f5b10-4166-4384-97ca-0b6979c095e3_1200x620.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">The context that actually matters isn't in your database. It's in the tools your users live in every day. Multi-stage agents stall the moment they hit a step they can't see. And every missing integration is a different OAuth flow, a different token lifecycle, weeks of plumbing before the agent reads a single record.<br><br><a href="https://go.bytebytego.com/WorkOS_051126Pipes">WorkOS Pipes</a> connects your agent to the tools your users live in. Pre-built connectors for GitHub, Slack, Salesforce, Google Drive, and more. Pipes handles OAuth, token refresh, and credential storage. You call the real provider API with a fresh token, every time. 
Your agent pulls context at every step, for as long as the task runs.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/WorkOS_051126CTA&quot;,&quot;text&quot;:&quot;Give your agent context &#8594;&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/WorkOS_051126CTA"><span>Give your agent context &#8594;</span></a></p><div><hr></div><p style="text-align: justify;">Engineers at Pinterest work across a sprawling set of internal systems every day. They query data through Presto, debug batch jobs in Spark, manage workflows in Airflow, search internal documentation, and track bugs in ticketing platforms.</p><p style="text-align: justify;">When Pinterest started building AI agents, they wanted those agents to do more than answer questions. They wanted agents that could reach into these systems directly, pulling logs, investigating bug tickets, querying databases, and proposing fixes, all within the surfaces engineers already use.</p><p style="text-align: justify;">The challenge came down to simple arithmetic. With five AI-powered surfaces (an internal chat app, IDE plugins, chatbots, CLI agents, and other autonomous agents) and ten internal tools, you&#8217;d need fifty bespoke integrations if there were no shared protocol. In other words, every new surface or tool multiplies the work.</p><p style="text-align: justify;">The Model Context Protocol (MCP) promised to collapse that multiplication into addition. Build one MCP client per surface and one MCP server per tool, and they all speak the same language.</p><p style="text-align: justify;">Pinterest adopted MCP as the foundation for this vision. However, implementing the protocol turned out to be the easy part. 
The real engineering effort went into everything around it, such as a central registry, a two-layer auth system, a unified deployment pipeline, and observability baked in from day one.</p><p style="text-align: justify;">In this article, we look at how Pinterest designed that ecosystem and what they had to get right beyond the protocol itself.</p><p style="text-align: justify;"><em>Disclaimer: This post is based on publicly shared details from the Pinterest Engineering Team. Please comment if you notice any inaccuracies.</em></p><h2 style="text-align: justify;">What is MCP</h2><p style="text-align: justify;">Model Context Protocol (MCP) is an open-source standard that gives large language models a unified way to talk to external tools and data sources.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qn5M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qn5M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 424w, https://substackcdn.com/image/fetch/$s_!qn5M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 848w, https://substackcdn.com/image/fetch/$s_!qn5M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 1272w, 
https://substackcdn.com/image/fetch/$s_!qn5M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qn5M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png" width="1456" height="859" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:859,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:350061,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qn5M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 424w, https://substackcdn.com/image/fetch/$s_!qn5M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 848w, 
https://substackcdn.com/image/fetch/$s_!qn5M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 1272w, https://substackcdn.com/image/fetch/$s_!qn5M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0acd8c0-dcbb-4462-914c-207889b0bd28_2576x1520.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Instead of writing custom glue code between every AI application and every tool it needs to access, MCP 
defines a shared client-server protocol. An AI surface acts as the client, an MCP server wraps a tool or data source, and they communicate using a standardized format for discovering tools, invoking them, and returning structured results.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!n9Ei!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!n9Ei!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 424w, https://substackcdn.com/image/fetch/$s_!n9Ei!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 848w, https://substackcdn.com/image/fetch/$s_!n9Ei!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!n9Ei!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!n9Ei!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png" width="1456" height="942" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:942,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131780,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!n9Ei!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 424w, https://substackcdn.com/image/fetch/$s_!n9Ei!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 848w, https://substackcdn.com/image/fetch/$s_!n9Ei!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!n9Ei!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F333c5263-12c0-4e4d-9b13-8d19f9e97055_2474x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Before MCP, connecting AI surfaces to internal tools was an N &#215; M problem. Five surfaces times ten tools equals fifty custom integrations to build and maintain. MCP turns that into an N + M problem. You build five clients and ten servers, and any client can talk to any server. 
That is fifteen pieces of work instead of fifty, and the gap widens as you add more surfaces or tools.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AUtX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AUtX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 424w, https://substackcdn.com/image/fetch/$s_!AUtX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 848w, https://substackcdn.com/image/fetch/$s_!AUtX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 1272w, https://substackcdn.com/image/fetch/$s_!AUtX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AUtX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png" width="1456" height="815" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:815,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:260922,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!AUtX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 424w, https://substackcdn.com/image/fetch/$s_!AUtX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 848w, https://substackcdn.com/image/fetch/$s_!AUtX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 1272w, https://substackcdn.com/image/fetch/$s_!AUtX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd0d5cf78-b1ae-46f3-91d5-e310ad8fa76b_2998x1678.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">But MCP only defines the communication protocol. It does not handle authentication, authorization, deployment, service discovery, or governance.</p><p style="text-align: justify;">Those are the problems Pinterest had to solve on its own. In other words, the MCP spec provides the grammar, and Pinterest had to build the entire school system around it.</p><h2 style="text-align: justify;">Pinterest&#8217;s Three Architectural Bets</h2><p style="text-align: justify;">When Pinterest decided to adopt MCP, three early decisions shaped the entire ecosystem. 
Each involved a genuine tradeoff, and understanding those tradeoffs helps us make sense of why the architecture looks the way it does.</p><p style="text-align: justify;">See the diagram below that shows the overall architecture:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4mQ3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4mQ3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 424w, https://substackcdn.com/image/fetch/$s_!4mQ3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 848w, https://substackcdn.com/image/fetch/$s_!4mQ3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 1272w, https://substackcdn.com/image/fetch/$s_!4mQ3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4mQ3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png" width="1456" height="877" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:877,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:273441,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4mQ3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 424w, https://substackcdn.com/image/fetch/$s_!4mQ3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 848w, https://substackcdn.com/image/fetch/$s_!4mQ3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 1272w, https://substackcdn.com/image/fetch/$s_!4mQ3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbcdfc21a-411f-4eea-ab98-31984a2dbb80_3336x2010.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3 style="text-align: justify;">Bet 1: Cloud-hosted servers, not local ones.</h3><p style="text-align: justify;">MCP supports local servers that run on a developer&#8217;s laptop and communicate over standard input/output. Many individual developers use MCP this way with tools like Claude or Cursor.</p><p style="text-align: justify;">Pinterest went the opposite direction.</p><p style="text-align: justify;">They explicitly optimized for internal cloud-hosted MCP servers, where their routing and security infrastructure could be applied consistently. Local servers are still allowed for experimentation, but the so-called paved path at Pinterest is to write a server, deploy it to their cloud compute environment, and register it in their central catalog. 
Every tool call becomes a network request, which adds latency compared to a local server.</p><p style="text-align: justify;">However, centralizing servers in the cloud meant that Pinterest could apply consistent authentication, authorization, logging, and monitoring across every server without relying on individual developers to configure those things correctly on their own machines.</p><h3 style="text-align: justify;">Bet 2: Many small servers, not one giant one</h3><p style="text-align: justify;">Pinterest debated building a single monolithic MCP server that exposed every tool versus building multiple domain-specific servers. They chose the latter.</p><p style="text-align: justify;">For example, the Presto MCP server handles data queries. The Spark MCP server handles job debugging. The Knowledge MCP server handles documentation and institutional Q&amp;A. Each server owns a small, coherent set of tools.</p><p style="text-align: justify;">Two forces drove this decision.</p><ul><li><p style="text-align: justify;">First, different servers need different access controls. The Presto server touches sensitive business data and requires strict group-based gating. A documentation server is lower risk and can be more broadly accessible. Bundling them into one server would force a single access policy across tools with very different sensitivity levels.</p></li><li><p style="text-align: justify;">Second, every tool description consumes tokens in the AI model&#8217;s context window, which is the limited amount of text the model can process in a single interaction. A monolithic server with fifty tools would stuff the model&#8217;s prompt with tool descriptions it does not need for the current task, crowding out space for the actual conversation.</p></li></ul><p style="text-align: justify;">Domain-specific servers keep the tool list small and relevant. 
This context window constraint is specific to AI systems. In a traditional microservices setup, you would not worry about your service catalog consuming tokens.</p><p style="text-align: justify;">The tradeoff here is more operational overhead per server, since each one needs deployment, monitoring, and ownership. This cost led directly to the third bet.</p><h3 style="text-align: justify;">Bet 3: A unified deployment pipeline</h3><p style="text-align: justify;">Early feedback from teams was clear. Spinning up a new MCP server required too much boilerplate, including deployment pipelines, service configuration, and operational setup, all before anyone could write a single line of business logic.</p><p style="text-align: justify;">The Pinterest engineering team responded by building a unified deployment pipeline. Teams define their tools, and the platform handles deployment, scaling, and infrastructure. This turned a multi-day setup process into one where domain experts could focus entirely on their business logic. Without this investment, the bet around many small servers would have collapsed under its own operational weight.</p><p style="text-align: justify;">Sitting beneath all of this is the MCP registry, a central catalog that serves as the source of truth for which servers exist, who owns them, and how to connect to them. It has two faces.</p><ul><li><p style="text-align: justify;">A web UI lets humans browse available servers, see their live status, find the owning team and support channels, and inspect visible tools.</p></li><li><p style="text-align: justify;">An API lets AI clients programmatically discover servers, validate them, and check whether a given user is authorized to access a given server.</p></li></ul><p style="text-align: justify;">Only servers registered here count as approved for production use. 
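</p><p style="text-align: justify;">To make the API face of the registry more concrete, here is a minimal in-memory sketch in Python. It is only an illustration: the ServerEntry fields, the Registry class, its method names, the group names, and the example URLs are all hypothetical, since the source describes what the registry does and not its schema or endpoints.</p>

```python
from dataclasses import dataclass, field


@dataclass
class ServerEntry:
    """One registered MCP server (hypothetical schema)."""
    name: str
    owner_team: str
    url: str
    allowed_groups: set = field(default_factory=set)


class Registry:
    """Toy stand-in for a central catalog; only registered servers are usable."""

    def __init__(self):
        self._servers = {}

    def register(self, entry):
        self._servers[entry.name] = entry

    def is_authorized(self, server_name, user_groups):
        entry = self._servers.get(server_name)
        # Unregistered servers are never approved, whoever is asking.
        if entry is None:
            return False
        return bool(entry.allowed_groups.intersection(user_groups))

    def discover(self, user_groups):
        """Return the names of servers this user may connect to, sorted."""
        return sorted(n for n in self._servers if self.is_authorized(n, user_groups))


# Example data; names and URLs are made up for the sketch.
registry = Registry()
registry.register(ServerEntry("presto", "data-platform", "https://mcp.example.internal/presto", {"ads", "finance"}))
registry.register(ServerEntry("knowledge", "docs", "https://mcp.example.internal/knowledge", {"all-employees"}))
```

<p style="text-align: justify;">In this sketch, discovery and authorization share one lookup, so a server that was never registered cannot show up in any client, regardless of the caller's groups.</p><p style="text-align: justify;">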
In other words, the registry is not just a phone book, but the governance backbone of the entire ecosystem.</p><h2 style="text-align: justify;">Two Layers of Auth</h2><p style="text-align: justify;">Giving AI agents access to tools that touch real production systems and sensitive data raises immediate security concerns.</p><p style="text-align: justify;">Pinterest treated MCP as a joint project with their security team from day one, and the result is a two-layer authorization model that deserves careful attention.</p><p style="text-align: justify;">See the diagram below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c0I9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c0I9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 424w, https://substackcdn.com/image/fetch/$s_!c0I9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 848w, https://substackcdn.com/image/fetch/$s_!c0I9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 1272w, https://substackcdn.com/image/fetch/$s_!c0I9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!c0I9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png" width="1456" height="825" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:825,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:174328,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c0I9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 424w, https://substackcdn.com/image/fetch/$s_!c0I9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 848w, https://substackcdn.com/image/fetch/$s_!c0I9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 1272w, https://substackcdn.com/image/fetch/$s_!c0I9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33838bcb-4b89-4899-a9fa-192e8920a4f9_2498x1416.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Consider what happens when an engineer opens Pinterest&#8217;s internal AI chat and asks the agent to query revenue data from the data warehouse. 
That single request crosses multiple systems:</p><ul><li><p style="text-align: justify;">The chat frontend talks to the MCP registry to find available servers.</p></li><li><p style="text-align: justify;">The request gets routed to the Presto MCP server, which runs a query against a real database containing business-sensitive information.</p></li><li><p style="text-align: justify;">At every hop, the system needs to answer two questions. Who is this person? And are they allowed to do this specific thing?</p></li></ul><p style="text-align: justify;">Layer 1 handles coarse-grained checks at the network edge.</p><p style="text-align: justify;">When an engineer opens any AI surface at Pinterest, they go through an OAuth flow, which is the standard process for logging in with a company account and granting the application permission to act on the user&#8217;s behalf. This produces a JWT (JSON Web Token), a small signed token that encodes the user&#8217;s identity and group memberships. 
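</p><p style="text-align: justify;">For readers who have not looked inside a JWT, the following Python sketch shows the general idea of a signed token that carries identity and group claims. It is a simplified illustration using only the standard library and a shared HMAC secret (HS256). The claim name groups, the secret, and the function names are assumptions for the example; a production setup would normally use a vetted JWT library, typically with asymmetric keys, and nothing here reflects Pinterest's actual token format.</p>

```python
import base64
import hashlib
import hmac
import json
import time

SECRET = b"demo-only-secret"  # hypothetical key; never hard-code real secrets


def b64url(data):
    """Base64url-encode bytes without padding, as JWTs do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    """Decode base64url text, restoring the padding JWTs strip."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(header_and_payload):
    """HMAC-SHA256 over the 'header.payload' string, base64url-encoded."""
    digest = hmac.new(SECRET, header_and_payload.encode("ascii"), hashlib.sha256).digest()
    return b64url(digest)


def issue_token(user, groups, ttl_seconds=3600):
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    # "sub" and "exp" are registered JWT claims; "groups" is an assumed custom claim.
    claims = {"sub": user, "groups": groups, "exp": int(time.time()) + ttl_seconds}
    payload = b64url(json.dumps(claims).encode())
    return f"{header}.{payload}.{sign(header + '.' + payload)}"


def verify_token(token):
    """Return the claims if the signature matches and the token is unexpired."""
    header, payload, signature = token.split(".")
    if not hmac.compare_digest(signature, sign(header + "." + payload)):
        raise ValueError("bad signature")
    claims = json.loads(b64url_decode(payload))
    if time.time() >= claims["exp"]:
        raise ValueError("token expired")
    return claims
```

<p style="text-align: justify;">Because the signature covers the payload, editing the group list after the token is issued makes verification fail, which is what lets downstream services trust the claims without calling back to the login system.</p><p style="text-align: justify;">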
That JWT travels with every subsequent request.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1586!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1586!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 424w, https://substackcdn.com/image/fetch/$s_!1586!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 848w, https://substackcdn.com/image/fetch/$s_!1586!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 1272w, https://substackcdn.com/image/fetch/$s_!1586!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1586!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png" width="1456" height="1054" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1054,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:85140,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1586!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 424w, https://substackcdn.com/image/fetch/$s_!1586!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 848w, https://substackcdn.com/image/fetch/$s_!1586!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 1272w, https://substackcdn.com/image/fetch/$s_!1586!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde99cc29-698e-4c6e-aeab-206f5ef376b4_1722x1246.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p style="text-align: justify;">Before a request reaches any MCP server, it passes through Envoy, a network proxy that sits in front of every service in Pinterest&#8217;s infrastructure. Envoy validates the JWT by checking the signature and expiration, then converts it into standard headers like X-Forwarded-User and X-Forwarded-Groups.</p><p style="text-align: justify;">Envoy also enforces coarse-grained access policies. These are broad rules like &#8220;the production AI chat application may talk to the Presto MCP server, but experimental servers running in the dev namespace are off-limits.&#8221; If the request violates these rules, it gets rejected before the MCP server ever sees it. Think of Envoy as the building security desk. 
It checks your badge and makes sure you are supposed to be in the building at all.</p><p style="text-align: justify;">Layer 2 handles fine-grained checks inside each server.</p><p style="text-align: justify;">Even if Envoy lets a request through, the MCP server applies a second layer of authorization at the individual tool level. Pinterest uses a decorator pattern on tool functions (@authorize_tool(policy='...')) that checks whether the specific user is allowed to invoke that specific tool.</p><p style="text-align: justify;">For example, the Presto MCP server might be reachable by many teams, but only the Ads engineering group can call a tool like get_revenue_metrics. This is like the difference between being allowed into the building and being allowed into a specific room.</p><p style="text-align: justify;">For servers that handle particularly sensitive data, Pinterest adds business-group gating. The server extracts the user&#8217;s business group membership from their JWT and checks it against an approved list before even establishing a session. This list of approved groups is set during the initial security review when the server is first registered.</p><p style="text-align: justify;">For example, even though the Presto MCP server is technically reachable from Pinterest&#8217;s broadly used AI chat interface, only specific groups like Ads, Finance, or certain infrastructure teams can actually connect and run queries. This means that turning on a powerful, data-heavy server in a popular surface does not silently expand who can see sensitive data.</p><p style="text-align: justify;">Why two layers instead of one?</p><p style="text-align: justify;">Envoy&#8217;s policies are fast, network-level checks that block obviously unauthorized traffic before it reaches any application code. The tool-level decorators handle nuanced, business-logic-specific permissions that a network proxy is not equipped to reason about. Together, they provide defense in depth. 
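</p><p style="text-align: justify;">The tool-level check described above can be sketched in Python as follows. Only the decorator name authorize_tool, the tool name get_revenue_metrics, and the X-Forwarded-User and X-Forwarded-Groups headers come from the article. The policy table, the ads-revenue-read policy name, the group names, the RequestContext class, and the comma-separated header format are assumptions made for illustration, not Pinterest's implementation.</p>

```python
from dataclasses import dataclass
from functools import wraps

# Hypothetical policy table: policy name mapped to the groups allowed to use it.
POLICIES = {"ads-revenue-read": {"ads-eng", "finance"}}


@dataclass(frozen=True)
class RequestContext:
    """Identity forwarded by the proxy after it validated the JWT (assumed shape)."""
    user: str
    groups: frozenset

    @classmethod
    def from_headers(cls, headers):
        # Assumes groups arrive as a comma-separated header value.
        raw = headers.get("X-Forwarded-Groups", "")
        groups = frozenset(g.strip() for g in raw.split(",") if g.strip())
        return cls(user=headers.get("X-Forwarded-User", ""), groups=groups)


def authorize_tool(policy):
    """Reject the call unless the caller belongs to a group the policy allows."""
    def decorator(func):
        @wraps(func)
        def wrapper(ctx, *args, **kwargs):
            # An unknown policy yields an empty set, so the check fails closed.
            allowed = POLICIES.get(policy, set())
            if not ctx.user or ctx.groups.isdisjoint(allowed):
                raise PermissionError(f"{ctx.user or 'anonymous'} may not call {func.__name__}")
            return func(ctx, *args, **kwargs)
        return wrapper
    return decorator


@authorize_tool(policy="ads-revenue-read")
def get_revenue_metrics(ctx, quarter):
    # Placeholder body; a real tool would run a query here.
    return {"quarter": quarter, "requested_by": ctx.user}
```

<p style="text-align: justify;">With this shape, a caller in an allowed group gets a result, while anyone else hits a PermissionError before the tool body runs. The proxy decides whether the request may reach the server at all, and the decorator decides whether this user may run this tool.</p><p style="text-align: justify;">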
Even if one layer has a misconfiguration, the other still catches unauthorized access.</p><p style="text-align: justify;">The official MCP specification defines an OAuth 2.1-based authorization flow where users authenticate with each MCP server individually, typically involving consent screens and per-server token management. Pinterest skipped this entirely. Since they control the entire internal environment, they piggyback on the auth session the user already has when they open an AI surface. There is no additional login prompt or consent dialog when a user invokes an MCP tool.</p><p style="text-align: justify;">This is simpler for end users, but only works because Pinterest owns every piece of the stack. A company relying on third-party MCP servers would likely need the per-server OAuth approach described in the spec.</p><p style="text-align: justify;">Lastly, for automated service-to-service calls where there is no human in the loop, Pinterest uses SPIFFE-based authentication. In this pattern, the calling service proves its identity through a cryptographic certificate issued by the service mesh rather than presenting a human&#8217;s JWT. Pinterest reserves this for low-risk, read-only scenarios where the blast radius is tightly constrained.</p><h2 style="text-align: justify;">Meeting Engineers Where They Already Work</h2><p style="text-align: justify;">Pinterest was deliberate about one thing. MCP could not be a science project that lived in its own separate interface. 
It had to show up in the tools that engineers already use every day.</p><p style="text-align: justify;">The diagram below shows how the MCP integration has been done across various surfaces at Pinterest.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zdKK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zdKK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 424w, https://substackcdn.com/image/fetch/$s_!zdKK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 848w, https://substackcdn.com/image/fetch/$s_!zdKK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!zdKK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zdKK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png" width="1456" height="779" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:779,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:150200,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196933670?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zdKK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 424w, https://substackcdn.com/image/fetch/$s_!zdKK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 848w, https://substackcdn.com/image/fetch/$s_!zdKK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 1272w, https://substackcdn.com/image/fetch/$s_!zdKK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F248d1662-1ada-4da5-8cec-441567c2a9e3_2310x1236.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: <a href="https://medium.com/pinterest-engineering/building-an-mcp-ecosystem-at-pinterest-d881eb4c16f1">Pinterest Engineering Blog</a></figcaption></figure></div><p style="text-align: justify;">Pinterest&#8217;s internal AI chat interface is used by the majority of employees daily. The frontend automatically handles OAuth flows and returns a list of usable MCP tools scoped to the current user&#8217;s permissions. Once connected, the AI agent binds MCP tools directly into its toolset, so invoking an MCP tool feels identical to calling any other built-in capability. 
From the user&#8217;s perspective, they are just asking the AI to do something, and the MCP plumbing is invisible.</p><p style="text-align: justify;">Pinterest also embeds AI bots in its internal communication platform, and these bots expose MCP tools as well. Auth is handled through the registry API, just like the web interface. These bots support context-aware tool scoping, meaning certain MCP tools are restricted to certain channels. Spark MCP tools, for example, only appear in Airflow support channels. This keeps tool lists relevant to the conversation and prevents users from accidentally invoking tools that do not make sense in a given context.</p><p style="text-align: justify;">AI-enabled IDEs can pull data through the Presto MCP server on demand, so agents bring data directly into coding workflows instead of requiring engineers to switch to a separate dashboard. CLI agents provide similar capabilities for terminal-based workflows.</p><p style="text-align: justify;">The servers that see the heaviest usage reflect the most common engineering pain points.</p><ul><li><p style="text-align: justify;">The Presto MCP server is consistently the highest-traffic server because data access is a universal need across teams.</p></li><li><p style="text-align: justify;">The Spark MCP server underpins Pinterest&#8217;s AI-assisted debugging experience, where agents diagnose job failures, summarize logs, and help record structured root-cause analyses, turning noisy operational threads into reusable knowledge.</p></li><li><p style="text-align: justify;">The Knowledge MCP server acts as a general-purpose endpoint for institutional knowledge, giving agents the ability to search documentation and answer questions across internal sources.</p></li></ul><p style="text-align: justify;">Since MCP servers enable automated actions, the blast radius of a mistake is larger than if a human manually performed the same steps.</p><p style="text-align: justify;">Pinterest&#8217;s agent 
guidance mandates human-in-the-loop approval before any sensitive or expensive action. Agents propose actions, and humans approve or reject them (optionally in batches) before execution. Pinterest also uses elicitation for dangerous operations, where the AI explicitly asks the user to confirm before performing something like overwriting data in a table. This is a governance decision.</p><h2 style="text-align: justify;">Measurements</h2><p style="text-align: justify;">Pinterest built observability into the MCP ecosystem from the start rather than treating it as an afterthought. All MCP servers use a set of shared library functions that provide logging for inputs and outputs, invocation counts, exception tracing, and other telemetry out of the box. This is part of the server framework itself, so teams get observability for free when they use the unified deployment pipeline.</p><p style="text-align: justify;">At the ecosystem level, Pinterest tracks the number of registered servers and tools, invocation counts across all servers, and a north-star metric that rolls everything up into a single number. That number is time saved. For each tool, server owners provide a &#8220;minutes saved per invocation&#8221; estimate, based on lightweight user feedback and comparison to the prior manual workflow. Multiplied by invocation counts, this gives an order-of-magnitude view of impact.</p><p style="text-align: justify;">As of January 2025, MCP servers at Pinterest were handling 66,000 invocations per month across 844 monthly active users. Using the owner-provided estimates, MCP tools were saving roughly 7,000 hours per month.</p><h2 style="text-align: justify;">Conclusion</h2><p style="text-align: justify;">Pinterest&#8217;s MCP ecosystem offers a clear blueprint for organizations building AI agents that need to act on internal systems. 
The pattern they established (a standard protocol, a central registry, layered auth, a unified deployment pipeline, and built-in observability) is transferable well beyond Pinterest&#8217;s specific context.</p><p style="text-align: justify;">The most important lesson is where the effort actually went. The MCP protocol gave Pinterest a shared language between AI surfaces and tools. That was necessary but far from sufficient. The registry, auth layers, deployment pipeline, and measurement framework are what turned a promising protocol into a production system handling tens of thousands of invocations per month.</p><p style="text-align: justify;">To conclude, Pinterest&#8217;s approach suggests a practical starting point: seed a small set of high-leverage MCP servers that solve real pain points, then invest in the platform work, especially the deployment pipeline, that makes it easy for other teams to build on top. Pinterest&#8217;s unified pipeline was the unlock that turned a platform team project into an org-wide ecosystem.</p><p style="text-align: justify;"><strong>References:</strong></p><ul><li><p style="text-align: justify;"><a href="https://medium.com/pinterest-engineering/building-an-mcp-ecosystem-at-pinterest-d881eb4c16f1">Building an MCP Ecosystem at Pinterest</a></p></li><li><p style="text-align: justify;"><a href="https://www.anthropic.com/news/model-context-protocol">Introducing the Model Context Protocol</a></p></li></ul>]]></content:encoded></item><item><title><![CDATA[EP214: Claude Code vs. 
OpenClaw: 5 Design Dimensions]]></title><description><![CDATA[Both are highly capable, but they have key architectural differences.]]></description><link>https://blog.bytebytego.com/p/ep214-claude-code-vs-openclaw-5-design</link><guid isPermaLink="false">https://blog.bytebytego.com/p/ep214-claude-code-vs-openclaw-5-design</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Sat, 09 May 2026 15:31:10 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!oEvb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><a href="https://go.bytebytego.com/QAWolf_050926Headline">&#9986;&#65039; Cut your QA cycles down to minutes with QA Wolf (Sponsored)</a></h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/QAWolf_050926CTA" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cC8P!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 424w, https://substackcdn.com/image/fetch/$s_!cC8P!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 848w, https://substackcdn.com/image/fetch/$s_!cC8P!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 1272w, 
https://substackcdn.com/image/fetch/$s_!cC8P!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cC8P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png" width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:137490,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/QAWolf_050926CTA&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196927802?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cC8P!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 424w, https://substackcdn.com/image/fetch/$s_!cC8P!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 848w, 
https://substackcdn.com/image/fetch/$s_!cC8P!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 1272w, https://substackcdn.com/image/fetch/$s_!cC8P!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F10dd7a90-599d-4f9d-87b0-47dce8fb8a2a_1600x840.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If slow QA processes bottleneck you or your software engineering team and you&#8217;re releasing slower because of it &#8212; 
you need to check out QA Wolf.</p><p>QA Wolf&#8217;s AI-native service <strong>supports web and mobile apps</strong>, delivering <a href="https://go.bytebytego.com/QAWolf_050926Automated">80% automated test coverage in weeks</a> and helping teams <strong>ship 5x faster</strong> by reducing QA cycles to minutes.</p><p><a href="https://go.bytebytego.com/QAWolf_050926QAWolf">QA Wolf</a> takes testing off your plate. They can get you:</p><ul><li><p>Unlimited parallel test runs for mobile and web apps</p></li><li><p>24-hour maintenance and on-demand test creation</p></li><li><p>Human-verified bug reports sent directly to your team</p></li><li><p>Zero flakes guarantee</p></li></ul><p>The benefit? No more manual E2E testing. No more slow QA cycles. No more bugs reaching production.</p><p>With QA Wolf, <a href="https://go.bytebytego.com/QAWolf_050926Drata">Drata&#8217;s team of 80+ engineers</a> achieved 4x more test cases and <strong>86% faster QA cycles</strong>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/QAWolf_050926CTA&quot;,&quot;text&quot;:&quot;Schedule a demo to learn more&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://go.bytebytego.com/QAWolf_050926CTA"><span>Schedule a demo to learn more</span></a></p><div><hr></div><p>This week&#8217;s system design refresher:</p><ul><li><p>Claude Code vs. OpenClaw: 5 Design Dimensions</p></li><li><p>Become an AI Engineer | Enrollment Ends Soon</p></li><li><p>How AI Fakes a Human in 5 Steps</p></li><li><p>How do you know if your AI app actually works?</p></li><li><p>Why Does Git Revert Cause Conflicts?</p></li></ul><div><hr></div><h2>Claude Code vs. OpenClaw: 5 Design Dimensions</h2><p>Claude Code terminates after every task. OpenClaw never sleeps. 
Both are highly capable, but they have key architectural differences.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!oEvb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!oEvb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 424w, https://substackcdn.com/image/fetch/$s_!oEvb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 848w, https://substackcdn.com/image/fetch/$s_!oEvb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!oEvb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!oEvb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg" width="1456" height="1760" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!oEvb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 424w, https://substackcdn.com/image/fetch/$s_!oEvb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 848w, https://substackcdn.com/image/fetch/$s_!oEvb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!oEvb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F49df56c9-1f92-4f88-bd16-8cd59dab407c_2484x3002.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>System Scope<br>Claude Code is a short-lived process. You launch it, it runs, it exits. OpenClaw is a long-running background daemon with a Gateway that holds open WebSocket connections to apps like Discord, Slack, and WhatsApp.</p><p>Agent Runtime<br>Claude Code uses a single async query loop: think, tool call, observe, repeat. OpenClaw uses per-session queues, with the Gateway routing RPCs into a separate queue per session.</p><p>Extension Architecture<br>Claude Code supports MCP, plugins, skills, and hooks, all wired into the agent. OpenClaw uses a manifest-first plugin system. Plugins flow through a central registry before reaching the Agent.</p><p>Memory<br>Claude Code treats CLAUDE.md as memory. OpenClaw separates MEMORY.md from daily notes and adds hybrid vector/keyword search across structured sections.</p><p>Multi-agent &amp; Routing<br>Claude Code uses a lead-to-subagent pattern. 
OpenClaw uses a route-and-delegate system where inbound channels get routed to dedicated agents that hand off to shared subagents.</p><p>Over to you: which pattern do you think is the future of agents?</p><div><hr></div><h2>Become an AI Engineer | Enrollment Ends Soon</h2><p>Our 6th cohort of <em><strong>Becoming an AI Engineer </strong></em>starts in about a week. This is a live, cohort-based course created in collaboration with <strong>best-selling author</strong> Ali Aminian and published by ByteByteGo.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;text&quot;:&quot;Check it out Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/substack-bbai"><span>Check it out Here</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/substack-bbai" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kaYA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png" width="1456" height="1801" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1801,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:741069,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196812138?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!kaYA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 848w, 
https://substackcdn.com/image/fetch/$s_!kaYA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1456w" sizes="100vw" loading="lazy" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here&#8217;s what makes this cohort special:</p><ul><li><p>Learn by doing: Build real world AI applications, 
rather than just watching videos.</p></li><li><p>Structured, systematic learning path: Follow a carefully designed curriculum that takes you step by step, from fundamentals to advanced topics.</p></li><li><p>Live feedback and mentorship: Get direct feedback from instructors and peers.</p></li><li><p>Community-driven: Learning alone is hard. Learning with a community is easy!</p></li></ul><p>We are focused on skill building, not just theory or passive learning. Our goal is for every participant to walk away with a strong foundation for building AI systems.</p><p>If you want to start learning AI from scratch, this is the perfect platform for you to begin.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;text&quot;:&quot;Check it out Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/substack-bbai"><span>Check it out Here</span></a></p><div><hr></div><h2>How AI Fakes a Human in 5 Steps</h2><p>One selfie in, one fake video out. 
Here's how deepfakes work at a high level.</p><p>The diagram below shows the full pipeline that turns a reference image (such as a selfie), a voice clip, and a prompt into a fake video.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0GN_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0GN_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!0GN_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!0GN_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!0GN_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0GN_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png" width="1456" height="1760" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!0GN_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!0GN_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!0GN_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!0GN_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1437e3da-0c27-4eb1-b2f1-9d4240f8bd6a_2484x3002.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Step 1: Prompt Refinement. The text prompt gets cleaned, augmented with extra detail, and paired with a negative prompt to suppress unwanted artifacts like distorted hands.</p><p>Step 2: Reference Image Prep. A single selfie of the target is passed through a VAE encoder, a neural network that compresses images into a compact latent representation.</p><p>Step 3: Diffusion Inference Engine. The engine starts from pure noise and runs a diffusion-based denoiser, conditioned on the refined prompt, the reference latent, and the audio, to produce clean video latents. A VAE decoder then converts those latents back into video frames.</p><p>Step 4: Post-Processing. The raw frames are upscaled to higher resolution, color-corrected for consistency, screened by an NSFW classifier, and stamped with a watermark.</p><p>Step 5: Multimodal Syncer. Audio is converted to phonemes (the distinct sound units of speech). 
A lip-sync model aligns mouth movements to those phonemes.</p><p>The output is a video of a CEO who never said those words, in a room they never entered.</p><p>Over to you: What do you look for to figure out if a video's real or made by AI?</p><div><hr></div><h2>How do you know if your AI app actually works?</h2><p>You evaluate it. But most teams skip this step (or do it wrong) because "eval" feels vague. It's not. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6I0C!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6I0C!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!6I0C!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!6I0C!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!6I0C!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!6I0C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png" width="1456" height="1760" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1760,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Image" title="Image" srcset="https://substackcdn.com/image/fetch/$s_!6I0C!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 424w, https://substackcdn.com/image/fetch/$s_!6I0C!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 848w, https://substackcdn.com/image/fetch/$s_!6I0C!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 1272w, https://substackcdn.com/image/fetch/$s_!6I0C!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45b8fa27-b1f7-4edb-8e55-25f097d5f285_2484x3002.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Every good eval is a 3-step recipe.</p><p>Step 1: Pick a task. AI systems have different capabilities and dimensions to evaluate. For LLMs, it can be safety or math capability; for RAG systems, it can be grounding and retrieval. Pick one.</p><p>Step 2: Collect eval data. For every task, gather inputs paired with the right answer or expected behavior. A safety set pairs risky prompts with "refuse." </p><p>Step 3: Develop a grader. How do you decide if the output is good? </p><ul><li><p>Use code-based graders (if/else, unit tests) for things with a clear correct answer, such as a patch passing unit tests. 
</p></li><li><p>Use model-based graders (LLM-as-judge) for subjective tasks like safety.</p></li><li><p>Use human graders for edge cases and anything where nuance matters more than throughput.</p></li></ul><p>Most production evals combine all three. Code-based for what's cheap to check. Model-based for scale. Human-based for what matters most.</p><p>Over to you: what's the hardest thing about your task to grade, and which grader type do you use for it?</p><div><hr></div><h2>Why Does Git Revert Cause Conflicts?</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6UGD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6UGD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 424w, https://substackcdn.com/image/fetch/$s_!6UGD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 848w, https://substackcdn.com/image/fetch/$s_!6UGD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!6UGD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!6UGD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg" width="1280" height="1605" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1605,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;No alternative text description for this image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="No alternative text description for this image" title="No alternative text description for this image" srcset="https://substackcdn.com/image/fetch/$s_!6UGD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 424w, https://substackcdn.com/image/fetch/$s_!6UGD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 848w, https://substackcdn.com/image/fetch/$s_!6UGD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!6UGD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F265133fd-d0f8-48c0-b170-73f6e6a49fec_1280x1605.jpeg 1456w" sizes="100vw" 
loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>git revert looks straightforward until it throws a conflict. Here's why that happens.</p><ul><li><p>What git revert actually does: Unlike reset, a revert doesn&#8217;t rewrite history. Instead, it creates a new commit that undoes the changes from an earlier one. 
This keeps your history clean, traceable, and safe for shared branches.</p></li><li><p>Why revert conflicts happen: Conflicts appear when a later commit changed the same lines as the commit you're trying to undo.</p></li></ul><p>Example in the diagram:</p><ul><li><p>Commit C2 added a feature</p></li><li><p>Commit C3 changed those same lines</p></li><li><p>Reverting C2 now collides with changes from C3</p></li></ul><p>Git can&#8217;t know which version is correct, so a revert conflict is triggered.</p><ul><li><p>How to resolve it:<br>1. Run git revert C2<br>2. Git pauses when it hits the conflict<br>3. Fix the conflicted file manually<br>4. Stage the fixed file with git add<br>5. Continue the revert with git revert --continue</p></li></ul><p>Git then creates a new commit that cleanly undoes C2 while keeping C3 intact.</p><p>Over to you: Have you ever hit a revert conflict at the worst possible moment? How did you resolve it?</p>]]></content:encoded></item><item><title><![CDATA[Become an AI Engineer | Enrollment Ends Soon ]]></title><description><![CDATA[Our 6th cohort of Becoming an AI Engineer starts in about a week.]]></description><link>https://blog.bytebytego.com/p/enrollment-ends-soon-become-an-ai</link><guid isPermaLink="false">https://blog.bytebytego.com/p/enrollment-ends-soon-become-an-ai</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Fri, 08 May 2026 15:31:40 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kaYA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Our 6th cohort of <em><strong>Becoming an AI Engineer </strong></em>starts in about a week. 
This is a live, cohort-based course created in collaboration with <strong>best-selling author</strong> Ali Aminian and published by ByteByteGo.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;text&quot;:&quot;Check it out Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/substack-bbai"><span>Check it out Here</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://go.bytebytego.com/substack-bbai" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kaYA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png" width="1456" height="1801" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1801,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:741069,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196812138?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kaYA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 424w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 848w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1272w, https://substackcdn.com/image/fetch/$s_!kaYA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0529d0f6-43dd-4833-b23f-9edda59836a2_2360x2920.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft 
icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Here&#8217;s what makes this cohort special:</p><ul><li><p>Learn by doing: Build real-world AI applications instead of just watching videos.</p></li><li><p>Structured, systematic learning path: Follow a carefully designed curriculum that takes you step by step, from fundamentals to advanced topics.</p></li><li><p>Live feedback and mentorship: Get direct feedback from instructors and peers.</p></li><li><p>Community driven: Learning alone is hard. Learning with a community is easy!</p></li></ul><p>We are focused on skill building, not just theory or passive learning. 
Our goal is for every participant to walk away with a strong foundation for building AI systems.</p><p>If you want to start learning AI from scratch, this is the perfect platform for you to begin.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://go.bytebytego.com/substack-bbai&quot;,&quot;text&quot;:&quot;Check it out Here&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://go.bytebytego.com/substack-bbai"><span>Check it out Here</span></a></p>]]></content:encoded></item><item><title><![CDATA[Container Design Patterns for Distributed Systems]]></title><description><![CDATA[In this article, we&#8217;ll walk through the patterns that have crystallized over the past decade, organized by the scope of their coordination.]]></description><link>https://blog.bytebytego.com/p/container-design-patterns-for-distributed</link><guid isPermaLink="false">https://blog.bytebytego.com/p/container-design-patterns-for-distributed</guid><dc:creator><![CDATA[ByteByteGo]]></dc:creator><pubDate>Thu, 07 May 2026 15:31:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!UbG9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p style="text-align: justify;">For most of their existence, containers have been treated mainly as a deployment concern. Package your code with its dependencies, ship it as one unit, and run it the same way everywhere.</p><p style="text-align: justify;">That story was true, and it was also pretty useful, but it was only one half of what containers were good for. 
The other half is what happens when we stop thinking of a container as a way to deliver one application and start thinking of it as a building block we can compose with others.</p><p style="text-align: justify;">Software engineering has been here before. In the 1990s, object-oriented programming gave application code a clean boundary we could compose against. Out of that boundary came design patterns, the small library of standard solutions every working programmer eventually internalizes. With containers, distributed systems have gone through the same transition.</p><p style="text-align: justify;">In this article, we&#8217;ll walk through the patterns that have crystallized over the past decade, organized by the scope of their coordination. Three of them describe how containers cooperate when they share a single machine. The other three describe how containers coordinate when the work spans many machines. None of these patterns is a rule. They&#8217;re answers to problems that distributed-systems engineers kept solving over and over.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UbG9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UbG9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 424w, https://substackcdn.com/image/fetch/$s_!UbG9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 848w, 
https://substackcdn.com/image/fetch/$s_!UbG9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 1272w, https://substackcdn.com/image/fetch/$s_!UbG9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UbG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png" width="1456" height="1698" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1698,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:402592,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.bytebytego.com/i/196735247?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UbG9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 424w, 
https://substackcdn.com/image/fetch/$s_!UbG9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 848w, https://substackcdn.com/image/fetch/$s_!UbG9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 1272w, https://substackcdn.com/image/fetch/$s_!UbG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1812b094-e384-4825-a301-4b942ef5976b_2250x2624.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2 style="text-align: justify;">The Abstraction Layer</h2>
      <p>
          <a href="https://blog.bytebytego.com/p/container-design-patterns-for-distributed">
              Read more
          </a>
      </p>
   ]]></content:encoded></item></channel></rss>