Why ChatGPT is read-heavy
Each prompt triggers:
- Tokenization (read model vocab)
- Forward pass through billions of parameters
- Retrieval from KV cache
- Sampling
This is compute-heavy reading of model weights, not writing data.
The model weights (hundreds of GBs in large models) are:
- Loaded into GPU memory
- Read repeatedly during inference
- Rarely modified
There is no write to the model during inference.
Writes are limited to:
- Logging conversation history
- Analytics
- Abuse monitoring
Compared to inference traffic, this write volume is small.
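A toy sketch of the point (my own illustration, not OpenAI's code): per chat turn, reads of frozen model parameters vastly outnumber durable writes. The weight array and token counts here are made-up stand-ins.

```python
# Toy read-vs-write tally for one simulated chat turn. WEIGHTS stands in
# for model parameters: loaded once, read on every decoding step, never
# modified. The only write is appending the finished reply to history.
import random

random.seed(0)

WEIGHTS = [random.random() for _ in range(1000)]  # loaded once, read-only

def generate_reply(prompt_tokens, max_new_tokens=50):
    reads, writes = 0, 0
    generated = []
    for _ in range(max_new_tokens):
        _ = sum(WEIGHTS)               # "forward pass": re-read every weight
        reads += len(WEIGHTS)
        generated.append(random.randrange(100))  # sampling a next token
    history = list(prompt_tokens) + generated
    writes += 1                        # one append of the conversation log
    return history, reads, writes

_, reads, writes = generate_reply([1, 2, 3])
print(reads, writes)  # prints: 50000 1
```

The ratio only gets more lopsided with real models, where each step touches billions of parameters and the write is still one small row of conversation history.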
Infrastructure stories like this matter because they expose where AI economics become real. User growth looks exciting, but the harder question is where sustainable margins come from when model and serving costs are this high. In my own market analysis work, the winners were usually companies with distribution plus disciplined infrastructure spend, not just better demos.
I dug into those dynamics with February data on who is actually capturing value in the agent wave: https://thoughts.jock.pl/p/ai-agent-landscape-feb-2026-data Scale is impressive, but efficiency will decide who survives.
Scaling to crazy numbers is never just servers; it's the boring stuff too: support, trust, policy, and how fast you can fix breakage when millions hit the same edge case. What I worry about is success hiding the cracks. If you grow faster than your reliability and safety muscle, the public only learns after a big mess. The real moat now is not features, it's uptime, tight feedback loops, and owning mistakes fast.
Same question as the other comment. I don't understand how ChatGPT is read-heavy.
How is ChatGPT read-heavy? Interactions with ChatGPT are conversations. Unless we assume users will go back to read their old chats, it cannot be read-heavy. Is there any data supporting this claim?
In comparison, social media sites are read-heavy: it is generally understood that many more people consume than contribute. ChatGPT is not similar.
Curious to know the reasoning.
It is read-heavy. Fetching data chunks and assembling them into a reply is a typical read. It doesn't update the database frequently or in a complicated way, so it's not write-heavy.
This was a great read. I am curious: with all of those read replicas, are you using cascading replication? I didn't see it in the write-up.
Streaming from the writer to intermediary replicas that then handle the streaming load out to the replicas that are actually being accessed? It's supposed to reduce the streaming load from the writer further, so I'm curious if your team tried it and it just wasn't effective?
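For context, the topology I mean would look roughly like this in PostgreSQL, which has supported cascading replication since 9.2 (host name invented; I don't know what database or replication setup the team actually runs):

```
# postgresql.conf on a leaf replica: follow an intermediate standby
# instead of the writer, so the writer streams WAL to only a few
# intermediaries, which fan it out to the replicas serving reads.
primary_conninfo = 'host=intermediate-standby-1 port=5432 user=replicator'
hot_standby = on
```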
@devanshusave Like others have said, ChatGPT is write-heavy! All those long responses have to be saved somewhere. Sure, initially they can be saved in a NoSQL DB to avoid clogging the single write instance. Everything that has been written can then be served through a cache layer or an in-memory store.