Discussion about this post

Roshan:

What this really shows is that “foundation model” in practice is an exercise in modular intelligence, not just model size.

Grab is basically doing three things that look very modular:

• A unified representational layer over messy, heterogeneous signals (tabular + sequences + locations etc.), but used as a shared module feeding many small, task-specific models rather than one monolith.

• Dual representations of each user – long-term behavioral “identity” and short-term intent – which is exactly the split between slow, structural state and fast, contextual state you’d want in a modular system.

• “Embeddings as a product” is an org-level API design decision: teams don’t build their own “intelligence”; they consume a standardized intelligence primitive and plug it into their own specialized decision modules.

From a modular-intelligence lens, the interesting part isn't that Grab built a foundation model; it's that they've turned representation learning into a core platform module, leaving prediction, ranking, fraud, ads, etc. as composable downstream consumers wired to the same semantic substrate.
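To make the pattern concrete: here is a minimal sketch of the "shared embedding module + small task-specific heads" architecture described above. All names (`EmbeddingService`, `FraudHead`) and the toy vector math are illustrative assumptions, not Grab's actual API or models.

```python
import numpy as np

class EmbeddingService:
    """Hypothetical platform module: one shared representation layer
    that every downstream team consumes instead of building its own."""

    def __init__(self, dim=4, seed=0):
        self.dim = dim
        self.rng = np.random.default_rng(seed)
        self._long_term = {}  # slow, structural "identity" state per user

    def long_term(self, user_id):
        # Stable behavioral embedding, updated on a slow cadence.
        if user_id not in self._long_term:
            self._long_term[user_id] = self.rng.normal(size=self.dim)
        return self._long_term[user_id]

    def short_term(self, recent_events):
        # Fast, contextual "intent" embedding from the current session.
        # Toy stand-in: deterministic pseudo-embedding per event string.
        vecs = [
            np.random.default_rng(sum(e.encode())).normal(size=self.dim)
            for e in recent_events
        ]
        return np.mean(vecs, axis=0)


class FraudHead:
    """Hypothetical downstream consumer: a small task-specific model
    plugged into the shared embedding substrate."""

    def __init__(self, dim):
        self.w = np.ones(2 * dim) / (2 * dim)  # stand-in for learned weights

    def score(self, identity_vec, intent_vec):
        # Concatenate slow identity state with fast intent state,
        # then apply a toy logistic scorer.
        x = np.concatenate([identity_vec, intent_vec])
        return float(1.0 / (1.0 + np.exp(-self.w @ x)))


# Usage: many heads (fraud, ranking, ads...) would share one service.
svc = EmbeddingService()
identity = svc.long_term("user_42")
intent = svc.short_term(["open_app", "search_food"])
fraud_score = FraudHead(dim=4).score(identity, intent)
```

The design point is that `FraudHead` owns only its decision logic; the "intelligence primitive" (the embeddings) is standardized and served by the platform.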

Prakash A:

Excellent work. Well done, Grab Engineering Team 🔥. Thanks ByteByteGo for capturing it nicely.
