6 Comments
Zak P

Great write-up, thank you. I had two questions:

First, how are the image embeddings generated? Is there a model that describes the image and the output of that is converted to embeddings? Just curious because I couldn't understand how a query of "heaters" resulted in a hit for an image with a heater when no caption was present.

Secondly, I'm curious how the AI Evaluation System was created and with which AI models.

meer kats

I'd be curious to know the size of the team that worked on this.

Kehinde Adeleke

This is really good. I'm building an agentic system myself and have taken a number of lessons from this, especially on retrieval, evals, and guardrails. Thank you for sharing, and thanks to the Yelp team.

Pawel Jozefiak

The architecture breakdown is excellent. What jumped out is how much of the real work is in the data pipeline, not the LLM itself. That's the pattern I keep seeing with practical AI deployments.

I was looking at AI agent use cases across different contexts recently (https://thoughts.jock.pl/p/ai-agent-use-cases-moltbot-wiz-2026) and the companies that ship working products all share one thing: they spend 80% of effort on data quality and retrieval, 20% on the model. Everyone wants to talk about the model. Nobody wants to talk about the plumbing.

How is Yelp handling hallucination in business recommendations? That's the trust-killer for consumer-facing agents.

Fadhlullah

Thank you. Beautiful and very enlightening write-up.

I was just wondering, should keyword generation not come before content source selection?