EP173: BIGGEST Mistakes to Avoid in System Design Interviews

ByteByteGo

Jul 26, 2025


ByteByteGo Technical Interview Prep Kit

Launching the All-in-one interview prep. We’re making all the books available on the ByteByteGo website.

What's included:

System Design Interview
Coding Interview Patterns
Object-Oriented Design Interview
How to Write a Good Resume
Behavioral Interview (coming soon)
Machine Learning System Design Interview
Generative AI System Design Interview
Mobile System Design Interview
And more to come

Launch sale: 50% off

This week’s system design refresher:

System Design Interview – BIGGEST Mistakes to Avoid (YouTube video)
12 MCP Servers You Can Use in 2025
MCP Versus A2A Protocol
How can Cache Systems go wrong?
8 System Design Concepts Explained in 1 Diagram
SPONSOR US

System Design Interview – BIGGEST Mistakes to Avoid

12 MCP Servers You Can Use in 2025

MCP (Model Context Protocol) is an open standard that simplifies how AI models, particularly LLMs, interact with external data sources, tools, and services. An MCP server acts as a bridge between these AI models and external tools. Here are the top MCP servers:

File System MCP Server
Allows the LLM to directly access the local file system to read and write files and create directories.
GitHub MCP Server
Connects Claude to GitHub repos and allows file updates and code search.
Slack MCP Server
MCP Server for Slack API, enabling Claude to interact with Slack workspaces.
Google Maps MCP Server
MCP Server for Google Maps API.
Docker MCP Server
Integrate with Docker to manage containers, images, volumes, and networks.
Brave MCP Server
Web and local search using Brave’s Search API.
PostgreSQL MCP Server
An MCP server that enables an LLM to inspect database schemas and execute read-only queries.
Google Drive MCP Server
An MCP server that integrates with Google Drive to allow reading and searching over files.
Redis MCP Server
MCP Server that provides access to Redis databases.
Notion MCP Server
This project implements an MCP server for the Notion API.
Stripe MCP Server
MCP Server to interact with the Stripe API.
Perplexity MCP Server
An MCP Server that connects to Perplexity’s Sonar API for real-time search.
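
To make the "bridge" idea concrete, here is a toy, standard-library-only sketch of a server that exposes one tool to a model. It is loosely modeled on the JSON-RPC request shape that MCP uses (`tools/list` and `tools/call`), but it is not the official SDK and it skips the initialization handshake, transports, and input schemas. The `read_file` tool and its in-memory `FAKE_FILES` are hypothetical stand-ins.

```python
import json

# Hypothetical in-memory "file system"; a real server would read from disk.
FAKE_FILES = {"notes.txt": "hello"}


def read_file_tool(arguments: dict) -> str:
    """Hypothetical tool: return the content of a fake file, or an empty string."""
    return FAKE_FILES.get(arguments["path"], "")


# Registry that maps tool names to the functions implementing them.
TOOLS = {"read_file": read_file_tool}


def handle_request(raw: str) -> str:
    """Dispatch one JSON-RPC style request and return the JSON response."""
    request = json.loads(raw)
    request_id = request.get("id")
    method = request.get("method")

    if method == "tools/list":
        result = {"tools": [{"name": name} for name in TOOLS]}
    elif method == "tools/call":
        params = request["params"]
        tool = TOOLS[params["name"]]
        text = tool(params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": text}]}
    else:
        # -32601 is the JSON-RPC 2.0 code for "Method not found".
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request_id, "error": error})

    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "read_file", "arguments": {"path": "notes.txt"}},
}
response = json.loads(handle_request(json.dumps(call)))
```

The model never touches the file system itself. It only sends a named tool call, and the server decides what that call is allowed to do.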

Over to you: Which other MCP Server will you add to the list?

MCP Versus A2A Protocol

The Model Context Protocol (MCP) connects AI agents to external data sources, such as databases, APIs, and files, via an MCP server, thereby enriching their responses with real-world context.

Google’s Agent-to-Agent (A2A) Protocol enables AI agents to communicate and collaborate, allowing them to delegate tasks, share results, and enhance each other’s capabilities.

MCP and A2A can be combined into a holistic architecture. Here’s how it can work:

Each AI agent (using tools like LangChain with GPT or Claude) connects to external tools via MCP servers for data access.
The external tools can comprise cloud APIs, local files, web search, or communication platforms like Slack.
Simultaneously, the AI agents can communicate with one another using the A2A protocol to coordinate actions, share intermediate outputs, and solve complex tasks collectively.

This architecture enables both rich external context (via MCP) and decentralized agent collaboration (via A2A).

Over to you: What else will you add to understand the MCP vs A2A Protocol capabilities?

How can Cache Systems go wrong?

The diagram below shows 4 typical cases where caches can go wrong and their solutions.

Thundering herd problem
This happens when a large number of keys in the cache expire at the same time. Then the query requests directly hit the database, which overloads the database.

There are two ways to mitigate this issue: one is to avoid setting the same expiry time for the keys by adding a random offset to each key's expiry; the other is to allow only the core business data to hit the database and prevent non-core data from accessing the database until the cache is back up.
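
The first mitigation (randomized expiry) can be sketched in a few lines. This is a minimal illustration, not a production recipe: `BASE_TTL_SECONDS`, `JITTER_SECONDS`, and `ttl_with_jitter` are hypothetical names, and the values are arbitrary.

```python
import random
from typing import Optional

BASE_TTL_SECONDS = 300  # hypothetical base expiry
JITTER_SECONDS = 60     # hypothetical maximum random offset


def ttl_with_jitter(base: int = BASE_TTL_SECONDS,
                    jitter: int = JITTER_SECONDS,
                    rng: Optional[random.Random] = None) -> int:
    """Return the base TTL plus a random offset in [0, jitter]."""
    source = rng or random
    return base + source.randint(0, jitter)


# Keys written in the same batch get expiry times spread over a one-minute
# window instead of all expiring at the same instant.
rng = random.Random(42)
ttls = {f"user:{i}": ttl_with_jitter(rng=rng) for i in range(5)}
```

The returned value would be passed as the expiry when writing each key to the cache, so a batch load no longer produces a synchronized wave of misses.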
Cache penetration
This happens when the key doesn’t exist in the cache or the database. The application cannot retrieve relevant data from the database to update the cache. This problem creates a lot of pressure on both the cache and the database.

To solve this, there are two suggestions. One is to cache a null value for non-existent keys, so repeated lookups for the same key do not hit the database. The other is to use a bloom filter to check whether the key exists first, and if the key doesn’t exist, we can avoid hitting the database.
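
Both suggestions can be combined in one lookup path. The sketch below is illustrative only: the `BloomFilter` class is a deliberately tiny implementation, and `DATABASE`, `MISSING`, `stats`, and `lookup` are hypothetical names standing in for a real store and cache client.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)  # one byte per bit, for readability

    def _positions(self, key: str):
        # Derive several positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


DATABASE = {"user:1": "Alice", "user:2": "Bob"}  # stand-in for the real database
MISSING = object()       # sentinel cached for keys the database does not have
cache = {}
stats = {"db_reads": 0}  # counts how often the database is actually queried

bloom = BloomFilter()
for existing_key in DATABASE:
    bloom.add(existing_key)


def lookup(key: str):
    """Return the value for key, or None if it does not exist."""
    if not bloom.might_contain(key):
        return None  # definitely absent: skip both cache and database
    if key in cache:
        value = cache[key]
        return None if value is MISSING else value
    stats["db_reads"] += 1
    value = DATABASE.get(key, MISSING)
    cache[key] = value  # cache the miss too; in practice give it a short TTL
    return None if value is MISSING else value
```

The bloom filter rejects most non-existent keys before any I/O, and the cached sentinel covers the occasional false positive so the same bad key reaches the database at most once.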
Cache breakdown
This is similar to the thundering herd problem. It happens when a hot key expires. A large number of requests hit the database.

Since hot keys account for about 80% of the queries, the mitigation is to set no expiration time for them.
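
As a rough sketch of "no expiration for hot keys", the cache below treats a TTL of `None` as "never expires". `TTLCache`, `HOT_KEYS`, and `cache_product` are hypothetical names, and the clock is injectable only so the behavior is easy to test.

```python
import time

HOT_KEYS = {"product:1"}  # hypothetical set of keys known to be hot


class TTLCache:
    """In-memory cache where a TTL of None means the entry never expires."""

    def __init__(self, clock=time.monotonic) -> None:
        self._clock = clock
        self._store = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None) -> None:
        expires_at = None if ttl is None else self._clock() + ttl
        self._store[key] = (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value


def cache_product(cache: TTLCache, key: str, value, default_ttl: float = 60.0) -> None:
    """Store hot keys without expiry and everything else with the default TTL."""
    ttl = None if key in HOT_KEYS else default_ttl
    cache.set(key, value, ttl)
```

Because a hot key never expires on its own, the application has to overwrite or delete it whenever the underlying data changes.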
Cache crash
This happens when the cache is down and all the requests go to the database.

There are two ways to solve this problem. One is to set up a circuit breaker: when the cache is down, the breaker trips and requests from the application services are rejected quickly instead of reaching the cache or the database. The other is to set up a cluster for the cache to improve cache availability.
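
A circuit breaker can be sketched as a small wrapper around the cache call. This is a simplified illustration under stated assumptions: `CircuitBreaker`, `CircuitOpenError`, the failure threshold, and the reset timeout are all hypothetical, and real libraries add more states and metrics.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when the breaker is open and the call is rejected immediately."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # time the breaker tripped, or None if closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open: failing fast")
            # Timeout elapsed: fall through and allow one trial call.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # trip, or re-trip after a failed trial
            raise
        # Success closes the breaker and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

While the breaker is open, callers get an immediate `CircuitOpenError` and can return a default or an error response, which keeps the traffic that the cache used to absorb away from the database.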

Over to you: Have you met any of these issues in production?

8 System Design Concepts Explained in 1 Diagram

Non-functional requirements define the quality attributes of a system that ensure it performs reliably under real-world conditions. Some key NFRs, along with the implementation approach, are as follows:

Availability with Load Balancers
Availability ensures that a system remains operational and accessible to users at all times. Using load balancers distributes traffic across multiple service instances to eliminate single points of failure.
Latency with CDNs
Latency refers to the time delay experienced in a system’s response to a user request. CDNs reduce latency by caching content closer to users.
Scalability with Replication
Scalability is the system’s ability to handle increased load by adding resources. Replication copies data across multiple nodes, allowing the system to serve higher throughput and a larger workload.
Durability with Transaction Log
Durability guarantees that once data is committed, it remains safe even in the event of failure. Transaction logs persist all operations, allowing the system to reconstruct the state after a crash.
Consistency with Eventual Consistency
Consistency means that all users see the same data. Eventual consistency allows temporary differences, but synchronizes replicas over time to a consistent state.
Modularity with Loose Coupling and High Cohesion
Modularity promotes building systems with well-separated and self-contained components. Loose coupling and high cohesion help achieve this.
Configurability
Configurability allows a system to be easily adjusted or modified without altering core logic. Configuration-as-Code manages infra and app settings via version-controlled scripts.
Resiliency with Message Queues
Resiliency is a system’s ability to recover from failures and continue operating smoothly. Message queues decouple components and buffer tasks, enabling retries.
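
To illustrate the last item, here is a minimal sketch of a queue that buffers tasks and retries failed ones, using Python's standard `queue` module in place of a real message broker. The `drain` function, `MAX_ATTEMPTS`, and the dead-letter list are assumptions made for the example.

```python
import queue

MAX_ATTEMPTS = 3  # hypothetical retry limit per task


def drain(tasks: queue.Queue, handler, max_attempts: int = MAX_ATTEMPTS):
    """Process every queued task, re-enqueueing failures until attempts run out.

    Each queue item is a (payload, attempt) tuple. Returns the list of handler
    results and the list of payloads that failed on every attempt.
    """
    done, dead_letter = [], []
    while True:
        try:
            payload, attempt = tasks.get_nowait()
        except queue.Empty:
            break
        try:
            done.append(handler(payload))
        except Exception:
            if attempt + 1 < max_attempts:
                tasks.put((payload, attempt + 1))  # buffer it for another try
            else:
                dead_letter.append(payload)        # give up after the last attempt
    return done, dead_letter
```

The producer only needs the queue to accept the task. A consumer that fails midway does not lose the work, because the task goes back onto the queue for a later attempt.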

Over to you: Which other NFR or strategy will you add to the list?

SPONSOR US

Get your product in front of more than 1,000,000 tech professionals.

Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.

Space Fills Up Fast - Reserve Today

Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing sponsorship@bytebytego.com.
