Get the playbook for debugging and testing Airflow pipelines (Sponsored)
Even the most experienced Airflow users will inevitably encounter task failures and DAG errors. Join Astronomer's webinar on August 6 to learn how to troubleshoot Airflow issues like a pro, before they hit production. You’ll learn:
Common DAG and task issues and how to debug them
How to write DAG unit tests
How to automate tests as part of a CI/CD workflow
This week’s system design refresher:
20 System Design Concepts You Must Know - Final Part (YouTube video)
Top 5 common ways to improve API performance
REST API Vs. GraphQL
ByteByteGo Technical Interview Prep Kit
Tokens vs API Keys
The AWS Tech Stack
5 Data Structures That Make DB Queries Super Fast
SPONSOR US
20 System Design Concepts You Must Know - Final Part
Top 5 common ways to improve API performance
Result Pagination:
This method optimizes large result sets by returning them to the client in smaller pages instead of one large response, which improves service responsiveness and user experience.
Asynchronous Logging:
This approach sends logs to a lock-free buffer and returns immediately, rather than writing to disk on every call. Logs are periodically flushed to disk, significantly reducing I/O overhead.
Data Caching:
Frequently accessed data can be stored in a cache to speed up retrieval. Clients check the cache before querying the database; in-memory stores like Redis offer faster access than going to the database each time.
Payload Compression:
To reduce data transmission time, requests and responses can be compressed (e.g., using gzip), making uploads and downloads quicker.
Connection Pooling:
This technique uses a pool of open connections to manage database interaction, which reduces the overhead of opening and closing a connection each time data needs to be loaded. The pool manages the lifecycle of connections for efficient resource use.
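As a rough illustration of connection pooling, here is a minimal sketch built on Python's standard library. The `ConnectionPool` class, its `connection()` helper, and the pool size are hypothetical names and values chosen for this example; a production service would more likely rely on the pooling built into its database driver or framework.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool that hands out already-open connections (illustrative only)."""

    def __init__(self, database: str, size: int = 4) -> None:
        self._idle: queue.Queue = queue.Queue(maxsize=size)
        for _ in range(size):
            # Connections are opened once, up front. check_same_thread=False
            # allows whichever thread borrows a connection to use it; the
            # pool guarantees a single borrower at a time.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))

    @contextmanager
    def connection(self, timeout: float = 5.0):
        # Blocks (up to `timeout` seconds) when every connection is in use.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Return the connection to the pool instead of closing it.
            self._idle.put(conn)

    def close(self) -> None:
        while not self._idle.empty():
            self._idle.get_nowait().close()


pool = ConnectionPool(":memory:", size=2)
with pool.connection() as conn:
    answer = conn.execute("SELECT 1 + 1").fetchone()[0]
```

Each request borrows a connection and hands it back when done, so the cost of opening a connection is paid only when the pool is created.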
Over to you: What other ways do you use to improve API performance?
REST API Vs. GraphQL
When it comes to API design, REST and GraphQL each have their own strengths and weaknesses.
REST
Uses standard HTTP methods like GET, POST, PUT, DELETE for CRUD operations.
Works well when you need simple, uniform interfaces between separate services/applications.
Caching strategies are straightforward to implement.
The downside is it may require multiple roundtrips to assemble related data from separate endpoints.
GraphQL
Provides a single endpoint for clients to query for precisely the data they need.
Clients specify the exact fields required in nested queries, and the server returns optimized payloads containing just those fields.
Supports Mutations for modifying data and Subscriptions for real-time notifications.
Great for aggregating data from multiple sources and works well with rapidly evolving frontend requirements.
However, it shifts complexity to the client side and can allow abusive queries if not properly safeguarded.
Caching strategies can be more complicated than REST.
The best choice between REST and GraphQL depends on the specific requirements of the application and development team. GraphQL is a good fit for complex or frequently changing frontend needs, while REST suits applications where simple and consistent contracts are preferred.
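The round-trip difference can be sketched with a toy in-memory model. This is not a real REST framework or GraphQL server: `rest_get`, `graph_query`, the sample data, and the call counter are all hypothetical stand-ins that only mimic the two access patterns.

```python
USERS = {1: {"id": 1, "name": "Ada", "email": "ada@example.com", "post_ids": [10, 11]}}
POSTS = {
    10: {"id": 10, "title": "Hello", "body": "First post"},
    11: {"id": 11, "title": "Indexes", "body": "Second post"},
}

calls = {"rest": 0, "graph": 0}  # counts simulated round trips


def rest_get(path: str) -> dict:
    """Toy REST handler: one resource per call, full representation returned."""
    calls["rest"] += 1
    kind, raw_id = path.strip("/").split("/")
    table = USERS if kind == "users" else POSTS
    return dict(table[int(raw_id)])


def graph_query(user_id: int, selection: dict) -> dict:
    """Toy single-endpoint resolver: returns only the fields the client selected."""
    calls["graph"] += 1
    user = USERS[user_id]
    result = {}
    for field, sub_fields in selection.items():
        if field == "posts":
            # Nested selection: resolve related posts in the same call.
            result["posts"] = [
                {name: POSTS[post_id][name] for name in sub_fields}
                for post_id in user["post_ids"]
            ]
        else:
            result[field] = user[field]
    return result


# REST style: fetch the user, then one call per related post.
user = rest_get("/users/1")
rest_titles = [rest_get(f"/posts/{pid}")["title"] for pid in user["post_ids"]]

# GraphQL style: one call, and the payload carries only the requested fields.
graph = graph_query(1, {"name": True, "posts": ["title"]})
```

In this toy run the REST-style client makes three calls and receives fields it does not need, while the single query returns just the name and post titles. The unrestricted nesting in `graph_query` also hints at why real GraphQL servers usually need depth or cost limits.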
ByteByteGo Technical Interview Prep Kit
Launching the All-in-one interview prep. We’re making all the books available on the ByteByteGo website.
What's included:
System Design Interview
Coding Interview Patterns
Object-Oriented Design Interview
How to Write a Good Resume
Behavioral Interview (coming soon)
Machine Learning System Design Interview
Generative AI System Design Interview
Mobile System Design Interview
And more to come
Tokens vs API Keys
Both tokens (such as JWT) and API keys are used for authentication and authorization, but they serve different purposes. Let’s understand the simplified flow for both.
The Token Flow
End user logs into the frontend web application.
Login credentials are sent to the Identity Service.
On successful authentication, a JWT is issued and returned.
The frontend makes API calls with the JWT in the Authorization header.
API Gateway intercepts the request and validates the JWT (signature, expiry, and claims).
If valid, the gateway sends a validation response.
The validated request is forwarded to the user-authenticated service.
The service processes the request and interacts with the database to return results.
The API Key Flow
A 3rd party developer registers on the Developer Portal.
The portal issues an API Key.
The key is also stored in a secure key store for later verification.
The developer app sends future API requests with the API Key in the header.
The API Gateway intercepts the request and sends the key to the API Key Validation service.
The validation service verifies the key from the key store and responds.
For valid API keys, the gateway forwards the request to the public API service.
The service processes it and accesses the database as needed.
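The API key flow above can be sketched in the same spirit. The `KeyStore` class, the `gateway` function, the `X-API-Key` header name, and the choice to store a hash of each key rather than the key itself are assumptions made for this example, not details given in the flow.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional, Tuple


class KeyStore:
    """Toy key store: keeps a SHA-256 digest of each issued key, not the key itself."""

    def __init__(self) -> None:
        self._records: Dict[str, Tuple[str, bytes]] = {}

    def issue(self, developer: str) -> str:
        # Developer portal side: generate a key and record its digest.
        key_id = secrets.token_hex(4)
        api_key = f"{key_id}.{secrets.token_urlsafe(24)}"
        self._records[key_id] = (developer, hashlib.sha256(api_key.encode()).digest())
        return api_key

    def verify(self, api_key: str) -> Optional[str]:
        # Validation service side: look up by key id, compare digests.
        key_id, _, _ = api_key.partition(".")
        record = self._records.get(key_id)
        if record is None:
            return None
        developer, stored_digest = record
        candidate = hashlib.sha256(api_key.encode()).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        return developer if hmac.compare_digest(candidate, stored_digest) else None


def gateway(headers: Dict[str, str], store: KeyStore) -> Tuple[int, str]:
    """Toy gateway: validate the key, then forward to the (simulated) public API."""
    developer = store.verify(headers.get("X-API-Key", ""))
    if developer is None:
        return 401, "invalid API key"
    return 200, f"hello {developer}"


store = KeyStore()
key = store.issue("acme")
ok = gateway({"X-API-Key": key}, store)
rejected = gateway({"X-API-Key": key + "x"}, store)
```

Unlike the token flow, every request here requires a lookup in the key store, and the key identifies the calling application rather than an end user.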
Over to you: What else will you add to the explanation?
The AWS Tech Stack
Frontend
Static websites are hosted on S3 and served globally via CloudFront for low latency. Other services that support the frontend development include Amplify, Cognito, and Device Farm.
API Layer
API Gateway and AppSync expose REST and GraphQL APIs with built-in security and throttling. Other services that work in this area are Lambda, ELB, and CloudFront.
Application Layer
This layer hosts business logic. Some services that are important in this layer are Fargate, EKS, Lambda, EventBridge, Step Functions, SNS, and SQS.
Media and File Handling
Media is uploaded to S3, transcoded via Elastic Transcoder, and analyzed using Rekognition for moderation. CloudFront signed URLs ensure secure delivery of videos and files to authenticated users.
Data Layer
The primary services for this layer are Aurora, DynamoDB, ElastiCache, Neptune, and OpenSearch.
Security and Identity
Some AWS services that help in this layer of the stack are IAM, Cognito, WAF, KMS, Secrets Manager, and CloudTrail.
Observability and Monitoring
CloudWatch monitors logs, metrics, and alarms. X-Ray provides tracing of request paths. CloudTrail captures API calls. Config ensures compliance, and GuardDuty detects security threats.
CI/CD and DevOps
The key services used in this layer are CodeCommit, CodeBuild, CodeDeploy, CodePipeline, CloudFormation, ECR, and SSM.
Multi-Region Networking
Route 53 and Global Accelerator ensure fast DNS and global routing. VPC segments the network while NAT and Transit Gateways handle secure traffic flow. AWS Backup provides disaster recovery across regions.
Over to you: Which other service will you add to the list?
5 Data Structures That Make DB Queries Super Fast
Data structures are crucial for database indexes because they determine how efficiently data can be searched, inserted, and retrieved, directly impacting query performance.
B-Tree Index
B-Tree indexes use a balanced tree structure where keys and data pointers exist in internal and leaf nodes. They support efficient range and point queries through ordered traversal.
B+ Tree Index
B+ Tree indexes store all data pointers in the leaf nodes, while internal nodes hold only keys to guide the search. Leaf nodes are linked for fast range queries via sequential access.
Hash Index
Hash indexes apply a hash function to a search key to directly locate a bucket with pointers to data rows. They are optimized for equality searches but not for range queries.
Bitmap Index
Bitmap indexes represent column values using bit arrays for each possible value, allowing fast filtering through bitwise operations. They’re ideal for low-cardinality categorical data.
Inverted Index
Inverted indexes map each unique term to a list of row IDs containing that term, enabling fast full-text search.
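To make the trade-offs concrete, here is a small sketch of four of these ideas over an in-memory table. The sample rows and helper names are made up for illustration, and the sorted list is only a stand-in for the ordered leaf level of a B+ tree, not a real tree implementation.

```python
import bisect
from collections import defaultdict

ROWS = [
    {"id": 0, "age": 31, "status": "active", "plan": "pro", "bio": "loves graph databases"},
    {"id": 1, "age": 24, "status": "inactive", "plan": "free", "bio": "writes about databases"},
    {"id": 2, "age": 45, "status": "active", "plan": "free", "bio": "graph theory teacher"},
    {"id": 3, "age": 31, "status": "active", "plan": "free", "bio": "plays chess"},
]

# Hash index: key -> row ids. Good for equality lookups, no useful ordering.
hash_index = defaultdict(list)
for row in ROWS:
    hash_index[row["age"]].append(row["id"])

# Sorted (key, row id) pairs: a stand-in for the linked leaves of a B+ tree.
sorted_index = sorted((row["age"], row["id"]) for row in ROWS)


def range_query(low: int, high: int) -> list:
    """Row ids with low <= age <= high, found by binary search plus a sequential scan."""
    start = bisect.bisect_left(sorted_index, (low, -1))
    end = bisect.bisect_right(sorted_index, (high, len(ROWS)))
    return [row_id for _, row_id in sorted_index[start:end]]


# Bitmap index: one integer per (column, value); bit i is set if row i matches.
bitmaps = defaultdict(int)
for row in ROWS:
    bitmaps[("status", row["status"])] |= 1 << row["id"]
    bitmaps[("plan", row["plan"])] |= 1 << row["id"]


def bitmap_rows(bits: int) -> list:
    return [i for i in range(len(ROWS)) if bits >> i & 1]


# Inverted index: term -> set of row ids whose text contains the term.
inverted = defaultdict(set)
for row in ROWS:
    for term in row["bio"].lower().split():
        inverted[term].add(row["id"])

age_31 = hash_index[31]
age_25_to_40 = range_query(25, 40)
active_and_free = bitmap_rows(bitmaps[("status", "active")] & bitmaps[("plan", "free")])
graph_and_databases = sorted(inverted["graph"] & inverted["databases"])
```

The hash lookup answers the equality query directly, the sorted structure answers the range query by scanning neighbours, the bitmap combines two low-cardinality filters with a single bitwise AND, and the inverted index intersects posting sets for a two-term search.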
Over to you: Which other data structure will you add to the list?
SPONSOR US
Get your product in front of more than 1,000,000 tech professionals.
Our newsletter puts your products and services directly in front of an audience that matters - hundreds of thousands of engineering leaders and senior engineers - who have influence over significant tech decisions and big purchases.
Space Fills Up Fast - Reserve Today
Ad spots typically sell out about 4 weeks in advance. To ensure your ad reaches this influential audience, reserve your space now by emailing sponsorship@bytebytego.com.