This week’s system design refresher:
- How Discord Stores TRILLIONS of Messages (YouTube video)
- Netflix's Overall Architecture
- How to improve API performance
- Branching strategies
- Key data terms
- ByteByteGo is looking for guest posts
- 2023 State of the Java Ecosystem Report by New Relic (Sponsored)
Other ways of improving API performance:
- Avoid the thread-per-request model and use an event loop instead. Use non-blocking APIs whenever possible on the backend.
- Offer filtering and field projection capabilities in your API so that clients fetch only the data they are interested in, avoiding overfetching (see the projection sketch after this list).
- For larger payloads, use HTTP chunked encoding or gRPC streaming to transfer data piece by piece.
- Be careful with pagination in SQL databases, as LIMIT and OFFSET grow more expensive the deeper the client pages (a keyset pagination sketch follows this list). See: https://stackoverflow.com/questions/4481388/why-does-mysql-higher-limit-offset-slow-the-query-down
- Limit the number of requests. Use batching instead.
- Make sure HTTP/2 is enabled.
- JSON parsing is costly. Switch to binary formats like Protobuf.
- Use PATCH instead of PUT to update just a subset of the resource (see the HttpClient sketch below, which also enables HTTP/2).
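To make the filtering and field projection point concrete, here is a minimal sketch of server-side projection. The fields query parameter and the project helper are hypothetical names for illustration, not from the article:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class FieldProjection {

    // Given ?fields=id,name the handler returns only those keys, so a client
    // that needs two fields does not pay for the whole resource payload.
    static Map<String, Object> project(Map<String, Object> resource, String fieldsParam) {
        if (fieldsParam == null || fieldsParam.isBlank()) {
            return resource; // no projection requested: return the full resource
        }
        Set<String> wanted = new HashSet<>(Arrays.asList(fieldsParam.split(",")));
        Map<String, Object> projected = new LinkedHashMap<>();
        resource.forEach((key, value) -> {
            if (wanted.contains(key)) {
                projected.put(key, value);
            }
        });
        return projected;
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("id", 42);
        user.put("name", "Ada");
        user.put("email", "ada@example.com");
        user.put("bio", "a large field the client may not need");
        System.out.println(project(user, "id,name")); // {id=42, name=Ada}
    }
}
```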
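On the pagination point, the usual fix is keyset (seek) pagination: instead of an OFFSET, the client passes the last id it has seen. A minimal JDBC sketch, assuming a hypothetical messages(id, body) table with an indexed id column:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class KeysetPagination {

    // OFFSET pagination forces the database to scan and throw away all the
    // skipped rows, so deep pages get slower and slower:
    //   SELECT id, body FROM messages ORDER BY id LIMIT 50 OFFSET 500000;

    // Keyset pagination seeks straight past the last row the client saw,
    // so the cost per page stays roughly constant regardless of depth.
    static ResultSet nextPage(Connection conn, long lastSeenId, int pageSize)
            throws SQLException {
        PreparedStatement ps = conn.prepareStatement(
                "SELECT id, body FROM messages WHERE id > ? ORDER BY id LIMIT ?");
        ps.setLong(1, lastSeenId);
        ps.setInt(2, pageSize);
        return ps.executeQuery();
    }
}
```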
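For the PATCH and HTTP/2 items, here is a sketch using the JDK's built-in java.net.http.HttpClient (Java 11+). The URL and JSON body are made up for illustration:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PatchExample {
    public static void main(String[] args) throws Exception {
        // Prefer HTTP/2 when the server supports it; the client falls back
        // to HTTP/1.1 otherwise.
        HttpClient client = HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .build();

        // PATCH sends only the fields that changed, instead of PUTting the
        // whole resource representation back.
        String patch = "{\"email\":\"new@example.com\"}";
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("https://api.example.com/users/42"))
                .header("Content-Type", "application/merge-patch+json")
                .method("PATCH", HttpRequest.BodyPublishers.ofString(patch))
                .build();

        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode());
    }
}
```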
PS. Subscribe to my free newsletter that delivers a collection of links to the latest engineering blog posts from Big Tech companies and startups twice a month: https://bigtechdigest.substack.com/
How is thread-per-request slower than an event loop? Do you mean Java is slower than NodeJS? Or is your statement valid only for IO-heavy tasks?
The latter one. You gain no benefit from the event loop approach for CPU-bound[1] tasks, and the vast majority of APIs are IO-bound[2].
Unlike in Java, in NodeJS there's no choice, as the runtime is built around the idea of an event loop.
Where there is a choice, the event loop approach (e.g. using libraries like Reactor or frameworks like Vert.x) makes more efficient use of threads, which are an expensive and scarce resource. A thread is not blocked every time an external call is made; instead, it moves on to the next piece of work.
This, however, is an evolving topic, especially with Project Loom emerging; see the virtual-thread sketch after the footnotes.
[1] Performing CPU-heavy computations.
[2] Making lots of external requests (HTTP, DB, MQ, etc.)
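To illustrate the Loom point: with virtual threads (standard since Java 21), a blocking call no longer pins a scarce platform thread, so the simple thread-per-request style can scale like an event loop for IO-bound work. A minimal sketch, with Thread.sleep standing in for an external call:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class VirtualThreadsDemo {
    public static void main(String[] args) {
        // Each submitted task blocks as if it were waiting on an HTTP/DB/MQ call.
        // 10,000 concurrent platform threads would be prohibitively expensive;
        // virtual threads are cheap, so plain blocking code scales fine.
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < 10_000; i++) {
                executor.submit(() -> {
                    Thread.sleep(100); // stand-in for an external IO call
                    return null;
                });
            }
        } // close() waits for all submitted tasks to complete
    }
}
```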
Thank you. I didn't know about Project Loom.
I think the definitions of data lake and data mart in the key data terms section differ between the picture and the text description.
Yes, it seems the definitions of data mart and data lake are swapped.
Data mart and data lake in the picture are not correct (switched), while the text description is correct.
Hey Alex,
you need to swap the explanations for "Data Marts" & "Data Lake" in the picture
Very good article.