Performance by Design, Optimization by Necessity
This is one of those articles I have been meaning to write for a long time. There is more to the title than meets the eye.
Introduction
When designing systems, it is essential to keep certain principles in mind. One of the core principles that I follow is Performance by Design. We will go into depth on what it means and why it matters.
Premature Optimization is the root of all evil - Donald Knuth
Performance By Design
Performance by design is more nuanced than you think. What it truly means is that you design for 80% of use cases. The remaining 20% can be improved through optimization. Let’s take an example.

A CRUD app (backend) that takes in simple database inputs to store user records
Note: Ignoring the front end for this example.
You would probably use one of the mainstream programming languages like Python, Golang, Java, etc.
A famous web framework like Flask (Python)/ Gin Gonic (Golang)/ Quarkus (Java).
Use a relational database like PostgreSQL/MySQL.
Host your servers on a public cloud like AWS/Azure/GCP.
Now, what would “performance by design” mean for a simple application like this?
Scale - What if you scale beyond a server?
How would you provision a new server and how quickly can you provision one? → Horizontal Scaling
How would you distribute traffic between these nodes?
Database Scale
How can you easily support sharding/multi-tenancy?
Cost
Can you scale down easily when there is less traffic?
Security
How to ensure security? Would you roll out something custom?
There are other questions that you can ask, but the fundamental one is: do you have a plan for the 80%?
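To make the traffic-distribution question concrete, here is a minimal sketch of round-robin selection across application nodes. The `RoundRobinBalancer` class and the node addresses are hypothetical and only illustrate the idea; in practice a managed load balancer would normally do this for you.

```python
import itertools
import threading


class RoundRobinBalancer:
    """Hands out backend nodes in rotation (illustrative only)."""

    def __init__(self, nodes):
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = list(nodes)
        self._cycle = itertools.cycle(self._nodes)
        # Guard the shared iterator when called from multiple threads.
        self._lock = threading.Lock()

    def next_node(self):
        """Return the next node in the rotation."""
        with self._lock:
            return next(self._cycle)


# Hypothetical node addresses.
balancer = RoundRobinBalancer(["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"])
picks = [balancer.next_node() for _ in range(4)]
print(picks)  # the fourth pick wraps around to the first node
```

Round-robin ignores how busy each node is, which is usually acceptable for the 80% case when requests are small and uniform.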
A simple architecture diagram that I would design using the “performance by design” paradigm would look like this -
I have abstracted out a lot of details like firewalls, networking, etc., but what I am trying to present here is an architecture that optimizes for the 80% ask.
Kubernetes
Gives load balancing out of the box.
Hardware abstraction - When a node fails, k8s reschedules its workloads onto healthy nodes, so you rarely deal with hardware failures directly.
Easy slice and dice of hardware to deploy small services or choose bigger boxes. Completely up to us.
Can scale up and scale down easily.
Database
Can use the region as a sharding mechanism.
Fault-tolerant by nature as one region's failure does not affect the other.
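As a rough illustration of using the region as a sharding key, the application can resolve each request's region to the database that owns it. The region codes, host names, and the `dsn_for_region` helper below are all assumptions made for this sketch.

```python
# Hypothetical region -> connection string mapping; real values would come from config.
REGION_DSNS = {
    "us": "postgresql://db-us.example.internal/users",
    "eu": "postgresql://db-eu.example.internal/users",
    "ap": "postgresql://db-ap.example.internal/users",
}


def dsn_for_region(region: str) -> str:
    """Return the DSN of the database that owns records for the given region."""
    key = region.strip().lower()
    try:
        return REGION_DSNS[key]
    except KeyError:
        # Fail loudly rather than silently writing to the wrong shard.
        raise ValueError(f"no shard configured for region {region!r}") from None


print(dsn_for_region("EU"))
```

Because every lookup goes through one function, adding a region later is a configuration change rather than a code change.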
Note: This might not be ideal, you might be better off choosing services like Cloud Run (GCP) or AWS App Runner instead of a full Kubernetes deployment but the point still stands.
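Whichever platform you pick, scaling up and down usually reduces to a small calculation. The sketch below follows the ratio form that the Kubernetes Horizontal Pod Autoscaler documents (current replicas multiplied by current metric over target metric, rounded up). It leaves out the tolerance band and stabilization windows that a real autoscaler applies, and the function name, bounds, and metric values are assumptions for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Ratio-based replica count, clamped to [min_replicas, max_replicas]."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical CPU utilization figures (percent) against a 60% target.
print(desired_replicas(4, 90, 60))   # load above target: scale up
print(desired_replicas(4, 15, 60))   # load well below target: scale down
```

The same function answers the cost question: when traffic drops, the replica count drops with it, down to the configured minimum.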
Optimization By Necessity
Say you run this application for a few months and your traffic sees a massive uptick. At that point you would probably bring in specialized performance enhancements like:
Note: Database examples are from PostgreSQL.
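Specialized indexes like BRIN to improve range scans on large tables whose rows are stored roughly in the order of the indexed column (for example, append-only timestamped data).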
Search Indexes like GIN.
No, don’t jump into a full-blown ELK stack just yet. Most modern relational databases have decent search capabilities.
Bring in a cache layer like Redis or Memcached.
Remember that databases have good levels of caching. Do you need an application-level cache?
CDN for static assets.
Do you have a lot of static assets and a distributed user base?
Measuring end user latency is tricky business, so do it right.
Concurrency and Parallelism (Other than what the language/framework gives you out of the box).
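If you do decide that an application-level cache is warranted, the usual starting point is the cache-aside pattern. The sketch below uses an in-memory `TTLCache` as a stand-in for Redis or Memcached; the class, the `get_user` helper, and the fake database lookup are all hypothetical.

```python
import time


class TTLCache:
    """In-memory stand-in for Redis/Memcached (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)


def get_user(user_id, cache, load_from_db):
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    user = load_from_db(user_id)
    if user is not None:
        cache.set(key, user)
    return user


db_calls = []


def fake_db_lookup(user_id):
    """Hypothetical database call; records each invocation."""
    db_calls.append(user_id)
    return {"id": user_id, "name": "example"}


cache = TTLCache(ttl_seconds=30)
get_user(42, cache, fake_db_lookup)
get_user(42, cache, fake_db_lookup)
print(len(db_calls))  # the second read is served from the cache
```

Counting the database calls before and after is also a cheap way to check whether the cache earns its operational cost.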
The rule for optimization is that -
You Cannot Improve What You Cannot Measure - commonly attributed to Peter Drucker
Are you sure that you need that 20% squeeze, and does getting it really require bringing in specialized systems?
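A minimal sketch of what measuring can look like: time each call and look at percentiles rather than the average, since a single slow request can hide behind a healthy-looking mean. The `percentile` and `timed` helpers and the sample latencies are made up for illustration; `percentile` uses the nearest-rank method.

```python
import math
import time


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, (time.perf_counter() - start) * 1000.0


# Hypothetical request latencies in milliseconds, with one slow outlier.
latencies_ms = [12, 15, 11, 14, 13, 250, 12, 16, 13, 14]
print(percentile(latencies_ms, 50), percentile(latencies_ms, 95))
```

In this sample the median is 13 ms while the 95th percentile is 250 ms, which is the kind of gap that tells you whether the extra 20% is worth chasing.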
Conclusion
“Performance by Design & Optimization by Necessity” is important because it leads to:
Fewer service disruptions. You will have reasonable time to predict the remaining 20% and solve it.
Improved business outcomes. You are not firefighting the moment you release your application.
Lower cost, in both people and infrastructure.
Focus and spend time only on things that can be measured and improved.
When designing systems, ask yourself whether you are prepared for the 80:20. It is equally important to design systems that are performant from day 1 and to avoid premature optimization at the outset. The examples given may not be 100% accurate, as real-world design depends on many other factors, but they give you a general framework of thought.