Ctrl+C, Ctrl+V: A System Design Strategy
The Buzzword Arms Race
If you have ever prepared for a system design interview, you have probably been attacked by the same army of buzzwords again and again: use a load balancer, add caching, scale horizontally, use database replicas, put a CDN in front, add Kafka, add Redis, add Kubernetes, and then add three more cloud services because apparently the diagram still has some empty space.
After a while, system design starts feeling less like engineering and more like a game where whoever says the most infrastructure buzzwords wins. The problem is that most tutorials teach patterns instead of problems. So people memorize twenty different solutions without realizing that many of them are just different versions of the same idea wearing different costumes.
When Success Becomes a Problem
Let us start with one of the most common system design problems: too much traffic. Imagine you have a server that can handle 1,000 requests per second. Life is good. The dashboards are green, alerts are quiet, and nobody is threatening to wake you up at 3 AM. Then your product becomes successful, which is unfortunate because success is one of the leading causes of infrastructure failure.
Suddenly, instead of 1,000 requests per second, you need to handle 10,000. The same server that was happily serving users yesterday is now gasping for air and questioning its career choices. At this point, most people start throwing solutions around: add more application servers, add database replicas, use a CDN, add cache servers, and put a load balancer in front.
At first glance, these look like completely different solutions. In reality, they are all variations of the same strategy: create more capacity and route traffic to that capacity.
Horizontal scaling creates more application servers. Database replication creates more copies of your database. CDNs create more copies of your content around the world. Caches create more copies of frequently accessed data. The technology changes, the logos on the architecture diagram change, and the consultant's PowerPoint slides become more expensive, but the idea remains the same. One machine cannot handle the workload, so use more machines.
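The "use more machines" arithmetic is simple enough to write down. Here is a minimal sketch using the numbers from the example above; the function name `servers_needed` and the `headroom` parameter are made up for illustration and are not from any particular tool.

```python
import math


def servers_needed(target_rps: int, per_server_rps: int, headroom: float = 0.0) -> int:
    """Return how many identical servers are needed to serve target_rps.

    headroom is the fraction of each server's capacity kept in reserve
    (0.2 means each server is planned at 80% of its limit).
    """
    if per_server_rps <= 0:
        raise ValueError("per_server_rps must be positive")
    if not 0.0 <= headroom < 1.0:
        raise ValueError("headroom must be in [0, 1)")
    usable_rps = per_server_rps * (1.0 - headroom)
    return math.ceil(target_rps / usable_rps)


print(servers_needed(10_000, 1_000))                # 10
print(servers_needed(10_000, 1_000, headroom=0.2))  # 13
```

The second call assumes you would rather not run every server at its limit, which is a planning choice and not something the arithmetic demands.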
The Load Balancer Illusion
This realization also changes how you think about load balancers. Most people talk about load balancing as if it were the actual solution, but it is not. A load balancer is basically a traffic cop. If you have one server that can handle 1,000 requests per second, putting a load balancer in front of it still gives you a system that can handle exactly 1,000 requests per second. Congratulations, you now have a highly organized bottleneck.
The actual capacity comes from creating more servers, more database replicas, more cache nodes, or more CDN locations. The load balancer simply helps distribute traffic across those resources. This is an important distinction because it tells you where the real scaling happens. Load balancing helps use capacity, but it does not create capacity by itself.
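To make the distinction concrete, here is a toy round-robin balancer. It is only a sketch: the class name, the backend names, and the assumption that every backend handles the same 1,000 requests per second are all illustrative, and real load balancers do far more (health checks, weighting, connection tracking).

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in rotation. It routes traffic; it serves none."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = list(backends)
        self._rotation = cycle(self._backends)

    def pick(self):
        """Return the backend that should receive the next request."""
        return next(self._rotation)

    def total_capacity(self, per_backend_rps):
        """Capacity is the sum of the backends; the balancer adds nothing."""
        return len(self._backends) * per_backend_rps


single = RoundRobinBalancer(["app-1"])
pool = RoundRobinBalancer(["app-1", "app-2", "app-3"])

print(single.total_capacity(1_000))     # 1000
print(pool.total_capacity(1_000))       # 3000
print([pool.pick() for _ in range(4)])  # ['app-1', 'app-2', 'app-3', 'app-1']
```

The capacity number changes only when the list of backends grows, which is the whole point.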
The Universe Sends a Bill
Of course, the moment you create copies, the universe sends you an invoice. You thought you were solving a traffic problem, but now you are in the distributed systems business. Distributed systems have a unique talent: every time you solve one problem, they reward you with two brand-new problems for free. It is basically a buy-one-get-two offer that nobody asked for.
The Synchronization Problem
Do copies agree? A user updates their profile picture. The primary database knows immediately, but the replica may still have the old value. The user refreshes and sees the old picture. From the user's perspective, the system looks broken. From the engineer's perspective, everything is working exactly as designed.
The Ownership Problem
Who gets to write? Two database servers both accept updates. One adds ₹100, another subtracts ₹50. Both think they have the latest reality. When they eventually talk to each other, someone has to decide which reality wins.
Agree Now or Agree Later
Most synchronization discussions eventually collapse into a very simple question: do you want everyone to agree now, or are you okay with them agreeing later? The first option is strong consistency. Every copy must agree before the operation is considered complete. This keeps the data accurate, but it is slower because multiple systems need to coordinate before responding. The second option is eventual consistency. One copy gets updated immediately, and the others catch up afterward. This scales better and responds faster, but for a short period of time different copies may disagree.
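The difference between the two options fits in a few lines. This is an in-memory sketch, not how any real database implements replication: the `Primary` and `Replica` classes, the `pending` queue, and the `avatar` key are all assumptions made for illustration, and the network is replaced by plain method calls.

```python
class Replica:
    """A copy of the data that only changes when the primary tells it to."""

    def __init__(self):
        self.data = {}


class Primary:
    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []  # writes the replicas have not seen yet

    def write_strong(self, key, value):
        """Agree now: update every replica before returning."""
        self.data[key] = value
        for replica in self.replicas:
            replica.data[key] = value

    def write_eventual(self, key, value):
        """Agree later: return right away and queue the change."""
        self.data[key] = value
        self.pending.append((key, value))

    def catch_up(self):
        """Apply queued changes to every replica, in order."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


replica = Replica()
primary = Primary([replica])

primary.write_strong("avatar", "old.png")
primary.write_eventual("avatar", "new.png")
print(replica.data["avatar"])  # old.png (the replica is briefly stale)

primary.catch_up()
print(replica.data["avatar"])  # new.png
```

The stale read in the middle is the profile picture problem in miniature: `write_eventual` returned quickly, and the price was a window in which the copies disagreed.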
The correct answer depends entirely on the business problem. Instagram can survive if a post briefly shows 1,001 likes instead of 1,002 likes. A bank cannot survive if your account balance briefly shows ₹10,000 instead of ₹1,000. People become surprisingly passionate about consistency when their money is involved.
One Responsible Adult
The second major problem is ownership. Reading data is peaceful because everybody can look at a document and nod politely. Writing data is where the fistfights start. Imagine two database servers are both allowed to accept updates. One server receives a request to add ₹100. Another server receives a request to subtract ₹50. Both servers think they have the latest version of reality. When they eventually talk to each other, someone has to decide which reality wins.
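Here is roughly what that argument looks like in code. The starting balance of ₹1,000, the timestamps, and the last-write-wins merge rule are all assumptions chosen for illustration; last-write-wins is just one common way to pick a winner, not a recommendation.

```python
def last_write_wins(a, b):
    """Each argument is a (timestamp, balance) pair; keep the newer one."""
    return a if a[0] >= b[0] else b


start = 1_000

# Both servers begin from the same balance and apply their own update.
server_a = (1, start + 100)  # at time 1, server A adds 100
server_b = (2, start - 50)   # at time 2, server B subtracts 50

merged = last_write_wins(server_a, server_b)

print(merged[1])         # 950  (server A's deposit has vanished)
print(start + 100 - 50)  # 1050 (what the balance should have been)
```

Neither server did anything wrong on its own. The lost ₹100 only appears when the two versions of reality are merged.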
This is the ownership problem: who gets to modify the truth? Most systems solve this by making one machine the responsible adult in the room. One server becomes the leader, and everyone else follows. The leader accepts writes, and the followers replicate the results. It may not be the most glamorous architecture in the world, but it dramatically reduces the number of existential arguments your databases can have.
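A single-leader setup can be sketched like this. The `Node` class and `NotLeaderError` are hypothetical names, replication is reduced to copying a number, and leader election and failover are left out entirely.

```python
class NotLeaderError(Exception):
    """Raised when a write is sent to a node that is not the leader."""


class Node:
    def __init__(self, name, is_leader=False):
        self.name = name
        self.is_leader = is_leader
        self.balance = 0
        self.followers = []  # only meaningful on the leader

    def apply(self, delta):
        """Apply a write on the leader, then copy the result to followers."""
        if not self.is_leader:
            raise NotLeaderError(f"{self.name} does not accept writes")
        self.balance += delta
        for follower in self.followers:
            follower.balance = self.balance
        return self.balance


leader = Node("db-1", is_leader=True)
follower = Node("db-2")
leader.followers.append(follower)

leader.apply(+100)
leader.apply(-50)
print(leader.balance, follower.balance)  # 50 50

try:
    follower.apply(+10)
except NotLeaderError as err:
    print(err)  # db-2 does not accept writes
```

Because every write passes through one node, the two updates are applied in a single order and there is nothing left to merge.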
Famous Systems, Same Ideas
Once you start looking at systems this way, many famous architectures become surprisingly easy to understand. Netflix primarily has a traffic problem. Movies are wonderfully cooperative pieces of data because once they are uploaded, they mostly sit there doing absolutely nothing. Netflix can copy them to CDN locations around the world and serve users from the nearest location. Since movies rarely change, synchronization is relatively easy.
JioHotstar follows a similar strategy during a cricket match. The video stream is copied everywhere because millions of users want the same content at the same time. The live score changes more frequently, but users generally do not mind if it is a couple of seconds behind. Most people are too busy arguing about the umpire anyway.
Instagram is another system that embraces eventual consistency. If a like count is briefly wrong, nobody notices. Even if somebody notices, they are unlikely to launch a formal investigation. Banking systems live in a completely different universe. A missing rupee is treated with significantly more seriousness than a missing like, so banking systems often sacrifice scalability and speed in exchange for stronger consistency guarantees.
Google Docs is where things get truly interesting. Instead of having a single writer, it allows many people to modify the same document simultaneously. This requires sophisticated conflict-resolution techniques that make ordinary database replication look refreshingly simple. It is not just serving traffic; it is trying to stop multiple humans from editing the same sentence into a crime scene.
Three Questions That Explain Everything
The biggest lesson I learned from studying system design is that the industry often makes things sound more complicated than they really are. Most large-scale systems can be understood by asking three simple questions:
What is the bottleneck?
Identify the resource that cannot keep up: compute, memory, storage, or network bandwidth.
Can I create more copies?
Scale by replicating the bottleneck: more servers, more replicas, more cache nodes, more CDN locations.
How do I stop those copies from arguing with each other?
Manage the consequences: synchronization, consistency, and ownership.
That is the core idea. The databases change, the cloud providers change, the buzzwords change, and the conference talks somehow keep getting longer. But the underlying problems remain remarkably consistent.
Most system design interviews make it sound as if there are hundreds of magical patterns to memorize. In reality, a surprising amount of large-scale architecture can be reduced to creating more copies and then dealing with the consequences. Everything else is mostly implementation details, architecture diagrams, and engineers giving fancy names to the same handful of problems.