System Design Networking: DNS, CDNs, and Load Balancers

Abstract Algorithms
3 min read

TL;DR

This post explains the "Traffic Cops" of the internet: DNS (the phonebook), CDNs (the local cache), and Load Balancers (the distributors).

1. DNS (Domain Name System)

DNS is the phonebook of the internet. Computers don't route traffic to names like google.com; they route to IP addresses like 172.217.0.0, so every name has to be translated into an address first.

How It Works (The Lookup)

  1. Browser Cache: "Have I visited google.com recently?"

  2. OS Cache: "Does Windows/Linux know this IP?"

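  3. Resolver (ISP): The recursive resolver, usually run by your Internet Provider or a public DNS service, checks its cache. On a miss, it asks the servers below on your behalf.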

  4. Root Server: "I don't know google.com, but I know the .com server."

  5. TLD Server (.com): "I know the server that handles google.com."

  6. Authoritative Nameserver: "Here is the IP: 172.217.0.0."
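
The lookup order above can be sketched as a toy model. This is only an illustration, not a real resolver: the dictionaries, server labels, and the `resolve` function are hypothetical names made up for this example, and real caches also expire entries after a TTL, which is left out here.

```python
# Toy model of the lookup chain. All records are made-up illustration data.
ROOT = {"com": "tld-com"}                       # root knows which server handles each TLD
TLD = {"tld-com": {"google.com": "ns-google"}}  # TLD server knows the authoritative nameserver
AUTHORITATIVE = {"ns-google": {"google.com": "172.217.0.0"}}


def resolve(name, browser_cache, os_cache, resolver_cache):
    """Return (ip, source), checking each cache before walking the hierarchy."""
    caches = (
        ("browser cache", browser_cache),
        ("os cache", os_cache),
        ("resolver cache", resolver_cache),
    )
    for label, cache in caches:
        if name in cache:
            return cache[name], label

    # Cache miss everywhere: root -> TLD -> authoritative nameserver.
    tld = name.rsplit(".", 1)[-1]
    tld_server = ROOT[tld]
    nameserver = TLD[tld_server][name]
    ip = AUTHORITATIVE[nameserver][name]

    # Remember the answer at every level so the next lookup is fast.
    for _, cache in caches:
        cache[name] = ip
    return ip, "authoritative nameserver"
```

Calling `resolve("google.com", {}, {}, {})` walks the full chain; calling it again with the same three dictionaries is answered from the browser cache.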

Interview Tip

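An uncached DNS lookup takes time (typically around 20ms - 100ms). When designing a latency-sensitive system, mention that the first request will be slower due to DNS resolution, and that later requests are answered from cache until the record's TTL expires.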


2. CDN (Content Delivery Network)

A CDN is a network of servers distributed globally (Edge Locations) that store copies of your static content (images, CSS, JS, videos).

Why Use It?

  • Latency: If your server is in New York and the user is in London, every request has to cross the Atlantic and back (slow). A CDN serves the file from a London server (fast).

  • Load: It takes the heavy lifting off your main servers. Static files often make up the bulk of traffic, and those requests hit the CDN instead of your origin servers.
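
To put a rough number on the latency point, here is a back-of-the-envelope calculation. Both constants are approximations I am assuming for illustration (about 5,570 km between New York and London, and light in fiber moving at roughly two thirds of its speed in a vacuum); real routes are longer and add processing delay.

```python
# Back-of-the-envelope propagation delay. Both constants are approximations.
DISTANCE_KM = 5_570          # approximate New York to London great-circle distance
FIBER_SPEED_KM_S = 200_000   # light in optical fiber, roughly 2/3 of c


def round_trip_ms(distance_km, speed_km_s=FIBER_SPEED_KM_S):
    """Minimum round-trip time in milliseconds, ignoring routing and processing."""
    return 2 * distance_km / speed_km_s * 1000


transatlantic = round_trip_ms(DISTANCE_KM)  # about 56 ms per round trip
nearby_edge = round_trip_ms(50)             # 0.5 ms for a hypothetical edge 50 km away
```

A page load needs several round trips (connection setup, TLS, the request itself), so the gap per file is a multiple of these numbers.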

Push vs. Pull CDNs

  • Push: You upload files to the CDN ahead of time. Good for stable content that rarely changes (Netflix movies).

  • Pull: The CDN fetches the file from your server the first time a user asks for it, then caches it. Good for sites with lots of frequently changing content (Profile pictures).
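
The difference between the two models can be sketched with a toy edge cache. `PullEdge` and `origin_fetch` are hypothetical names for this example; a real CDN also handles expiry, invalidation, and many edge locations.

```python
class PullEdge:
    """Toy edge cache. `origin_fetch` is a hypothetical callable: path -> content."""

    def __init__(self, origin_fetch):
        self.origin_fetch = origin_fetch
        self.cache = {}
        self.origin_hits = 0  # how many times we had to go back to the origin

    def get(self, path):
        """Pull model: fetch from the origin on the first request, then serve from cache."""
        if path not in self.cache:
            self.origin_hits += 1
            self.cache[path] = self.origin_fetch(path)
        return self.cache[path]

    def push(self, path, content):
        """Push model: the content is uploaded before anyone asks for it."""
        self.cache[path] = content
```

With pull, the first user pays the trip to the origin and everyone after gets the cached copy. With push, no user ever triggers an origin fetch for that file.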


3. Load Balancers (LB)

A Load Balancer sits in front of your servers and distributes incoming traffic. It is the gatekeeper.

Types of Load Balancers

  1. Layer 4 (Transport Layer):

    • Routes based on IP Address and Port.

    • Pros: Extremely fast, simple.

    • Cons: Dumb. Can't see the content. Can't route /video differently from /chat.

  2. Layer 7 (Application Layer):

    • Routes based on URL, Cookies, Headers.

    • Pros: Smart. Can route mobile users to mobile servers, or /api to API servers.

    • Cons: Slower (needs to terminate TLS and parse each HTTP request).
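
A minimal sketch of what each layer can and cannot see. The pool names, the route table, and both functions are assumptions for illustration; real load balancers express these rules in configuration rather than application code.

```python
import zlib


def l4_pick(client_ip, client_port, backends):
    """Layer 4: only the connection's address and port are visible."""
    key = f"{client_ip}:{client_port}".encode()
    return backends[zlib.crc32(key) % len(backends)]


# Hypothetical route table: path prefix -> backend pool.
L7_ROUTES = [("/api", "api-pool"), ("/video", "video-pool")]


def l7_pick(path, headers, default="web-pool"):
    """Layer 7: the HTTP request is visible, so content-based rules are possible."""
    if "Mobile" in headers.get("User-Agent", ""):
        return "mobile-pool"
    for prefix, pool in L7_ROUTES:
        if path.startswith(prefix):
            return pool
    return default
```

Note that `l4_pick` never receives a path or a header. That is why a Layer 4 balancer can't route /video differently from /chat.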

Load Balancing Algorithms

How does the LB decide which server gets the request?

  1. Round Robin:

    • Sequential: Server 1, Server 2, Server 3, Server 1...

    • Pros: Simple.

    • Cons: Assumes all servers are equally powerful and all requests are equally hard.

  2. Least Connections:

    • Sends traffic to the server with the fewest active connections.

    • Pros: Good for long-lived connections (Chat, Streaming).

  3. IP Hash:

    • Hashes the user's IP address to pick a server.

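    • Pros: The same user goes to the same server as long as their IP and the server list don't change (Sticky Session).

    • Cons: If that server dies, the user loses their session. Many users behind one shared IP (an office network, for example) also land on the same server.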
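
The three algorithms above fit in a few lines each. This is a sketch under simple assumptions: the function names are my own, servers are plain strings, and the connection counts are handed in as a dictionary instead of being tracked live.

```python
import hashlib
import itertools


def round_robin(servers):
    """Return a picker that cycles through the servers in order."""
    cycle = itertools.cycle(servers)
    return lambda: next(cycle)


def least_connections(active):
    """Pick the server with the fewest active connections.

    `active` maps server name -> current connection count.
    """
    return min(active, key=active.get)


def ip_hash(client_ip, servers):
    """Map a client IP to a server. Stable only while `servers` is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

For example, `least_connections({"s1": 5, "s2": 2, "s3": 9})` picks `"s2"`, and a picker from `round_robin(["s1", "s2", "s3"])` returns s1, s2, s3, s1 on successive calls.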

Consistent Hashing (The Advanced Concept)

Used in distributed caches and databases (like Cassandra).

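  • Problem: In standard hashing (hash(key) % n_servers), if you add 1 server, almost all keys get remapped to different servers. Your cache hit rate collapses and the misses flood the database behind it.

  • Solution: Consistent Hashing maps servers and keys to a "Ring". Each key belongs to the next server clockwise from it, so adding a server only moves the keys between the new server and its neighbor on the ring (roughly 1/n of all keys for n servers).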

  • Benefit: Minimizes data movement when scaling up or down.
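
Here is a minimal ring to compare against modulo hashing. It is a sketch, not how Cassandra or any particular system implements it: `HashRing`, `modulo_pick`, and `moved_fraction` are names I made up, and the 100 virtual nodes per server are an assumed value that spreads each server around the ring.

```python
import bisect
import hashlib


def _hash(value):
    """Stable 64-bit hash of a string."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers, vnodes=100):
        self.vnodes = vnodes   # virtual nodes per server (assumed value)
        self._points = []      # sorted list of (position on ring, server)
        for server in servers:
            self.add(server)

    def add(self, server):
        """Place the server at `vnodes` positions on the ring."""
        for i in range(self.vnodes):
            bisect.insort(self._points, (_hash(f"{server}#{i}"), server))

    def get(self, key):
        """Return the first server clockwise from the key, wrapping at the end."""
        idx = bisect.bisect_left(self._points, (_hash(key), ""))
        return self._points[idx % len(self._points)][1]


def modulo_pick(key, servers):
    """Standard hashing: hash(key) % n_servers."""
    return servers[_hash(key) % len(servers)]


def moved_fraction(before, after, keys):
    """Fraction of keys whose owner differs between two lookup functions."""
    return sum(before(k) != after(k) for k in keys) / len(keys)
```

Going from 4 to 5 servers, modulo hashing should move about 4 in 5 keys, while the ring should move about 1 in 5 on average, and every key that moves on the ring goes to the new server.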


4. Reverse Proxy

A Reverse Proxy (like Nginx or HAProxy) sits in front of your web servers.

Reverse Proxy vs. Load Balancer

  • Load Balancer: Distributes traffic to multiple servers.

  • Reverse Proxy: Can do load balancing, but also:

    • SSL/TLS Termination: Decrypts HTTPS so your internal servers don't have to (saves CPU).

    • Compression: Gzips responses.

    • Security: Hides the real IP of your internal servers.

    • Caching: Stores responses and serves static files without hitting your web servers.
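
Two of these extras, caching and compression, can be sketched in a few lines. `ToyReverseProxy` and its `backend` callable are hypothetical; this is not how Nginx or HAProxy work internally, and SSL termination is left out because it needs real sockets and certificates.

```python
import gzip


class ToyReverseProxy:
    """Toy proxy. `backend` is a hypothetical callable: path -> bytes."""

    def __init__(self, backend):
        self.backend = backend
        self.cache = {}

    def handle(self, path, accept_gzip=False):
        """Return (body, headers) for a request."""
        # Caching: only the first request for a path reaches the backend.
        if path not in self.cache:
            self.cache[path] = self.backend(path)
        body = self.cache[path]

        # Compression: gzip the response if the client says it accepts it.
        if accept_gzip:
            return gzip.compress(body), {"Content-Encoding": "gzip"}
        return body, {}
```

The backend never sees the client directly, which is also how the proxy hides the real IP of your internal servers.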


Summary

  • DNS resolves names to IPs.

  • CDNs cache static content close to the user to reduce latency.

  • Load Balancers distribute traffic. Layer 7 is smarter; Layer 4 is faster.

  • Consistent Hashing is critical for distributed systems to handle scaling gracefully.

Next Up: How do services talk to each other? We'll explore Protocols: REST, RPC, and TCP/UDP.

Written by Abstract Algorithms (@abstractalgorithms)