System Design HLD Example: Real-Time Leaderboard
A practical interview-ready HLD for a real-time gaming leaderboard with fast rank queries.
TLDR: Real-time leaderboards for 10M+ active users require an in-memory ranking engine. Redis Sorted Sets (ZSET) are the industry standard, providing $O(\log N)$ updates and rank lookups via an internal Skip List data structure. Relational databases fail at this scale because calculating rank is an $O(N)$ operation (counting rows with higher scores) that becomes a bottleneck during peak traffic spikes.
The Last-Minute Snipe
Imagine it's the final minute of a week-long gaming tournament. Ten million players are grinding for the top spot. You're currently ranked #10, and you just scored 5,000 points. You expect to see your rank jump to #2 immediately.
If the system uses a standard SQL database, your score update triggers a massive re-calculation. The database has to count exactly how many rows now have a higher score than yours. With millions of players submitting scores every second, the database's write-ahead log (WAL) gets backed up, and the "Rank" query starts taking 5 seconds. By the time the leaderboard refreshes, the tournament is over, and you've been "sniped" by someone whose score processed faster because their database partition wasn't locked.
The challenge of a leaderboard isn't storing scores; it's ranking them. You need a data structure that can handle 50,000 updates per second while providing any player's exact global rank in logarithmic time, ensuring the competition remains fair and the experience remains instantaneous.
Real-Time Leaderboard: Use Cases & Requirements
Actors & Journeys
- Player: Plays matches, earns points, and checks their global and friends-list ranking.
- Game Server: The authoritative source that validates gameplay and submits the final score to the leaderboard service.
- Tournament Admin: Configures seasons, resets scores, and monitors for cheating behavior.
Functional Requirements
- Score Submission: Update a player's score after a match.
- Top-N Query: Retrieve the top 10, 100, or 1000 players for the global leaderboard.
- Personal Rank Query: Find a specific player's exact rank (e.g., "You are #4,502 in the world").
- Social Leaderboard: View rankings specifically among a subset of users (friends).
- Time-Bounded Views: Daily, Weekly, and All-time rankings.
Non-Functional Requirements (NFRs)
- Real-Time Accuracy: Score updates must reflect in rankings within $< 500$ ms.
- High Read Availability: 99.99% (Players check their rank far more often than they play matches).
- Scalability: Handle 100 million total registered players and 50,000 score updates/sec during peak events.
- Durability: A cache failure should not permanently lose player scores; a persistent store must act as the backup.
Basics: How Ranking Differs from Sorting
At first glance, a leaderboard seems like a simple sorting problem. Just sort the table by score and pick the top rows. However, at scale, there is a massive difference between sorting and ranking.
- Sorting: Ordering a set of elements. In SQL, this is `ORDER BY score DESC`.
- Ranking: Determining the relative position of a single element within that set. In SQL, this requires counting all elements with a higher score.
In a 100M row table, sorting is slow, but ranking is catastrophic. Even with an index on the score, a database cannot easily tell you that you are "exactly the 4,502,123rd player" without traversing millions of index entries. We need a data structure that maintains the order as scores are added, rather than recalculating it on every read.
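The read-time cost is easy to see in a toy Java sketch (illustrative only, not an actual database plan): without a rank-maintaining index, every rank query costs one comparison per player.

```java
import java.util.List;

public class NaiveRank {
    // O(N) per query: rank = number of players with a strictly higher score.
    // This is the same work as SQL's COUNT(*) WHERE score > X.
    static int rank(List<Integer> allScores, int myScore) {
        int higher = 0;
        for (int s : allScores) {
            if (s > myScore) higher++;  // one comparison per player, on every query
        }
        return higher;                  // 0-based rank
    }

    public static void main(String[] args) {
        List<Integer> scores = List.of(100, 95, 90, 85, 80);
        System.out.println(rank(scores, 90));  // prints 2: two players score higher
    }
}
```

At 100M rows this linear scan is exactly what the ranking index must avoid.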
Mechanics: The Logic of Score Updates
The core mechanic of a modern leaderboard is the Accumulation and Fan-out of score events.
- Incremental Updates: Instead of overwriting the score, we often send "score deltas" (e.g., +50 points). This allows for easier concurrent updates.
- Tie-Breaking: To ensure deterministic ranking, we use a composite score. If two players have the same points, the one who achieved it first is ranked higher. We implement this by appending a timestamp to the score: `Total_Score = (Points << 32) | (MAX_INT - Timestamp)`.
- Pre-computed Windows: For weekly leaderboards, we don't recalculate the week from scratch. We maintain a specific Redis ZSET for each week, allowing for $O(\log N)$ updates to the active leaderboard.
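The tie-breaking bit-pack can be sketched in Java. The 32-bit fields and the inverted timestamp follow the formula above; treating the timestamp as epoch seconds is an assumption for illustration.

```java
public class CompositeScore {
    static final long MAX_TS = 0xFFFFFFFFL;  // 2^32 - 1, assumed epoch-seconds ceiling

    // Points occupy the high 32 bits so they dominate the ordering;
    // the inverted timestamp in the low bits breaks ties in favor of
    // whoever reached the score first (earlier time -> larger low bits).
    static long encode(long points, long epochSeconds) {
        return (points << 32) | (MAX_TS - epochSeconds);
    }

    static long decodePoints(long composite) {
        return composite >>> 32;
    }

    public static void main(String[] args) {
        long earlier = encode(5000, 1_700_000_000L);
        long later   = encode(5000, 1_700_000_060L);
        System.out.println(earlier > later);        // true: earlier time ranks higher
        System.out.println(decodePoints(earlier));  // 5000
    }
}
```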
Estimations & Design Goals
The Math of Massive Ranking
- Total Players: 100 Million.
- Daily Active Users (DAU): 10 Million.
- Peak Write Throughput: 50,000 score updates/sec.
- Peak Read Throughput: 100,000 rank queries/sec.
- Memory Footprint: Each entry in Redis (Player ID + Score) $\approx 100$ bytes.
- $100M \times 100 \text{ bytes} = \mathbf{10 \text{ GB}}$. This easily fits in a single high-memory Redis instance, but we'll shard for high availability.
Design Goals
- $O(\log N)$ Rank Retrieval: Avoid $O(N)$ scans at all costs.
- Write-Back Durability: Use a message queue to ensure that even if the ranking engine is busy, no score update is ever lost.
High-Level Design: The Ranking Pipeline
The architecture separates the fast-path ranking engine from the durable-path storage.
graph TD
Game[Game Server] -->|Score| Ingest[Ingestion Service]
Ingest -->|Kafka| MQ[Message Queue]
MQ --> Processor[Score Processor]
Processor -->|ZADD| Redis[(Redis ZSET Index)]
Processor -->|Write| Postgres[(Postgres: Score History)]
Player -->|Get Rank| API[Leaderboard API]
API -->|ZREVRANK| Redis
API -.->|Fallback| Postgres
Explanation of the Architecture: The Ingestion Service receives scores from authoritative game servers. It pushes these to Kafka to buffer spikes. The Score Processor consumes events and updates the Redis Sorted Set (ZSET), which serves as our high-speed ranking index. Simultaneously, it writes to PostgreSQL for historical audit logs and disaster recovery. The Leaderboard API serves player requests by querying Redis, providing instant feedback on global ranking.
API Design: The Competition Contract
The API must be optimized for both single-player lookups and bulk leaderboard fetches.
| Endpoint | Method | Payload | Description |
|---|---|---|---|
| `/v1/scores` | POST | `{"user_id": "u1", "score": 450}` | Submit a score update. |
| `/v1/leaderboard/global` | GET | `?limit=100&offset=0` | Get a slice of the global leaderboard. |
| `/v1/users/{id}/rank` | GET | N/A | Get a specific user's rank and score. |
Example Rank Response:
{
"user_id": "u42",
"rank": 4502,
"score": 89500,
"top_percentile": 0.05,
"surrounding_players": [
{"user_id": "u41", "rank": 4501, "score": 89510},
{"user_id": "u43", "rank": 4503, "score": 89490}
]
}
Data Model: Schema Definitions
Durable Store (PostgreSQL)
Used for audits, historical seasons, and recovery from cache failures.
| Table | Column | Type | Notes |
|---|---|---|---|
| scores | user_id | UUID (PK) | Unique player identifier. |
| scores | total_score | BIGINT | The player's current cumulative score. |
| scores | last_updated | TIMESTAMP | Used for resolving ties or troubleshooting. |
Ranking Index (Redis ZSET)
- Key: `leaderboard:{season_id}:global`
- Member: `user_id` (String)
- Score: `total_score` (Double)
Tech Stack & Design Choices
| Component | Choice | Rationale |
|---|---|---|
| Ranking Engine | Redis Sorted Sets | Native Skip List implementation for $O(\log N)$ ranking. |
| Durable Database | PostgreSQL | Relational integrity and easy backups for financial/historical data. |
| Message Bus | Apache Kafka | High-throughput event streaming to decouple writes. |
| API Layer | Go or Java (Spring) | High concurrency support for thousands of simultaneous API calls. |
| Service Mesh | Istio/Envoy | For load balancing and circuit breaking during peak tournament hours. |
Design Deep Dive
Internals: The Power of Skip Lists
A Redis Sorted Set is implemented primarily using a Skip List.
- The Problem with Linked Lists: Searching for a rank in a linked list is $O(N)$. You have to touch every node.
- The Skip List Solution: It builds "express lanes" on top of the list. Instead of checking every node, you skip over 10, 100, or 1000 nodes at a time.
- Ranking Metadata: Each node in the Redis Skip List stores a "span": the number of elements between it and the next node in the express lane. To find a player's rank, Redis sums these spans as it traverses the list levels. This turns an $O(N)$ "count" into an $O(\log N)$ "sum of 20-30 integers," which is effectively instant even for 100M players.
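Here is a hand-built, two-level toy in Java showing how summing spans replaces counting nodes. This is not Redis's actual C implementation; the scores and express-lane layout are invented for illustration.

```java
public class SkipSpanDemo {
    // Bottom level: all scores, sorted descending (rank 0 = highest).
    static final int[] BOTTOM = {100, 95, 90, 85, 80, 75, 70};
    // Express lane: every 3rd bottom node, stored as {bottomIndex, span},
    // where span = how many bottom nodes this express node jumps over
    // (the "span" metadata Redis keeps on each level).
    static final int[][] EXPRESS = {{0, 3}, {3, 3}, {6, 1}};

    /** 0-based rank of a score that exists in BOTTOM. */
    static int rank(int score) {
        int acc = 0, i = 0;
        // Ride the express lane while the NEXT express node still scores above the target.
        while (i + 1 < EXPRESS.length && BOTTOM[EXPRESS[i + 1][0]] > score) {
            acc += EXPRESS[i][1];  // add the whole span instead of counting node by node
            i++;
        }
        // Drop down and finish on the bottom level, walking at most a few nodes.
        int j = EXPRESS[i][0];
        while (BOTTOM[j] > score) j++;
        return acc + (j - EXPRESS[i][0]);
    }

    public static void main(String[] args) {
        System.out.println(rank(80));   // 4
        System.out.println(rank(100));  // 0
    }
}
```

With more levels stacked the same way, the express-lane hops shrink the walk to $O(\log N)$ steps.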
Performance Analysis: Handling Sharding and Ties
- Sharding by Player ID: If we have 1 billion players, we shard Redis by `user_id % num_shards`. However, getting the global rank now requires an aggregator service that knows the score distribution of every shard.
- Handling Ties: If two players have 5,000 points, who is #1?
  - Strategy: Use a composite score: `Score = (Points * 10^10) + (Max_Timestamp - Current_Timestamp)`. This ensures that the player who reached the score first gets the higher rank, while still keeping the primary ranking based on points.
Real-World Applications: From Games to Finance
Leaderboards are not just for games. They power many critical business systems:
- E-commerce: "Top-selling products in the last hour" on Amazon.
- Finance: Real-time stock performance rankings by volume or price change.
- Fitness: Peloton and Strava leaderboards that show how you rank against others on the same ride.
- Education: Duolingo leagues where users compete to move up in rank based on experience points (XP).
Trade-offs & Failure Modes: Accuracy vs. Performance
- Global vs. Regional Rank: In a global system, maintaining a single Redis ZSET for 100M players is possible, but it creates a single point of failure. You trade off some cross-region latency for the simplicity of a single "source of truth" in Redis.
- Eventual vs. Strong Consistency: Our architecture uses a message queue (Kafka) to update Redis. This means a player's rank is eventually consistent. It might take 200ms for their rank to update after a match. In the context of a 30-minute match, this is a perfect trade-off for the scalability it provides.
- Hot Keys: In a massive tournament, millions of updates to the same "top rank" members can cause contention. We use Redis replication and client-side sharding to distribute the read load.
- Tie-Breaker Complexity: Implementing time-based tie-breakers adds a small amount of compute overhead to the ingestion path, but it's essential for competitive fairness.
Advanced Concepts for Production Evolution
- Time-Windowed Leaderboards: Create a new ZSET every day/week (e.g., `lb:daily:2024-03-29`). Use Redis `ZUNIONSTORE` to calculate weekly totals from seven daily sets efficiently.
- Friends Leaderboards: Don't use a ZSET for friends (since friend lists are small, $< 1000$). Instead, store friend IDs in a Redis `SET`, fetch their scores individually using `ZSCORE`, and sort the small list in the application layer. This saves massive memory.
- Surrounding Players View: Players often want to see "who is just above and below me." We use `ZREVRANK` to find the user's index $I$, then `ZREVRANGE` with indices $I-5$ to $I+5$ to fetch the local neighborhood of the leaderboard.
- Cheat Detection: An asynchronous worker monitors the Kafka score stream. If a player's score jumps by 1M points in 1 second, the worker flags the account and removes them from the Redis ZSET pending review.
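The friends-leaderboard idea can be sketched as a plain app-layer sort in Java. The scores map stands in for values already fetched with per-friend `ZSCORE` calls; the names and values are made up for illustration.

```java
import java.util.*;

public class FriendsLeaderboard {
    // Sort a small friend list client-side instead of maintaining a per-user ZSET.
    static List<String> rank(Map<String, Double> friendScores) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(friendScores.entrySet());
        entries.sort((a, b) -> {
            int byScore = Double.compare(b.getValue(), a.getValue());  // highest first
            // Assumed tie-break by user id, purely for deterministic output.
            return byScore != 0 ? byScore : a.getKey().compareTo(b.getKey());
        });
        List<String> ranked = new ArrayList<>();
        for (var e : entries) ranked.add(e.getKey());
        return ranked;
    }

    public static void main(String[] args) {
        // Stand-in for the results of individual ZSCORE lookups
        Map<String, Double> scores = Map.of("ana", 120.0, "bo", 340.0, "cy", 80.0);
        System.out.println(rank(scores));  // [bo, ana, cy]
    }
}
```

For a friend list under 1,000 entries this sort is microseconds of work, far cheaper than keeping millions of tiny ZSETs in memory.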
Decision Guide: Choosing Your Ranking Engine
| Situation | Recommendation |
|---|---|
| Small Scale ($< 10k$ users) | Use SQL with an index on score. It's simple and avoids adding a new dependency. |
| Real-Time Global Ranking | Use Redis Sorted Sets. It's the industry standard for $O(\log N)$ performance. |
| High Write/Low Read | Use Cassandra. Store the scores and calculate rank asynchronously for display in an eventual-consistency model. |
| Complex Aggregations | Use Elasticsearch. It handles complex multi-field filters (e.g., "Rank among Level 10 users in France") better than Redis. |
Practical Example: Interview Delivery
If asked to design this in an interview, focus on these three beats:
- The SQL "Count" Bottleneck: Start by explaining why `COUNT(*) WHERE score > X` is the death of performance at scale.
- The Skip List Intuition: Explain how Redis finds a rank in a sorted set. Mentioning "Logarithmic time" and "Spans" shows you understand the underlying computer science.
- The Hybrid Persistence Model: Emphasize that Redis is for the "Now" and Postgres is for the "Forever." This addresses data durability concerns.
Standard Interview Closer: "I chose a Redis-backed Sorted Set architecture because it optimizes for the most critical user experience: seeing an updated rank immediately after a match. By leveraging the $O(\log N)$ properties of Skip Lists and decoupling our write path through Kafka, we can handle 100 million players with sub-10ms response times even during peak tournament events."
Redis Sorted Sets: How It Works in Practice
Redis commands for leaderboards are simple but extremely powerful.
Example: Updating and Querying (CLI)
# Update Player 42's score to 5000 (only if it's higher than existing)
ZADD tournament:global GT 5000 "player_42"
# Get the Top 3 Players with their scores
ZREVRANGE tournament:global 0 2 WITHSCORES
# Find the exact global rank of player_42
ZREVRANK tournament:global "player_42"
Java Integration (Redisson)
In a production Java service, you would use a distributed client to manage the ZSET:
RScoredSortedSet<String> leaderboard = redisson.getScoredSortedSet("tournament:global");
// Async update for high throughput
leaderboard.addAsync(5000.0, "player_42");
// Get the player's rank (0-based)
Integer rank = leaderboard.revRank("player_42");
Lessons Learned
- Rank is a Global Property: rank belongs to the entire set of scores; never try to calculate it locally or at read-time without a specialized index.
- Memory is Cheaper than Latency: 10 GB of RAM for a leaderboard is a bargain compared to the engineering effort of making a 100M-row SQL query fast.
- Fail Gracefully: If Redis is down, show the user their "Last Known Rank" from Postgres and put a "Refreshing..." icon. Don't let the whole app crash.
Summary & Key Takeaways
- Sorted Sets (ZSET): The heart of real-time ranking.
- Skip Lists: The data structure that enables $O(\log N)$ rank lookups.
- Kafka: Essential for handling write-volatility and ensuring durability.
- Composite Scores: The best way to resolve ties (e.g., using timestamps).
Practice Quiz
What is the time complexity of finding a player's rank in a Redis Sorted Set?
- A) $O(1)$
- B) $O(N)$
- C) $O(\log N)$

Correct Answer: C
Why is it a bad idea to use a standard SQL `ORDER BY score DESC LIMIT 10` for a 100M player leaderboard?
- A) SQL cannot store 100M rows.
- B) The database must sort or scan millions of rows to find the exact rank, which is too slow for real-time.
- C) SQL scores are not accurate.

Correct Answer: B
How do you handle a "Weekly Leaderboard" using Redis?
- A) By deleting all scores every Sunday.
- B) By using a separate ZSET keyed with the week number (e.g., `lb:2024:w13`).
- C) By adding a "week" column to the Redis ZSET.

Correct Answer: B
[Open-ended] Imagine you have so many players that they don't fit in one Redis instance. How would you design a "Global Leaderboard" across three shards?
Written by Abstract Algorithms (@abstractalgorithms)