System Design HLD Example: E-Commerce Platform (Amazon)
A practical interview-ready HLD for a large-scale e-commerce system handling catalog, cart, inventory, and orders.
TLDR: A large-scale e-commerce platform separates catalog, cart, inventory, orders, and payments into independent microservices. The core architectural challenge is Inventory Correctness during flash sales, solved with a two-phase reservation pattern: an atomic Redis `DECR` for high-speed "soft" reservation and an optimistic-lock SQL update for the final "hard" commitment.
The Prime Day Pressure Cooker
Imagine itโs 12:00 PM on Prime Day. A "Lightning Deal" for the latest iPhone goes live at 90% off. There are exactly 1,000 units available. In the first 10 seconds, 500,000 users click "Buy Now."
In a naive system, every click triggers a database transaction: SELECT stock FROM inventory WHERE sku_id = 'iphone15'. If stock > 0, then UPDATE inventory SET stock = stock - 1. Under this massive concurrent load, the database's row-level locking will cause a "thundering herd" effect. Database connections will max out, latency will spike from milliseconds to minutes, andโworst of allโthe system might accidentally sell 1,050 units because of a race condition between the check and the decrement.
This is the Overselling Trap. In e-commerce, selling an item you don't have isn't just a technical bug; it's a financial and reputational disaster involving cancelled orders, refund processing fees, and lost customer trust. If you design for the average day, you fail on the only day that matters.
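The race above can be made concrete with a small deterministic simulation: both "transactions" read the stock before either writes, so both confirm an order against a single unit. This is a toy in-memory model of the interleaving, not real database code.

```python
# Deterministic simulation of the check-then-decrement race.
# Interleaving: A reads, B reads, A writes, B writes.

class NaiveInventory:
    def __init__(self, stock):
        self.stock = stock

    def read_stock(self):          # SELECT stock FROM inventory ...
        return self.stock

    def write_stock(self, value):  # UPDATE inventory SET stock = ...
        self.stock = value

inv = NaiveInventory(stock=1)

seen_by_a = inv.read_stock()       # A sees 1 unit available
seen_by_b = inv.read_stock()       # B also sees 1 unit available
confirmed = 0
if seen_by_a > 0:
    inv.write_stock(seen_by_a - 1)
    confirmed += 1
if seen_by_b > 0:
    inv.write_stock(seen_by_b - 1)
    confirmed += 1

# Two orders confirmed against a single unit of stock -- an oversell
# that the final counter value silently hides.
print(confirmed, inv.stock)  # 2 0
```

Note that the final counter looks plausible (0), which is exactly why this bug survives testing: only order reconciliation reveals that two units were sold.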
E-Commerce Systems: Use Cases & Requirements
Actors
- Shopper / Buyer: Browses products, manages a cart, and places orders.
- Merchant / Seller: Lists products, manages stock levels, and fulfills orders.
- System Admin: Monitors platform health, manages global promotions, and handles fraud.
Functional Requirements
- Catalog Management: Search and browse millions of products with filters.
- Shopping Cart: Persistent cart that works across devices and handles guest users.
- Inventory Reservation: Atomic stock subtraction that prevents overselling.
- Order Processing: State machine tracking from `Placed` to `Delivered`.
- Payment Integration: Secure, idempotent payment processing via third-party gateways.
Non-Functional Requirements
- High Read Availability: Browsing the catalog should never be down (99.99%).
- Strong Write Consistency: Inventory and Order records must be 100% accurate.
- Low Latency: Product pages must load in < 100ms to prevent conversion drop-off.
- Scale: Support 1M+ concurrent users and 10k+ orders/sec during peak spikes.
Basics: Baseline Architecture
An e-commerce system is essentially a Distributed State Machine. Every order moves through a series of transitions, and the system must ensure that data remains consistent across multiple specialized services.
The baseline architecture separates concerns into:
- The Read Path (Catalog & Search): Optimized for massive scale and eventual consistency.
- The Write Path (Checkout & Payment): Optimized for ACID compliance and strong consistency.
- The Async Path (Notifications & Analytics): Decoupled from the critical user path to ensure high performance.
Without this separation, a spike in checkouts would slow down users who are just browsing, leading to a massive loss in potential revenue.
Mechanics: The Two-Phase Reservation Logic
The most critical mechanic in e-commerce is how we handle the "Check-and-Reserve" of inventory. We use a Hybrid Two-Phase Pattern:
- Phase 1: Soft Reservation (Redis): We keep a high-speed counter in Redis. Every checkout attempt performs an atomic `DECR`. If the result is ≥ 0, the user proceeds. This happens in roughly 2 ms.
- Phase 2: Hard Commitment (Postgres): Once payment is authorized, we write the reservation to the primary database using an optimistic lock. If this fails, we "compensate" by incrementing the Redis counter back.
This mechanic allows us to handle 50k requests per second on a single SKU without locking our primary database.
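To make the two phases concrete, here is a minimal in-memory sketch: a plain dict stands in for Redis, a version-stamped dict for the Postgres row, and the function names (`soft_reserve`, `hard_commit`) are illustrative rather than a real API.

```python
# Two-phase reservation sketch. A real system would use a Redis client
# for Phase 1 and a SQL UPDATE ... WHERE version = ? for Phase 2.

redis_counter = {"sku:iphone15": 1000}            # Phase 1 counter
db_row = {"stock_physical": 1000, "version": 7}   # Phase 2 row

def soft_reserve(sku):
    """Phase 1: DECR; compensate with INCR if we went below zero."""
    redis_counter[sku] -= 1          # Redis executes DECR atomically
    if redis_counter[sku] < 0:
        redis_counter[sku] += 1      # compensate: INCR back
        return False                 # sold out
    return True

def hard_commit(row):
    """Phase 2: optimistic-lock update; False means version conflict, retry."""
    read_version = row["version"]
    # ... the 2-5 s payment window happens here; no DB locks are held ...
    if row["version"] != read_version:
        return False                 # another writer committed first
    row["stock_physical"] -= 1
    row["version"] += 1
    return True

if soft_reserve("sku:iphone15") and hard_commit(db_row):
    print("order confirmed")
```

The key property to notice: the only state touched at request rate is the Redis counter; the versioned database row is written once per successful payment.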
Estimations & Design Goals
The Math of Amazon-Scale
- Total Products: 100 Million SKUs.
- Peak Orders: 10,000 orders per second.
- Read-to-Write Ratio: 50:1. If we have 10k orders/sec, we have 500k product views/sec.
- Storage Growth: 10k orders/sec × 2 KB per order ≈ 1.2 GB of order data per minute.
Design Goal: We must use a Cache-Aside Pattern for the catalog and a Message-Driven Architecture for post-order processing to ensure the "Buy Now" button remains responsive even if the notification service is lagging.
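A quick back-of-the-envelope check of the storage-growth arithmetic (10k orders/sec at 2 KB each):

```python
# Storage growth estimate for the order write path.
orders_per_sec = 10_000
bytes_per_order = 2 * 1024                       # 2 KB per order record
per_minute = orders_per_sec * bytes_per_order * 60
print(f"{per_minute / 1e9:.2f} GB per minute")   # prints "1.23 GB per minute"
```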
High-Level Design: The Distributed Microservices Architecture
The following architecture illustrates the separation of concerns between discovery, transaction, and fulfillment.
```mermaid
graph TD
    User((User)) --> LB[Load Balancer]
    LB --> AG[API Gateway]
    subgraph Discovery_Path
        AG --> PCS[Product Service]
        PCS --> PDB[(Catalog DB: Postgres)]
        PCS --> RC[(Catalog Cache: Redis)]
        AG --> SES[Search Service]
        SES --> ES[(Elasticsearch)]
    end
    subgraph Transaction_Path
        AG --> CS[Cart Service]
        CS --> RCart[(Cart Cache: Redis)]
        AG --> OS[Order Service]
        OS --> IS[Inventory Service]
        IS --> RInv[(Inv Counter: Redis)]
        IS --> PInv[(Inv DB: Postgres)]
        OS --> PS[Payment Service]
    end
    subgraph Fulfillment_Async
        OS --> Kafka[Kafka]
        Kafka --> NS[Notification Service]
        Kafka --> AS[Analytics Service]
        Kafka --> WS[Warehouse Service]
    end
```
The diagram above reveals the three-path separation that makes Amazon-scale e-commerce resilient. The Discovery Path uses Elasticsearch and Redis, both read-optimized, so that browsing the catalog never touches the transactional database. The Transaction Path contains all ACID operations. The Fulfillment Path is entirely asynchronous via Kafka, meaning a slow notification service has zero impact on the checkout experience.
Deep Dive: How the Two-Phase Reservation Prevents the Overselling Trap
The Two-Phase Reservation pattern is the core innovation that separates a production e-commerce platform from a demo. Understanding its internals explains why the inventory system can sustain 50,000 simultaneous checkout attempts on a single SKU.
Internals: Phase 1 (Soft Reserve) and Phase 2 (Hard Commit) State Machine
Phase 1 is a single atomic decrement on a Redis counter for the SKU. Redis executes commands on a single thread, so the DECR is atomic even under extreme concurrency. If the counter drops below zero, the operation is immediately reversed and the user receives an "out of stock" response. At no point does Phase 1 touch Postgres, which is what gives it its ~2 ms speed.
| Field | Type | Description |
| --- | --- | --- |
| sku_id | VARCHAR(50) | Unique product identifier |
| name | TEXT | Product display name |
| base_price | DECIMAL(10,2) | Listed price before promotions |
| stock_physical | INTEGER | Actual warehouse on-hand count |
| stock_reserved | INTEGER | Soft-reserved count held in Redis |
| version | INTEGER | Optimistic lock counter for Phase 2 |
| status | ENUM | ACTIVE, DISCONTINUED, OUT_OF_STOCK |
Phase 2, the Hard Commitment, runs only after the payment gateway returns a successful charge authorization. The Order Service issues a Postgres UPDATE that reads the current version number, performs the stock decrement, and asserts the version has not changed. If another concurrent transaction modified the row between the read and the write, the version will differ and the UPDATE returns zero rows affected, so the operation retries. This is optimistic locking: no database locks are held during the 2-5 second payment processing window.
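A runnable sketch of that Phase 2 UPDATE, using SQLite in place of Postgres so the example is self-contained; the table and column names mirror the schema table above, and `commit_reservation` is an illustrative name.

```python
# Optimistic-lock hard commit: the version predicate in the WHERE clause
# is the lock. If another writer already bumped the version, rowcount is 0
# and the caller must re-read and retry.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (sku_id TEXT PRIMARY KEY, "
             "stock_physical INTEGER, version INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('iphone15', 1000, 7)")

def commit_reservation(conn, sku, expected_version):
    cur = conn.execute(
        "UPDATE inventory SET stock_physical = stock_physical - 1, "
        "version = version + 1 WHERE sku_id = ? AND version = ?",
        (sku, expected_version),
    )
    return cur.rowcount == 1   # 0 rows affected means a version conflict

first = commit_reservation(conn, "iphone15", expected_version=7)
stale = commit_reservation(conn, "iphone15", expected_version=7)
print(first, stale)  # True False -- the stale writer must re-read and retry
```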
| Phase | Storage | Mechanism | Latency | Failure Recovery |
| --- | --- | --- | --- | --- |
| Soft Reserve | Redis | Atomic DECR | ~2 ms | INCR to compensate on payment failure |
| Hard Commit | Postgres | Optimistic lock (version check) | ~20 ms | Retry on version conflict |
| Async Notify | Kafka | Event publish | ~5 ms | At-least-once delivery via consumer groups |
Performance Analysis: Handling 500,000 Flash Sale Clicks in 10 Seconds
At 500,000 simultaneous clicks in 10 seconds, the system must process 50,000 checkout requests per second. A Postgres row lock with a 2-second payment window serializes ~500 concurrent updates (1 lock per connection). Redis, by contrast, handles 500,000 atomic DECRs per second on commodity hardware.
The API Gateway's rate limiter is the first defense: it applies a token bucket per SKU, capping checkout requests at 10,000 per second per product. This protects both Redis and Postgres from thundering-herd behavior and returns a 429 "Try Again" response to excess users, a far better user experience than a 503 server error.
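A sketch of a per-SKU token bucket along these lines; the capacity and refill rate are illustrative, and a production gateway would keep bucket state in a shared store such as Redis rather than in process memory.

```python
# Per-SKU token bucket: each SKU gets its own bucket so a single hot
# product cannot exhaust the global request budget.
import time

class SkuTokenBucket:
    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.buckets = {}        # sku -> (tokens_remaining, last_refill_ts)

    def allow(self, sku, now=None):
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(sku, (self.capacity, now))
        # Refill proportionally to elapsed time, capped at capacity.
        tokens = min(self.capacity, tokens + (now - last) * self.rate)
        if tokens >= 1:
            self.buckets[sku] = (tokens - 1, now)
            return True          # forward to checkout
        self.buckets[sku] = (tokens, now)
        return False             # respond 429 Try Again

bucket = SkuTokenBucket(rate_per_sec=10_000, capacity=10_000)
# 15,000 simultaneous clicks on one SKU at the same instant:
allowed = sum(bucket.allow("iphone15", now=0.0) for _ in range(15_000))
print(allowed)  # 10000 -- the burst beyond capacity is shed as 429s
```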
| Checkpoint | Target | Technology |
| --- | --- | --- |
| Peak order throughput | 10,000 orders/sec | Redis Phase 1 + Kafka dispatch |
| Catalog page latency | < 100 ms | Elasticsearch + Redis cache-aside |
| Inventory reservation latency | < 10 ms | Redis DECR (Phase 1 only) |
| Payment + hard commit latency | < 2,000 ms | Postgres optimistic lock + payment gateway |
Real-World Inventory Systems: Amazon, Shopify, and ASOS
Amazon uses a multi-layer inventory system where each fulfillment center maintains its own local stock count, and a global aggregation layer provides approximate availability for product pages. The final stock deduction happens at fulfillment center selection time, not at "Add to Cart." This is why Amazon occasionally allows an order that is later listed as "delayed": the global system showed inventory that a local fulfillment center had already exhausted.
Shopify adopted an event-sourced inventory model during its 2021 Black Friday scale-up. Rather than storing "current stock" as a mutable integer, Shopify stores "stock adjustment events" and computes current stock as the event aggregate. This makes the audit trail perfect and enables point-in-time stock reconstruction, but it adds read complexity (aggregating all events) that requires a materialized view for performance.
ASOS routes all flash-sale traffic through a dedicated Flash Sale Service isolated from the main catalog and cart services. This prevents a Black Friday traffic spike from degrading the browse experience for regular shoppers, a clean example of the Bulkhead pattern applied at the service level.
Microservice Independence vs. Distributed Transaction Complexity
| Design Decision | Advantage | Risk |
| --- | --- | --- |
| Redis-first inventory | Extremely fast soft reservation at scale | Redis restart loses in-flight reservations without persistence |
| Optimistic locking in Postgres | No lock contention; high concurrency | High retry rate under extreme write pressure per SKU |
| Kafka for post-order async | Decouples notification/warehouse from checkout | Delayed warehouse dispatch if Kafka consumer lags |
| Microservice separation | Independent scaling per domain | Distributed transactions require Saga or 2PC compensation |
| Elasticsearch for catalog | Sub-100 ms search across 100M products | 1-2 second indexing lag after product updates |
Critical Failure Mode, The Compensation Loop: If the payment gateway succeeds but the Postgres hard-commit fails (e.g., database timeout during high load), the Redis counter has already been decremented and the customer charged. The system must execute an idempotent compensation job: detect the "PAYMENT_CAPTURED / DB_WRITE_FAILED" state, retry the Postgres write with the same idempotency key, or issue a refund if retries are exhausted. Without this compensation loop, the company has charged a customer with no order record, a financial and reputational liability.
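One way such a compensation job could be sketched; the state names, retry budget, and `refund` callback are assumptions for illustration, not a real payment-gateway API.

```python
# Compensation job for orders stuck in PAYMENT_CAPTURED / DB_WRITE_FAILED:
# retry the hard commit under the same idempotency key, refund if exhausted.

MAX_RETRIES = 3

def compensate(order, write_order_row, refund):
    """Drive a stuck order to COMMITTED or REFUNDED; safe to re-run."""
    for _ in range(MAX_RETRIES):
        # Retrying with the same key makes duplicate commits harmless.
        if write_order_row(order["idempotency_key"]):
            order["state"] = "COMMITTED"
            return order["state"]
    refund(order["idempotency_key"])      # retries exhausted: give money back
    order["state"] = "REFUNDED"
    return order["state"]

# Simulated dependencies: the DB write fails permanently, so we refund.
refunds = []
state = compensate(
    {"idempotency_key": "order-123", "state": "DB_WRITE_FAILED"},
    write_order_row=lambda key: False,
    refund=refunds.append,
)
print(state, refunds)  # REFUNDED ['order-123']
```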
Choosing the Right Inventory Consistency Model for Your Scale
Use the Two-Phase Reservation when:
- Flash sale or limited-quantity products exist where overselling is a significant business risk.
- Any SKU where a single unit has material monetary value (electronics, limited-edition items).
- The "check-and-reserve" window spans multiple seconds due to payment processing.
Simpler inventory patterns are sufficient when:
- Products have thousands of units and a small overcount is commercially acceptable.
- B2B systems where the buyer confirms quantity through a purchase order before payment.
- Digital goods with unlimited inventory: software licenses, streaming access, SaaS seats.
Scaling the Order Service beyond a single region:
- Assign each SKU to a primary region based on geographic demand concentration.
- Route all reservation requests for that SKU to its primary region.
- Use a cross-region Kafka replication topic for inventory reconciliation and audit.
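The first two steps above need a deterministic SKU-to-region mapping that every gateway computes identically. In practice the assignment would follow the demand data described above; a stable hash, shown here with illustrative region names, is the simplest deterministic placement.

```python
# Stable SKU-to-primary-region assignment: every gateway instance derives
# the same region for a given SKU without any coordination.
import hashlib

REGIONS = ["us-east", "eu-west", "ap-south"]   # illustrative region list

def primary_region(sku_id):
    # sha256 is stable across processes and machines (unlike Python's hash()).
    digest = hashlib.sha256(sku_id.encode()).hexdigest()
    return REGIONS[int(digest, 16) % len(REGIONS)]

# Every call for the same SKU routes to the same region.
print(primary_region("iphone15") == primary_region("iphone15"))  # True
```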
Delivering This Design in a System Design Interview
Act 1: Frame the Overselling Trap (2 minutes): Open with the Prime Day scenario. Draw two concurrent SELECT stock + UPDATE stock pairs racing against the same inventory row. Show how both transactions read "1 unit available" and both successfully decrement, resulting in stock at -1 with two confirmed orders. This immediately demonstrates you understand the core race condition.
Act 2: The Three-Path Architecture (5 minutes): Draw the Discovery, Transaction, and Fulfillment separation. Explain why catalog reads and order writes must be on entirely separate data paths. Walk through the Two-Phase Reservation: Redis DECR → Payment Gateway → Postgres optimistic lock version check.
Act 3: Failure Scenarios (3 minutes): When the interviewer asks "What if Redis crashes?", answer: Phase 1 is a soft gate. If Redis is unavailable, fall back to Postgres-only mode with strict rate limiting rather than going fully down. The trade-off is lower throughput, not data loss.
| Interviewer Question | Strong Answer |
| --- | --- |
| How do you handle cart abandonment? | Redis cart key expires after 30 minutes; no inventory decrement until checkout begins |
| How do you prevent one bad seller from crashing the platform? | Bulkhead: Flash Sale Service is isolated from catalog; Kafka decouples fulfillment |
| How would you add a recommendations engine? | Read from the order history Kafka topic; never add latency to the checkout critical path |
Open Source Building Blocks for E-Commerce Scale
Apache Kafka is the standard for the Fulfillment Async path. Its durable, partitioned log makes it ideal for the high-fan-out order event stream (notification, analytics, warehouse). Kafka's consumer group model enables independent scaling of each downstream service.
Elasticsearch powers the Catalog Search service. Its inverted index and geo-point mapping handle product search across 100M SKUs with sub-100 ms latency. The Debezium connector provides CDC-based synchronization from Postgres to Elasticsearch without application-level dual writes.
Redis is used in three distinct roles in this architecture: the inventory Phase 1 counter (String with DECR), the shopping cart store (Hash per session with TTL), and the catalog cache (String with short TTL for product JSON blobs).
Lessons Learned From Operating E-Commerce Systems at Flash Sale Scale
Lesson 1: The cart is not the inventory. Never decrement real inventory when an item is added to cart. Only decrement during the checkout-to-payment flow. Cart abandonment rates of 70-80% make pre-reservation at cart time completely unworkable.
Lesson 2: Rate limiting at the SKU level is critical. Global rate limiting protects infrastructure. SKU-level rate limiting protects product fairness during flash sales and prevents single-product thundering herds from impacting all other products.
Lesson 3: Payment idempotency prevents double charges. Every payment request must include an idempotency key tied to the order ID. The payment gateway must return the same result for repeated requests with the same key during network retries, eliminating the risk of double charges.
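A toy sketch of gateway-side idempotency with an in-memory key store; real gateways persist these keys durably, and the class and method names here are illustrative.

```python
# Idempotent charge endpoint: a retried request with the same key returns
# the original result instead of charging the card a second time.

class IdempotentGateway:
    def __init__(self):
        self.results = {}    # idempotency_key -> charge result

    def charge(self, idempotency_key, amount_cents):
        if idempotency_key in self.results:
            return self.results[idempotency_key]   # replay, no new charge
        result = {"status": "captured", "amount": amount_cents}
        self.results[idempotency_key] = result
        return result

gw = IdempotentGateway()
first = gw.charge("order-123", 99_900)
retry = gw.charge("order-123", 99_900)   # network retry with the same key
print(first is retry, len(gw.results))   # True 1 -- charged exactly once
```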
Lesson 4: Monitor the reservation leak metric. Track the count of orders where Phase 1 succeeded but Phase 2 failed. A rising leak rate is an early warning of Postgres write pressure, network instability between services, or payment gateway latency spikes.
TLDR & Key Takeaways for E-Commerce Platform Design
- Core challenge: Prevent overselling during flash sales while maintaining sub-100 ms catalog response times.
- The solution: Two-Phase Reservation: atomic Redis DECR for soft reservation, optimistic Postgres lock for hard commitment after payment success.
- Architecture: Three separated paths: Discovery (Elasticsearch + Redis), Transaction (Order + Inventory + Payment services with ACID writes), Fulfillment (Kafka async fan-out).
- Critical failure mode: The Compensation Loop: detect and resolve payment-captured-but-db-write-failed states with idempotent retry jobs.
- Key trade-off: Redis speed vs. durability; always pair soft reservations with a compensation strategy for Phase 2 failures.
- At scale: SKU-level rate limiting at the API Gateway is as important as the inventory mechanism itself.
Related Posts
- System Design HLD: Payment Processing – The complete architecture of the payment gateway integration that powers the Phase 2 hard commitment in e-commerce.
- System Design HLD: Notification Service – How to design the Kafka-driven post-order email and SMS delivery layer that sits in the Fulfillment Async path.
- System Design HLD: Rate Limiter – A deep dive into the token bucket and sliding window algorithms used at the API Gateway to protect the inventory service during flash sales.
Written by
Abstract Algorithms
@abstractalgorithms