LLD for LRU Cache: Designing a High-Performance Cache
How do you design a cache that automatically removes the least used items? We break down the Low-...
TLDR: An LRU (Least Recently Used) Cache evicts the item that hasn't been accessed the longest when it's full. The classic implementation combines a HashMap (O(1) lookup) with a Doubly Linked List (O(1) move-to-front) for overall O(1) get and put.
The Browser History Analogy
Your browser "Back" button stores the pages you visited most recently. If you visit 100 pages and your browser only keeps 10 in memory, it drops the oldest one.
LRU says: the item you haven't needed in the longest time is the least valuable, so evict it first.
The classic challenge is implementing this in O(1) for both get and put. A hash map gives O(1) lookup but can't maintain order. A linked list maintains order but gives O(n) lookup. The insight: use both together.
The HashMap + Doubly Linked List Architecture
flowchart LR
HashMap["HashMap\n(key → node pointer)"] --> DLL["Doubly Linked List\nHead (MRU) ↔ ... ↔ Tail (LRU)"]
- HashMap: Maps each key to its Node in the list → O(1) access.
- Doubly Linked List: Maintains access order. Head = Most Recently Used, Tail = Least Recently Used.
- On get(key): find the node (O(1) via map), move it to head (O(1) via pointer manipulation).
- On put(key, val) when full: remove the tail node (O(1)), delete its key from the map, insert the new node at the head.
Step-by-Step Trace: Capacity = 3
| Operation | List (Head → Tail) | HashMap | Note |
| --- | --- | --- | --- |
| put(1, A) | [1] | {1→Node1} | First entry |
| put(2, B) | [2, 1] | {1,2} | 2 is MRU |
| put(3, C) | [3, 2, 1] | {1,2,3} | Full |
| get(1) | [1, 3, 2] | {1,2,3} | 1 accessed → moves to head |
| put(4, D) | [4, 1, 3] | {1,3,4} | 2 (tail) evicted; 4 inserted at head |
Key: get(1) bumped 1 to the head, so 2 became the new tail. When 4 arrives, 2 gets evicted, not 1.
Java Implementation
import java.util.HashMap;
public class LRUCache {
private static class Node {
int key, val;
Node prev, next;
Node(int k, int v) { key = k; val = v; }
}
private final int capacity;
private final HashMap<Integer, Node> map = new HashMap<>();
// Sentinel nodes simplify boundary conditions
private final Node head = new Node(0, 0); // Most Recent side
private final Node tail = new Node(0, 0); // Least Recent side
public LRUCache(int capacity) {
this.capacity = capacity;
head.next = tail;
tail.prev = head;
}
public int get(int key) {
if (!map.containsKey(key)) return -1;
Node n = map.get(key);
moveToHead(n);
return n.val;
}
public void put(int key, int val) {
if (map.containsKey(key)) {
Node n = map.get(key);
n.val = val;
moveToHead(n);
} else {
Node n = new Node(key, val);
map.put(key, n);
insertAtHead(n);
if (map.size() > capacity) {
Node lru = removeTail();
map.remove(lru.key);
}
}
}
private void moveToHead(Node n) {
removeNode(n);
insertAtHead(n);
}
private void removeNode(Node n) {
n.prev.next = n.next;
n.next.prev = n.prev;
}
private void insertAtHead(Node n) {
n.next = head.next;
n.prev = head;
head.next.prev = n;
head.next = n;
}
private Node removeTail() {
Node lru = tail.prev;
removeNode(lru);
return lru;
}
}
Sentinel nodes (head and tail) eliminate null checks at list boundaries: every real node always has a prev and next.
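To see the class reproduce the trace table above, here's a quick demo. The class body is repeated in condensed form purely so the snippet compiles on its own; the int values 100–400 are arbitrary stand-ins for A–D, since Node stores int values.

```java
import java.util.HashMap;

// Condensed copy of the LRUCache above so this snippet compiles standalone.
class LRUCache {
    private static class Node {
        int key, val; Node prev, next;
        Node(int k, int v) { key = k; val = v; }
    }
    private final int capacity;
    private final HashMap<Integer, Node> map = new HashMap<>();
    private final Node head = new Node(0, 0), tail = new Node(0, 0); // sentinels

    LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail; tail.prev = head;
    }
    public int get(int key) {
        Node n = map.get(key);
        if (n == null) return -1;
        remove(n); insertAtHead(n); // mark as most recently used
        return n.val;
    }
    public void put(int key, int val) {
        Node n = map.get(key);
        if (n != null) { n.val = val; remove(n); insertAtHead(n); return; }
        n = new Node(key, val);
        map.put(key, n); insertAtHead(n);
        if (map.size() > capacity) {      // over capacity: evict LRU
            Node lru = tail.prev;
        	remove(lru);
            map.remove(lru.key);
        }
    }
    private void remove(Node n) { n.prev.next = n.next; n.next.prev = n.prev; }
    private void insertAtHead(Node n) {
        n.next = head.next; n.prev = head;
        head.next.prev = n; head.next = n;
    }
}

public class TraceDemo {
    public static void main(String[] args) {
        LRUCache cache = new LRUCache(3);
        cache.put(1, 100);                 // [1]
        cache.put(2, 200);                 // [2, 1]
        cache.put(3, 300);                 // [3, 2, 1] — full
        System.out.println(cache.get(1)); // 100; 1 moves to head: [1, 3, 2]
        cache.put(4, 400);                 // evicts 2 (tail): [4, 1, 3]
        System.out.println(cache.get(2)); // -1: evicted
        System.out.println(cache.get(3)); // 300: still present
    }
}
```

Note how the eviction matches the table: after get(1), key 2 sits at the tail, so put(4, D) removes it.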
Thread Safety: From Single-Threaded to Concurrent
The basic implementation above is not thread-safe. For a multi-threaded cache (e.g., one shared across API threads):

| Approach | How | Trade-off |
| --- | --- | --- |
| synchronized on all methods | synchronized get(), synchronized put() | Simple; locks the entire cache per op → fine for low contention |
| ReentrantReadWriteLock | Read lock on get, write lock on put | Higher throughput for read-heavy workloads; note that get still mutates list order, so the move-to-head must happen under the write lock |
| LinkedHashMap(capacity, 0.75f, true) + synchronized | Java's built-in access-order LRU | Simple but coarse lock |
| Segmented locks (ConcurrentHashMap + partitioned DLL) | Shard by key hash | Best throughput; complex implementation |
For production caches, most teams use a battle-tested library like Caffeine (Java) or functools.lru_cache (Python) rather than rolling their own.
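The LinkedHashMap option from the table is worth seeing concretely. A rough sketch (class name BuiltInLRU is ours, not a standard API): passing accessOrder = true to the constructor makes iteration order follow access recency, and overriding removeEldestEntry (called by LinkedHashMap after each put) turns it into a bounded LRU.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// LRU cache built on Java's LinkedHashMap in access-order mode.
class BuiltInLRU<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;

    BuiltInLRU(int capacity) {
        super(capacity, 0.75f, true); // true = order by access, not insertion
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each put; returning true evicts
    // the eldest (least recently accessed) entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}

class BuiltInLRUDemo {
    public static void main(String[] args) {
        BuiltInLRU<Integer, String> cache = new BuiltInLRU<>(2);
        cache.put(1, "a");
        cache.put(2, "b");
        cache.get(1);       // touch 1, so 2 is now the eldest
        cache.put(3, "c");  // evicts 2
        System.out.println(cache.keySet()); // 1 and 3 remain
    }
}
```

For the coarse-lock variant in the table, wrapping an instance with Collections.synchronizedMap gives per-operation locking, matching the "simple but coarse" trade-off described above.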
Summary
- LRU evicts the item not accessed the longest; ideal when recency correlates with future access probability.
- HashMap + Doubly Linked List gives O(1) get and put.
- Sentinel head/tail nodes simplify insertion and removal at boundaries.
- For thread safety, synchronized methods are a good starting point; use Caffeine or Guava Cache in production.
Practice Quiz
What two data structures combine to give LRU Cache O(1) get and O(1) put?
- A) Array + Binary Heap
- B) HashMap + Doubly Linked List
- C) TreeMap + Queue
Answer: B
When get(key) is called on a key that exists, what must happen besides returning the value?
- A) Increment a hit counter.
- B) Move the accessed node to the head of the list (mark as Most Recently Used).
- C) Check if the cache is full and possibly evict.
Answer: B
Why are sentinel head and tail nodes used?
- A) They store the MRU and LRU values as a shortcut.
- B) They eliminate null-pointer checks at list boundaries; every real node always has a non-null prev and next.
- C) They act as locks for thread safety.
Answer: B

Written by
Abstract Algorithms
@abstractalgorithms