System Design HLD Example: Collaborative Document Editing (Google Docs)
A senior-level HLD for real-time collaborative editing with Operational Transformation (OT) and conflict resolution.
TLDR: Real-time collaborative editing relies on Operational Transformation (OT) or CRDTs to resolve concurrent edits without data loss. The core trade-off is Latency vs. Consistency: we use optimistic local updates for zero-latency typing and a centralized OT server to ensure all clients eventually converge to the exact same document state.
✍️ The "Hello World" Race Condition
Imagine two users, Alice and Bob, are editing the same sentence: "The cat is happy." Alice wants to change it to "The fat cat is happy," so she inserts "fat " at position 4. Simultaneously, Bob wants to change it to "The cat is very happy," so he inserts "very " at position 11.
In a naive system:
- Alice's Edit: Server applies
INSERT("fat ", pos:4). Doc becomes: "The fat cat is happy." - Bob's Edit: Server applies Bob's original
INSERT("very ", pos:11). - The Result: "The fat cat is very hvery appy."
Wait, what happened? Because Alice's insert shifted the entire string by four characters, Bob's original position 11 no longer points at the start of "happy"—it now lands immediately after "cat," splicing his text into the wrong spot. This is the Position Shift Problem. In a collaborative environment, every edit changes the coordinate system for every other concurrent edit. Without a specialized resolution algorithm, the document quickly becomes a corrupted mess of overlapping characters.
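The naive application is easy to reproduce; a minimal sketch (the `naiveInsert` helper is illustrative, not part of any real system):

```javascript
// Naive server: applies each op at its original position, ignoring concurrency.
function naiveInsert(doc, pos, text) {
  return doc.slice(0, pos) + text + doc.slice(pos);
}

let doc = 'The cat is happy.';
doc = naiveInsert(doc, 4, 'fat ');   // Alice's edit arrives first
doc = naiveInsert(doc, 11, 'very '); // Bob's edit, still at his ORIGINAL pos 11
console.log(doc); // 'The fat catvery  is happy.' — Bob's intent is lost
```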
📖 Real-Time Collaboration: Use Cases & Requirements
Actors
- Author / Editor: Creates and modifies document content in real-time.
- Collaborator: Views live changes and contributes simultaneously.
- Viewer: Consumes the document without editing rights but still expects live updates.
Functional Requirements
- Real-Time Editing: Multiple users can edit the same document simultaneously.
- Conflict Resolution: All users must eventually see the exact same document state (Convergence).
- Presence: Show which users are currently active and their cursor positions.
- Version History: Ability to see past revisions and restore previous states.
- Offline Support: (Partial) Allow users to continue typing during brief network drops.
Non-Functional Requirements
- Ultra-Low Latency: Keystrokes should appear on other screens in < 100ms.
- High Availability: The service must be available 99.99% of the time.
- Scalability: Support 10,000+ concurrent editors on a single "hot" document (e.g., a viral public doc).
- Durability: No committed edit should ever be lost.
🔍 Basics: Operational Transformation (OT)
Operational Transformation is the "classic" approach used by Google Docs. The core idea is that the server doesn't just apply an operation; it transforms it against all other operations that happened since the client's last known version.
Every operation is represented as a triple: (Type, Position, Value).
When the server receives an operation based on version V while the current server version is V+k, it runs the new operation through a transform function T(op_new, op_concurrent) against each of the $k$ intervening operations. This ensures that the user's intent is preserved even if the index has changed.
⚙️ Mechanics: The Transform Matrix
To implement OT, you must define how every operation type interacts with every other operation type.
- Insert vs. Insert: If User A inserts at position 5 and User B inserts at position 10, no change is needed for A, but B's position must be incremented by the length of A's insert.
- Insert vs. Delete: If a character is deleted before an insert, the insert position must be decremented.
- Delete vs. Delete: If both users delete the same character, the second operation becomes a "no-op" to avoid double-deletion.
This "Matrix" of rules ensures that no matter what order operations arrive in, the final result is deterministic.
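A stripped-down version of these rules, covering only the position-shift cases (the op shapes here are illustrative, not a full OT engine):

```javascript
// Transform opB so it applies correctly AFTER opA has been applied.
// Handles the shift cases for insert and delete; a real engine covers all pairs.
function transform(opB, opA) {
  const shift =
    opA.type === 'insert' ? opA.value.length :
    opA.type === 'delete' ? -opA.value.length : 0;
  if (opA.pos <= opB.pos) {
    return { ...opB, pos: opB.pos + shift };
  }
  return opB; // opA is after opB's position: no adjustment needed
}

function apply(doc, op) {
  if (op.type === 'insert') return doc.slice(0, op.pos) + op.value + doc.slice(op.pos);
  return doc.slice(0, op.pos) + doc.slice(op.pos + op.value.length);
}

const alice = { type: 'insert', pos: 4, value: 'fat ' };
const bob   = { type: 'insert', pos: 11, value: 'very ' };

// Server applies Alice first, then Bob's op transformed against Alice's.
let doc = apply('The cat is happy.', alice);
doc = apply(doc, transform(bob, alice)); // bob's pos becomes 15
console.log(doc); // 'The fat cat is very happy.'
```

With the transform in place, both users' intent survives regardless of arrival order.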
📐 Estimations & Design Goals
The Math of Real-Time Sync
- Average Typist: 60 words per minute $\approx 5$ characters per second.
- Active Users per Doc: 10 (average) to 1,000 (hot docs).
- Traffic per Doc: 10 users × 5 chars/sec = 50 operations/sec.
- Fan-out: Every 1 operation must be broadcast to the other 9 users.
- Total Broadcasts: $50 \times 9 = 450$ messages per second per document.
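The arithmetic above as a quick sanity check (numbers taken directly from the estimates):

```javascript
// Back-of-envelope traffic estimate for one active document.
const usersPerDoc = 10;
const charsPerSecPerUser = 5; // ~60 WPM typist

const opsPerSec = usersPerDoc * charsPerSecPerUser;     // 50 ops/sec
const broadcastsPerSec = opsPerSec * (usersPerDoc - 1); // each op fans out to 9 peers

console.log(opsPerSec, broadcastsPerSec); // 50 450
```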
Design Goal: Use WebSockets for bidirectional, low-latency communication. Standard HTTP polling would be too slow and would overwhelm the server with headers.
📊 High-Level Design: The Centralized OT Architecture
The following architecture utilizes a centralized "Orderer" to ensure all operations are sequenced correctly.
```mermaid
graph TD
    UserA((User A)) <--> WSS1[WS Server Shard 1]
    UserB((User B)) <--> WSS1
    UserC((User C)) <--> WSS2[WS Server Shard 2]
    WSS1 <--> LB[Internal Load Balancer]
    WSS2 <--> LB
    LB <--> OTS["OT Service: The Arbiter"]
    OTS --> PDB[("Document Store: Postgres")]
    OTS --> RC[("Active Doc Cache: Redis")]
    OTS --> PubSub["Redis Pub/Sub"]
    PubSub --> WSS1
    PubSub --> WSS2
```
Explanation of the Architecture: The system relies on persistent WebSocket connections to minimize latency. Clients send operations to a WebSocket Server, which forwards them to a centralized OT Service. This service is the "Single Source of Truth"—it assigns a version number to every edit and performs the transformation against concurrent edits. Once an edit is committed to the PostgreSQL log, it is broadcast back out via Redis Pub/Sub to all WebSocket shards, which then push the update to the connected clients.
🔌 API Design: The Contract
Since this is a real-time system, the "API" is mostly a WebSocket message protocol.
| Message Type | Direction | Payload | Description |
| --- | --- | --- | --- |
| DOC_OPEN | Client -> Server | `{ doc_id, last_version }` | Initiates a session and requests catch-up edits. |
| OP_SUBMIT | Client -> Server | `{ type, pos, val, base_version }` | Submits a new local edit. |
| OP_BROADCAST | Server -> Client | `{ type, pos, val, new_version, user_id }` | Broadcasts a confirmed edit to all peers. |
| PRESENCE | Client -> Server | `{ cursor_pos, selection_range }` | Updates the user's visual state for others. |
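A sketch of how a client might frame these messages on the wire (the field names come from the table; the `{ type, payload }` envelope shape is an assumption):

```javascript
// Hypothetical message envelope builder for the protocol above.
function makeMessage(type, payload) {
  return JSON.stringify({ type, payload });
}

// A local edit, submitted against the version the client last saw.
const submit = makeMessage('OP_SUBMIT', {
  type: 'insert',
  pos: 4,
  val: 'fat ',
  base_version: 17,
});

// Each message travels as a single WebSocket text frame.
const parsed = JSON.parse(submit);
console.log(parsed.type); // 'OP_SUBMIT'
```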
🗄️ Data Model: Schema Definitions
Operations Log (PostgreSQL)
We store every single edit as an immutable row. This is the Event Sourcing pattern.
| Table | Column | Type | Notes |
| --- | --- | --- | --- |
| operations | doc_id | UUID | Composite PK with version. |
| operations | version | INT | Monotonically increasing per document. |
| operations | type | ENUM | insert, delete, retain. |
| operations | pos | INT | The transformed position. |
| operations | value | TEXT | The character(s) added. |
| operations | author_id | UUID | For attribution. |
Document Snapshots (PostgreSQL)
Replaying 1 million operations to open a doc is slow. We take snapshots every 100 edits.
| Table | Column | Type | Notes |
| --- | --- | --- | --- |
| snapshots | doc_id | UUID | PK. |
| snapshots | version | INT | The version this snapshot represents. |
| snapshots | content | TEXT | The full serialized document state. |
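Opening a document then becomes "latest snapshot plus replay of newer ops"; a minimal sketch using the two tables above (row shapes are illustrative):

```javascript
// Apply one operations-log row to a document string.
function applyOp(doc, op) {
  if (op.type === 'insert') return doc.slice(0, op.pos) + op.value + doc.slice(op.pos);
  if (op.type === 'delete') return doc.slice(0, op.pos) + doc.slice(op.pos + op.value.length);
  return doc; // 'retain' leaves content unchanged
}

// Rebuild document state from the newest snapshot plus any later ops.
function materialize(snapshot, ops) {
  return ops
    .filter((op) => op.version > snapshot.version)
    .sort((a, b) => a.version - b.version)
    .reduce(applyOp, snapshot.content);
}

const snap = { version: 100, content: 'The cat is happy.' };
const ops = [
  { version: 101, type: 'insert', pos: 4, value: 'fat ' },
  { version: 102, type: 'delete', pos: 0, value: 'The ' },
];
console.log(materialize(snap, ops)); // 'fat cat is happy.'
```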
🧭 Tech Stack & Design Choices
| Component | Choice | Rationale |
| --- | --- | --- |
| Communication | WebSockets (Socket.io) | Low-latency bidirectional sync is mandatory for "live" feel. |
| OT Engine | Node.js | Excellent for high-concurrency I/O and shared JS logic between client/server. |
| Main Database | PostgreSQL | Strong ACID for the operation log; JSONB support for complex doc structures. |
| Pub/Sub | Redis | Lightning-fast message distribution to WebSocket shards. |
| Cache | Redis | Stores active document state and "Presence" (cursor) data with TTL. |
🧠 Design Deep Dive
🛡️ Internals: Client-Side Optimistic Updates
To make the editor feel "instant," the client doesn't wait for the server.
- Apply Locally: User types 'A'. The client updates the UI immediately.
- Buffer: The client puts this 'A' into a `pending_buffer`.
- Wait for ACK: The client sends the op to the server. While waiting for the server's ACK, if another user's edit arrives, the client transforms its own `pending_buffer` against the incoming edit to keep the UI consistent.
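The buffering step can be sketched like this (the transform rule shown is only the simplified insert-shift case, not a complete engine):

```javascript
// When a remote op arrives before our pending ops are ACKed, transform
// every buffered local op so the client's view stays consistent.
function transformAgainst(localOp, remoteOp) {
  if (remoteOp.type === 'insert' && remoteOp.pos <= localOp.pos) {
    return { ...localOp, pos: localOp.pos + remoteOp.value.length };
  }
  return localOp;
}

// Our unACKed insert of 'very ' at position 11...
let pendingBuffer = [{ type: 'insert', pos: 11, value: 'very ' }];

// ...then Alice's 'fat ' insert arrives from the server.
const remote = { type: 'insert', pos: 4, value: 'fat ' };
pendingBuffer = pendingBuffer.map((op) => transformAgainst(op, remote));

console.log(pendingBuffer[0].pos); // 15 — our pending op now targets the right spot
```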
📊 Performance Analysis: The "Viral Doc" Problem
What happens if 5,000 people open the same document?
- The Bottleneck: The single OT Service instance for that doc ID. OT must be processed sequentially per document to ensure consistency.
- Optimization: Document Sharding. We ensure all editors for `doc_A` are routed to `Server_1`, while `doc_B` goes to `Server_2`.
- Fan-out Load: If 5,000 users type simultaneously, Redis Pub/Sub will be flooded. We use Rate Limiting on the presence/cursor updates (e.g., send cursor moves only every 50ms) while keeping text edits high-priority.
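Presence throttling can be as simple as a timestamp check; a sketch (the 50ms window comes from the text above; the helper itself is hypothetical):

```javascript
// Allow at most one presence update per window; text ops bypass this entirely.
function makePresenceThrottler(windowMs, now = Date.now) {
  let lastSent = -Infinity;
  return function shouldSend() {
    const t = now();
    if (t - lastSent >= windowMs) {
      lastSent = t;
      return true;
    }
    return false;
  };
}

// Deterministic demo with a fake clock:
let fakeTime = 0;
const shouldSend = makePresenceThrottler(50, () => fakeTime);

const results = [];
results.push(shouldSend()); // first cursor move goes out
fakeTime = 30;
results.push(shouldSend()); // suppressed: still inside the 50ms window
fakeTime = 60;
results.push(shouldSend()); // window elapsed, goes out

console.log(results); // [ true, false, true ]
```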
🌍 Real-World Applications
Collaborative architectures power more than just text editors:
- Design Tools: Figma (uses a CRDT-inspired variant).
- Whiteboards: Miro or Lucidchart.
- Coding: VS Code Live Share.
- Project Management: Trello or Monday.com live updates.
⚖️ Trade-offs & Failure Modes
- OT vs. CRDT:
- OT (Used here): Requires a central server. Simpler for text, better for conflict resolution in complex formatting.
- CRDT (Conflict-free Replicated Data Types): Can work peer-to-peer (no server). High memory overhead because it must store metadata for every character ever deleted.
- Snapshot Frequency: Snapshotting every 10 edits saves time but costs storage. Snapshotting every 1,000 edits saves storage but makes "doc open" slow. We chose 100 as a middle ground.
- Failure Mode: Network Partition. If a client loses internet, they continue to buffer edits. Upon reconnect, the server must perform a "Bulk Transform" of all buffered edits. If the gap is too large (e.g., 2 hours), we force a document refresh to avoid corruption.
🏗️ Advanced Concepts for Production
- Delta Compression: Instead of sending every character, we group characters typed within a short window (e.g., 100ms) into a single `insert` operation to reduce WebSocket overhead.
- Operational Metadata: Moving beyond text—how do you "OT" a bold tag? We represent formatting as "Attributes" on a range of text, and transforms now include range arithmetic.
- Presence via Redis TTL: Cursor positions are volatile. We store them in Redis with a 5-second TTL. If a user closes their laptop, their cursor disappears automatically as the key expires.
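The Redis behavior (a `SET` with a 5-second expiry) can be mimicked in-memory for illustration; a sketch with an injectable clock (class and method names are made up for this example):

```javascript
// In-memory analogue of Redis `SET key value EX 5` for presence data.
class PresenceStore {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map(); // userId -> { cursor, expiresAt }
  }
  setCursor(userId, cursor) {
    this.entries.set(userId, { cursor, expiresAt: this.now() + this.ttlMs });
  }
  getCursor(userId) {
    const entry = this.entries.get(userId);
    if (!entry || entry.expiresAt <= this.now()) return null; // key "expired"
    return entry.cursor;
  }
}

let fakeTime = 0;
const presence = new PresenceStore(5000, () => fakeTime);
presence.setCursor('alice', { pos: 42 });

fakeTime = 4000;
console.log(presence.getCursor('alice')); // { pos: 42 } — still fresh

fakeTime = 6000;
console.log(presence.getCursor('alice')); // null — laptop closed, cursor gone
```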
🧭 Decision Guide
| Metric | OT (Operational Transformation) | CRDT (Conflict-free Types) |
| --- | --- | --- |
| Central Server | Required (Arbiter). | Not required (P2P possible). |
| Complexity | High (The Matrix). | Very High (Mathematical data structures). |
| Overhead | Low (Small ops). | High (Metadata for every char). |
| Use Case | Google Docs, Office 365. | Figma, Local-first apps. |
🧪 Practical Example: Interview Delivery
In a 45-minute interview, follow this beat:
- The Conflict (5 min): Explain how Alice's insert shifts the indices so Bob's concurrent edit lands in the wrong place.
- The Sequence (10 min): Explain the Ops Log. Every edit is an event with a version number.
- The Scaling (15 min): Explain how Redis Pub/Sub connects multiple WebSocket servers so Alice (on Server 1) can see Bob (on Server 2).
Standard Conclusion: "I chose a centralized OT approach because it provides the most deterministic convergence for rich-text editing. By sharding documents by ID and using Redis for real-time fan-out, we can support high-concurrency documents while maintaining an immutable audit log of every edit in PostgreSQL."
🛠️ ShareDB: Implementing OT in Practice
In a production Node.js environment, we often use ShareDB, a mature OT backend that handles the complex transform logic for us.
```javascript
// Server-side: Initializing a document session
const ShareDB = require('sharedb');
const otText = require('ot-text');

// ShareDB ships with the JSON OT type; the plain-text type must be registered.
ShareDB.types.register(otText.type);

const backend = new ShareDB();

// When a WebSocket connects...
const connection = backend.connect();
const doc = connection.get('documents', 'doc-123');

// subscribe() fetches the snapshot AND streams future ops (fetch() is one-shot).
doc.subscribe(function (err) {
  if (err) throw err;
  if (doc.type === null) {
    // Create the document if it doesn't exist.
    // ot-text documents are plain strings, not objects.
    doc.create('The cat is happy.', 'text');
  }
});

// Listening for ops from other users
doc.on('op', (op, source) => {
  if (source) return; // Ignore ops we submitted ourselves
  console.log('Remote edit received:', op);
});
```
This snippet shows how ShareDB abstracts the "Arbiter" logic, allowing developers to focus on the UI while the library handles the position transformations and database persistence.
📚 Lessons Learned
- The Client is a Mini-Server: The client must be smart enough to transform remote ops against its own pending buffer.
- Sequence Numbers are Life: Without a strict version
V -> V+1flow, you cannot guarantee convergence. - Cursors are not Content: Keep cursor/presence data in Redis, not the primary DB. It’s too "noisy" for a persistent log.
📌 Summary
- OT solves the Position Shift Problem.
- WebSockets + Redis Pub/Sub are the foundation for real-time scale.
- Event Sourcing (Ops Log) provides history and durability for free.
- Snapshots keep document loading fast.
📝 Practice Quiz
Why is Operational Transformation (OT) better than Last-Writer-Wins?
- A) It makes the database smaller.
- B) It allows concurrent edits to be merged rather than one being discarded.
- C) It is easier to implement.

Correct Answer: B
What is the primary role of the OT Service "Arbiter"?
- A) To encrypt the document.
- B) To assign a unique, sequential version number to every edit and resolve conflicts.
- C) To render the document to a PDF.

Correct Answer: B
In the "Alice inserts, Bob deletes" scenario, why must Bob's delete position be transformed?
- A) Because the server clock is wrong.
- B) Because Alice's insert shifted the indices of all subsequent characters.
- C) Because the delete was too slow.

Correct Answer: B
[Open-ended] If you were to add "Offline Mode" to this system, how would you handle a user who makes 1,000 edits while on an airplane and then reconnects? What are the risks of using OT for this?