Home/Blog/Java/Advanced Graph Algorithms: Dijkstra, Topological Sort, and Disjoint Set Union (DSU)
JavaIntermediateβ€’12 min readβ€’

Advanced Graph Algorithms: Dijkstra, Topological Sort, and Disjoint Set Union (DSU)

Master shortest paths, dependency resolution, and cycle detection with robust Java solutions.

Abstract Algorithms

Abstract Algorithms

Helping engineers master software engineering topics.

TLDR: Graph algorithms are the backbone of networking, scheduling, and connection analysis. This guide explores Dijkstra's pathfinding, Topological Sorting for dependency resolution, and Disjoint Set Union (DSU) for cycle detection, complete with runnable Java code.


πŸ“– The Network Connectivity Problem: Navigating Relationships

Imagine you are designing the routing engine for a ride-sharing service. You need to calculate the fastest path from a driver's current position to a customer. The city is modeled as a network of intersections (nodes) connected by roads (edges), where each road carries a travel time cost (edge weight).

RouteSegment CostAccumulated Cost
Path A (Direct Highway)Node 1 ──▢ Node 3Cost: 10 mins
Path B (Side Streets)Node 1 ──▢ Node 2 ──▢ Node 3Cost: 2 mins + 3 mins = 5 mins

If you use a simple Breadth-First Search (BFS), the algorithm counts edges (hops) rather than edge weights. BFS selects Path A because it is only 1 hop, sending the driver onto a congested highway that takes 10 minutes.

However, Path B takes only 5 minutes total, despite requiring 2 hops. BFS fails when edge weights are non-uniform. To find the true minimum cost path, we must apply weighted pathfinding algorithms like Dijkstra's Algorithm.

Similarly, when resolving build order dependencies in large codebases (e.g., Maven, Gradle) or tracking merged social networks, simple traversals fail. We need advanced graph algorithms: Topological Sorting to resolve order, and Disjoint Set Union (DSU) to manage dynamic groupings.


πŸ” Core Intuitions: Graphs, Edges, and Paths

A graph $G = (V, E)$ consists of a set of vertices $V$ and a set of edges $E$. Edges can be:

  • Directed vs. Undirected: Directed edges specify a one-way path ($A \to B$); undirected edges allow bidirectional travel ($A \leftrightarrow B$).
  • Weighted vs. Unweighted: Weighted edges carry a cost; unweighted edges represent simple connectivity.
  • Cyclic vs. Acyclic: Cyclic graphs contain paths that loop back to starting nodes; acyclic graphs (like trees or DAGs) contain no loops.

Managing these structures during interviews requires selecting the right representation (Adjacency List vs. Adjacency Matrix) and selecting the correct algorithmic engine.


βš™οΈ How Graph Algorithms Traverse States: Breadth, Depth, and Priority

Graph algorithms explore the state space using different container strategies:

  1. Queue (BFS / Kahn's): Explores neighbors level-by-level. This is optimal for unweighted shortest paths and topological ordering.
  2. Stack / Recursion (DFS): Explores as deep as possible before backtracking. This is ideal for cycle detection and connectivity audits.
  3. Priority Queue (Dijkstra): Greedy exploration of nodes based on cumulative cost. Always processes the lowest-cost reachable state next.

The table below summarizes the operational characteristics of our three target algorithms.

AlgorithmPrimary Use CaseTraversal ParadigmEdge Constraints
Dijkstra'sShortest path in weighted graphsGreedy (Min-Heap based Priority Queue)Edge weights must be non-negative
Topological SortDependency resolution (DAGs)BFS (Kahn's) or DFS post-orderGraph must be a DAG (no cycles)
Disjoint Set (DSU)Dynamic connectivity / Cycle detectionUnion-by-Rank & Path CompressionUndirected graphs

🧠 Deep Dive: Complexities and Dynamic Data Representations

Selecting the correct data structures directly dictates the time and space complexity of graph algorithms.

The Internals of Adjacency Lists and Priority Queues

For sparse graphs (where the number of edges $|E|$ is much smaller than $|V|^2$), representing the graph as an Adjacency List is highly efficient. In Java, this is typically represented as a list of lists: List<List<Edge>>.

Dijkstra's algorithm relies on a Priority Queue (Min-Heap) to extract the minimum distance node.

  • Standard Min-Heap lookup: Extracting the minimum element takes $O(\log V)$ time.
  • Relaxation: Updating distances takes $O(\log V)$ time.

Without a Min-Heap, finding the next closest node requires scanning the entire distance array, raising Dijkstra's complexity to $O(V^2)$. With a binary heap priority queue, the time complexity is optimized to $O((V + E) \log V)$.

Performance Analysis of Graph Traversal and Pathfinding

For DSU, naive union and find operations can degrade to $O(V)$ time if the tree structures become unbalanced (skewed lines). To solve this, we apply two key optimizations:

  1. Path Compression: During the find operation, we update each visited node to point directly to the root parent. This flattens the tree.
  2. Union by Rank: We always attach the shorter tree under the root of the taller tree.

Combining path compression and union by rank guarantees that find and union operations run in near-constant time: $O(\alpha(V))$, where $\alpha$ is the Inverse Ackermann function. For all practical values of $V$ in computer science, $\alpha(V) \le 4$, making DSU operations extremely fast.


πŸ“Š Visualizing Shortest Path Finding: Dijkstra's Step-by-Step State

Let us trace Dijkstra's algorithm finding the shortest path from Node 0 to Node 2 in this weighted graph:

  • Edge $0 \to 1$ (weight 4)
  • Edge $0 \to 2$ (weight 9)
  • Edge $1 \to 2$ (weight 3)

πŸ“Š Dijkstra Pathfinding Process

flowchart LR
    Node0((Node 0)) -->|Wt: 4| Node1((Node 1))
    Node0 -->|Wt: 9| Node2((Node 2))
    Node1 -->|Wt: 3| Node2
    style Node0 fill:#e8f4e8,stroke:#28a745
    style Node2 fill:#fff3cd,stroke:#ffc107

Let's trace the state table showing the evolution of the distance array:

StepCurrent NodeHeap Contents (Pair: Distance, Node)Distance Array dist
0 (Init)-(0, 0)[0, ∞, ∞]
1Node 0(4, 1), (9, 2)[0, 4, 9]
2Node 1(7, 2), (9, 2)[0, 4, 7] (Relaxed Node 2 via 1)
3Node 2(9, 2) (Discarded - larger distance)[0, 4, 7] (Final Path Found)

This step-by-step trace shows how the path through Node 1 is preferred. In Step 2, visiting Node 1 allows us to relax the distance to Node 2 from 9 down to 7 (since $4 + 3 < 9$). The heap extracts (7, 2) next, successfully finding the shortest path.


🌍 Real-World Applications: GPS Navigation, Build Systems, and Network Protocols

Graph algorithms power many standard applications:

  • Google & Apple Maps: Use Dijkstra variants (like A* search with heuristics and contraction hierarchies) to calculate traffic-aware driving routes in milliseconds.
  • Build Engines (Webpack, Maven): Webpack parses dependencies between JavaScript files, generating a directed graph. It uses Topological Sorting to decide the exact compilation sequence, ensuring libraries compile before the files that import them.
  • Network Routing (OSPF): The Open Shortest Path First protocol uses Dijkstra's algorithm to calculate optimal data packet routing tables across global routers.

βš–οΈ Trade-offs and Failure Modes: Infinite Loops, Stack Overflows, and Negative Cycles

When working with graphs, be aware of these failure modes:

  • Negative Cycles (Dijkstra): If a graph contains a negative edge weight cycle (e.g., $A \to B \to A$ with a net negative cost), Dijkstra's algorithm will fall into an infinite loop, continuously trying to lower distances. You must use the Bellman-Ford algorithm if negative edges are present.
  • Unbounded Recursion (DFS): Running recursive DFS on deep, linear graphs (like a linked list graph of 100,000 nodes) will throw a StackOverflowError in Java. Use iterative traversal with custom stacks for large graphs.

🧭 Decision Guide: DFS vs BFS vs Dijkstra vs DSU

Use this guide to map your graph problem to the correct algorithmic strategy.

SituationRecommendationAlternative
Shortest path in weighted graph with positive weightsDijkstra's Algorithm$O((V+E)\log V)$ time with Min-Heap.
Shortest path in unweighted graphBreadth-First Search (BFS)Fast $O(V + E)$ time; avoids heap overhead.
Resolving dependencies or compilation orderTopological Sort (Kahn's)Confirms graph is acyclic and outputs order.
Dynamic connectivity (merging sets, cycle checking)Disjoint Set Union (DSU)Near $O(1)$ operations with path compression.

πŸ§ͺ Implementing Graph Patterns: Three Classic Interview Solutions

Let's implement Dijkstra's shortest path, Kahn's Topological Sort, and DSU Cycle Detection in Java.

1. Dijkstra's Shortest Path

  • Problem: Find the shortest path from a source node to all other nodes in a weighted graph.
  • Implementation: Using adjacency lists and a Java PriorityQueue.
import java.util.*;

public class DijkstraSolver {
    public static class Edge {
        int target, weight;
        public Edge(int target, int weight) {
            this.target = target;
            this.weight = weight;
        }
    }

    public static class NodePair implements Comparable<NodePair> {
        int node, distance;
        public NodePair(int node, int distance) {
            this.node = node;
            this.distance = distance;
        }
        @Override
        public int compareTo(NodePair other) {
            return Integer.compare(this.distance, other.distance);
        }
    }

    public int[] findShortestPaths(List<List<Edge>> graph, int source, int n) {
        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;

        PriorityQueue<NodePair> minHeap = new PriorityQueue<>();
        minHeap.offer(new NodePair(source, 0));

        while (!minHeap.isEmpty()) {
            NodePair current = minHeap.poll();
            int u = current.node;

            // Skip if we've already found a shorter path to this node
            if (current.distance > dist[u]) continue;

            for (Edge edge : graph.get(u)) {
                int v = edge.target;
                int weight = edge.weight;

                if (dist[u] != Integer.MAX_VALUE && dist[u] + weight < dist[v]) {
                    dist[v] = dist[u] + weight;
                    minHeap.offer(new NodePair(v, dist[v]));
                }
            }
        }
        return dist;
    }
}

2. Topological Sort (Kahn's BFS-based Algorithm)

  • Problem: Find a linear ordering of vertices such that for every directed edge $u \to v$, vertex $u$ comes before $v$.
  • Implementation: Using in-degree array and a standard BFS Queue.
import java.util.*;

public class TopologicalSorter {
    public int[] sort(List<List<Integer>> graph, int n) {
        int[] inDegree = new int[n];
        for (int u = 0; u < n; u++) {
            for (int v : graph.get(u)) {
                inDegree[v]++;
            }
        }

        Queue<Integer> queue = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            if (inDegree[i] == 0) {
                queue.offer(i);
            }
        }

        int[] order = new int[n];
        int idx = 0;

        while (!queue.isEmpty()) {
            int u = queue.poll();
            order[idx++] = u;

            for (int v : graph.get(u)) {
                inDegree[v]--;
                if (inDegree[v] == 0) {
                    queue.offer(v);
                }
            }
        }

        // If index doesn't reach N, the graph contains a cycle (cannot sort)
        if (idx != n) {
            return new int[0];
        }
        return order;
    }
}

3. Disjoint Set Union (DSU) for Cycle Detection

  • Problem: Detect if an undirected graph contains a cycle.
  • Implementation: Using path compression and union-by-rank.
public class DisjointSetUnion {
    private final int[] parent;
    private final int[] rank;

    public DisjointSetUnion(int size) {
        parent = new int[size];
        rank = new int[size];
        for (int i = 0; i < size; i++) {
            parent[i] = i; // Every node is its own parent initially
            rank[i] = 0;
        }
    }

    // Find with Path Compression
    public int find(int i) {
        if (parent[i] == i) {
            return i;
        }
        parent[i] = find(parent[i]); // Path compression step
        return parent[i];
    }

    // Union by Rank. Returns false if union is impossible (nodes already connected)
    public boolean union(int i, int j) {
        int rootI = find(i);
        int rootJ = find(j);

        if (rootI == rootJ) {
            return false; // Already in the same set (cycle detected if adding an edge here)
        }

        if (rank[rootI] < rank[rootJ]) {
            parent[rootI] = rootJ;
        } else if (rank[rootI] > rank[rootJ]) {
            parent[rootJ] = rootI;
        } else {
            parent[rootJ] = rootI;
            rank[rootI]++;
        }
        return true;
    }
}

πŸ› οΈ Java's PriorityQueue: The Core Engine for Shortest Paths

Java's standard library provides java.util.PriorityQueue, which is implemented as a balanced binary heap. It guarantees $O(\log N)$ time for enqueueing (offer()) and dequeueing (poll()) operations.

When implementing Dijkstra's algorithm, always ensure your custom node wrapper implements the Comparable interface (as shown in our code block) or pass a custom Comparator during initialization:

// Alternate initialization using lambda comparator
PriorityQueue<NodePair> queue = new PriorityQueue<>((a, b) -> Integer.compare(a.distance, b.distance));

This prevents class cast errors and guarantees that the queue processes the minimum distance node first, ensuring correct greedy execution.

For a full deep-dive on PriorityQueue heap mechanics, see our planned follow-up post.


πŸ“š Lessons Learned: Common Graph Coding Mistakes to Avoid

Avoid these standard bugs when coding graph algorithms during interviews:

  1. Forgetting Visited Check in Dijkstra: In Dijkstra, always check if the current polled distance is larger than the recorded distance (current.distance > dist[u]). Failing to skip these stale queue entries will result in redundant edge scans, degrading performance to $O(V \cdot E)$.
  2. Infinite Loops on Cyclic Graphs: When writing DFS or BFS, make sure you track visited nodes using a boolean[] visited or Set<Integer> visited. Failing to do so on cyclic graphs will lock the program in an infinite loop.
  3. Modifying In-Degree Array in Kahn's Sort: In Topological Sort, ensure you decrement in-degrees relative to the actual graph nodes, not indices. Always use the original adjacency list to lookup neighbors.

πŸ“Œ Summary: The Graph Algorithm Rulebook

  • Weighted Paths: Use Dijkstra's algorithm with a Min-Heap.
  • Dependency Resolution: Use Topological Sorting (Kahn's algorithm) on directed acyclic graphs (DAGs).
  • Dynamic Connectivity: Use DSU with Path Compression and Union by Rank for optimal $O(\alpha(N))$ time.
  • Loop Prevention: Always track visited nodes in cyclic graphs to avoid infinite execution paths.
  • Stale Heap Checks: Skip stale priority queue nodes to keep Dijkstra's time complexity at $O((V+E)\log V)$.

AI-generated article quiz

Test your understanding

🧠

Ready to test what you just learned?

Generate four focused questions from this article. Answers include immediate explanations.

Guided series path

Data Structures and Algorithms

View all lessons β†’
Lesson 27 of 27

Reader feedback

Was this article useful?

Rate it if it helped, then continue with the next deep dive when you are ready.

Sign in to save your rating.