Union-Find LeetCode Pattern Guide — Disjoint Sets, Connectivity, and Kruskal's
A practical Union-Find LeetCode pattern guide for disjoint sets, connectivity checks, cycle detection, component counting, and Kruskal's minimum spanning tree problems.
This Union-Find LeetCode pattern guide is for problems where the question is not “what is the shortest path?” but “which things belong to the same group?” Disjoint sets, connectivity, and Kruskal's algorithm all use the same core data structure: maintain components while merging pairs and answering whether two nodes share a representative. Once you recognize that a graph problem only needs component membership, Union-Find is usually simpler and faster than repeatedly running DFS or BFS.
Interviewers like Union-Find because the implementation is compact, but the modeling step is subtle. You must decide what the nodes are, what an edge means, and what each successful union changes.
What Union-Find stores
Union-Find, also called Disjoint Set Union or DSU, stores a partition of items into non-overlapping groups. It supports two operations:
- `find(x)`: return the representative, or root, of the set containing `x`.
- `union(a, b)`: merge the sets containing `a` and `b`; return whether a merge actually happened.
With path compression and union by size or rank, both operations are effectively constant time for interview constraints.
A reliable Python template:
```python
class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.components = n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.components -= 1
        return True
```
The union return value is extremely useful. False often means “this edge creates a cycle,” “these accounts were already connected,” or “this equation is already implied.”
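A small self-contained demonstration of that signal, using a compact DSU (path halving instead of recursive compression, and no size tracking, to keep it short); the edge list is made up for illustration:

```python
# Compact DSU used to show how the union return value flags
# redundant edges. The edge list below is a hypothetical example.
class DSU:
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already connected: cycle or duplicate edge
        self.parent[rb] = ra
        return True

dsu = DSU(4)
results = [dsu.union(a, b) for a, b in [(0, 1), (1, 2), (0, 2), (2, 3)]]
print(results)  # [True, True, False, True] -- (0, 2) was redundant
```

The single `False` pinpoints the exact edge that closed a cycle, which is the whole answer to problems like "Redundant Connection."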
When to reach for Union-Find
Use DSU when the problem has one of these shapes:
| Problem signal | DSU interpretation | |---|---| | “Are these nodes connected?” | Compare roots after unions | | “How many connected components?” | Start with n, decrement on successful union | | “Does adding this edge create a cycle?” | Cycle if both endpoints already share a root | | “Merge accounts / emails / synonyms” | Map each identifier to a node and union related identifiers | | “Regions cut by slashes” | Model subcells as nodes and union open edges | | “Minimum spanning tree” | Sort edges and union only cross-component edges | | “Satisfiability of equality equations” | Union equals, then reject unequal pairs with same root |
Do not use DSU when you need an actual path, shortest distance, topological order, or dynamic deletions. Union-Find is excellent at merging; it is weak at un-merging.
Connectivity and component counting
For a basic “number of connected components” problem, initialize each node as its own component. Each successful union reduces the component count by one.
```python
def count_components(n, edges):
    dsu = DSU(n)
    for a, b in edges:
        dsu.union(a, b)
    return dsu.components
```
The invariant is simple: components equals the number of distinct roots, so no final DFS is needed. Do not count unique parent entries directly, though, unless every path has been compressed: some entries may still point at intermediate nodes rather than roots.
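A minimal sketch of that stale-pointer pitfall, using a bare parent array without compression so the intermediate pointer stays visible (the three-node example is made up):

```python
# Minimal DSU (no compression, no size tracking) to show why raw
# parent entries can be stale after unions.
parent = list(range(3))

def find(x):
    while parent[x] != x:
        x = parent[x]
    return x  # deliberately no compression, so the stale pointer survives

def union(a, b):
    ra, rb = find(a), find(b)
    if ra == rb:
        return False
    parent[rb] = ra
    return True

union(1, 2)  # parent = [0, 1, 1]
union(0, 1)  # parent = [0, 0, 1] -- parent[2] points at 1, not at root 0
naive = len(set(parent))                    # 2: wrong, counts stale entry
correct = len({find(x) for x in range(3)})  # 1: right, counts real roots
print(naive, correct)  # prints "2 1"
```

Calling `find` on every node before counting, or simply tracking `components` as in the template, avoids the problem entirely.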
For “valid tree,” use two facts: a tree with n nodes must have exactly n - 1 edges, and no edge can connect nodes already in the same component.
```python
def valid_tree(n, edges):
    if len(edges) != n - 1:
        return False
    dsu = DSU(n)
    for a, b in edges:
        if not dsu.union(a, b):
            return False
    return True
```
This avoids overcomplicating the problem. The edge count plus no cycle implies connectedness for an undirected graph.
Kruskal's algorithm: greedy plus DSU
Kruskal's minimum spanning tree algorithm is the most famous Union-Find use case. Sort edges by weight. Add the cheapest edge that connects two different components. Skip edges whose endpoints are already connected.
```python
def kruskal(n, edges):
    # edges are (weight, a, b)
    dsu = DSU(n)
    total = 0
    used = 0
    for w, a, b in sorted(edges):
        if dsu.union(a, b):
            total += w
            used += 1
            if used == n - 1:
                return total
    return None  # graph was disconnected
```
The proof uses the cut property: for any cut of the graph, the cheapest edge crossing that cut is safe to include in some minimum spanning tree. DSU tells you whether an edge crosses between components or stays inside a component already connected by cheaper edges.
In LeetCode-style problems, Kruskal often appears as “connect all points with minimum cost,” “optimize water distribution,” or “minimum cost to connect cities.” The expensive part may be building edges, not the DSU itself.
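A sketch of that edge-building step for the "connect all points" shape, assuming Manhattan distance between points; the DSU loop is the same Kruskal structure as above, inlined so the example runs standalone:

```python
from itertools import combinations

# Sketch for "connect all points with minimum cost" assuming Manhattan
# distance. Building the O(n^2) edge list dominates the cost; the DSU
# part is the standard Kruskal loop.
def min_cost_connect(points):
    n = len(points)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    edges = sorted(
        (abs(px - qx) + abs(py - qy), i, j)
        for (i, (px, py)), (j, (qx, qy)) in combinations(enumerate(points), 2)
    )
    total = used = 0
    for w, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra
            total += w
            used += 1
            if used == n - 1:
                break
    return total

print(min_cost_connect([(0, 0), (2, 2), (3, 10), (5, 2), (7, 0)]))  # 20
```

For very large point sets, generating all pairs is the bottleneck; the DSU work stays near-constant per edge.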
Modeling non-obvious nodes
Many DSU problems are hard because the nodes are not given as 0..n-1.
Accounts merge
Each email can be a node. For every account, union the first email with each other email in that account. Then group emails by root and attach the account name.
The trap is that names are not unique identifiers. Two people can share a name, while one person can have multiple emails. Model the stable identifier, not the display field.
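That modeling can be sketched with a dict-based DSU so string keys work directly; the input shape follows the LeetCode problem (`[name, email1, email2, ...]`), and the `merge_accounts` name and helper structure are illustrative:

```python
from collections import defaultdict

# Emails are the nodes; names are display data attached afterwards.
def merge_accounts(accounts):
    parent = {}

    def find(x):
        parent.setdefault(x, x)  # lazily register string nodes
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    owner = {}
    for name, *emails in accounts:
        for email in emails:
            owner[email] = name
            parent[find(email)] = find(emails[0])  # union with first email
    groups = defaultdict(list)
    for email in owner:
        groups[find(email)].append(email)
    return [[owner[root]] + sorted(members) for root, members in groups.items()]
```

Each account unions its emails through the first one, and the final pass groups every email by its root, so accounts sharing any email collapse into one group.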
Equations possible
For equations like a == b and a != b, first union all equalities. Then scan inequalities and reject any pair with the same root. Order matters because an inequality cannot be evaluated safely until all implied equalities are known.
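That two-pass order can be sketched as follows, assuming the LeetCode input encoding where each equation is a four-character string like "a==b" or "a!=b":

```python
# Union every equality first, then check every inequality against the
# merged components. Equation strings follow the "a==b" / "a!=b" shape.
def equations_possible(equations):
    parent = {c: c for eq in equations for c in (eq[0], eq[3])}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for eq in equations:
        if eq[1] == '=':  # "a==b": merge both variables
            parent[find(eq[0])] = find(eq[3])
    # inequalities last: all implied equalities are known by now
    return all(find(eq[0]) != find(eq[3])
               for eq in equations if eq[1] == '!')
```

Note that `a != a` fails automatically, because both sides always share a root.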
Grid islands
You can map cell (r, c) to r * cols + c. Union adjacent land cells. This is often simpler for dynamic “number of islands as land is added” because each new land cell starts as one component and unions with existing land neighbors.
```python
idx = r * cols + c
```
Keep a seen set for active land cells so water nodes do not accidentally count as components.
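A sketch of the dynamic variant ("Number of Islands II" shape), combining the index mapping with an active set so only added land cells count; the function name and shapes are illustrative:

```python
# Each added land cell starts as one new island, then unions with any
# already-active land neighbors; `active` is the seen set for land cells.
def islands_after_additions(rows, cols, positions):
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    active = set()
    count = 0
    counts = []
    for r, c in positions:
        idx = r * cols + c
        if idx not in active:  # ignore duplicate additions
            active.add(idx)
            parent[idx] = idx
            count += 1  # new one-cell island
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                nidx = nr * cols + nc
                if 0 <= nr < rows and 0 <= nc < cols and nidx in active:
                    ra, rb = find(idx), find(nidx)
                    if ra != rb:
                        parent[ra] = rb
                        count -= 1  # merged two islands into one
        counts.append(count)
    return counts

print(islands_after_additions(3, 3, [(0, 0), (0, 1), (1, 2), (2, 1)]))
# [1, 1, 2, 3]
```

Because water cells never enter `parent` or `active`, they can never inflate the island count.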
Path compression and union by size explained
Without optimization, find can degrade into chasing a long chain. Path compression flattens that chain by making each visited node point directly to the root. Union by size attaches the smaller tree under the larger tree. Together, they make DSU fast enough for very large inputs.
In an interview, you can say:
“Path compression rewrites parents during find, and union by size prevents tall trees. Amortized time is inverse Ackermann, which is effectively constant for practical input sizes.”
You usually do not need to derive inverse Ackermann. Knowing why the optimizations work is enough.
Common traps
- Counting parents without calling find. Intermediate parent pointers may be stale.
- Decrementing the component count on every edge instead of only on successful unions. Duplicate edges should not reduce the count.
- Using DSU for directed reachability. DSU models undirected connectivity or equivalence, not direction.
- Processing inequalities too early. Union all equalities before checking contradictions.
- Assuming display names are identifiers. For account merge, emails are the reliable nodes.
- Ignoring 1-indexed input. Many graph problems label cities from 1 to n; convert or size arrays accordingly.
- Trying to delete edges. Standard DSU does not support efficient split operations.
Decision rules in interviews
Ask yourself:
- Are relationships symmetric? If a connects to b, does b connect to a?
- Do I only need group membership, component count, or cycle detection?
- Are operations mostly additions, not deletions?
- Can each item be mapped to an integer or dictionary key?
- Does a failed union mean a cycle or redundancy?
If the answers are mostly yes, DSU is probably a strong candidate.
How to explain Union-Find clearly
A concise interview explanation:
“I’ll represent each connected component by a root. find returns the root with path compression. union merges two roots and returns false if they were already connected. That gives us component count and cycle detection in near-constant time per edge.”
For Kruskal:
“I’ll sort edges by cost and take the cheapest edge that connects two different components. Union-Find enforces the no-cycle rule. The greedy choice is safe by the cut property: the cheapest edge crossing a component boundary can be part of an MST.”
Resume language for DSU work
Make the pattern concrete:
- “Modeled entity resolution as disjoint-set union over stable identifiers, reducing repeated graph traversals to near-constant connectivity checks.”
- “Implemented Kruskal-style cost optimization using Union-Find to prevent cycles and track connected components.”
- “Built dynamic component counting for grid updates with path compression and union by size.”
Union-Find is not a general graph hammer. It is a specialized tool for equivalence and connectivity. When you can reduce a problem to merging sets and asking whether two things share a root, it turns complex graph state into a small, reliable abstraction.
Weighted and labeled Union-Find variations
Some advanced interview problems attach information to the relationship between a node and its root. “Evaluate Division” is the classic example: if a / b = 2.0, you need not only connectivity but also the ratio between connected variables. A weighted DSU stores a parent plus a multiplier from each node to its parent. During path compression, it updates both the parent and the accumulated weight. The same idea can represent parity, distance modulo a number, or “same group versus opposite group” constraints.
You do not need to lead with weighted DSU unless the prompt requires relationship values. The decision rule is: if the query asks only whether two things are connected, plain DSU is enough. If the query asks what value relates them, the root must carry extra metadata.
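A sketch of the weighted variant for "Evaluate Division": here `weight[x]` stores the ratio of `x`'s value to its parent's value, and `find` folds those ratios together while compressing. The function mirrors the LeetCode signature, but the internal names are illustrative:

```python
# Weighted DSU: weight[x] = value(x) / value(parent[x]); after find
# compresses a path, weight[x] = value(x) / value(root).
def calc_equation(equations, values, queries):
    parent, weight = {}, {}

    def find(x):
        if parent[x] != x:
            root, w = find(parent[x])
            parent[x] = root
            weight[x] *= w  # fold the ratio along the compressed path
        return parent[x], weight[x]

    for (a, b), v in zip(equations, values):
        for node in (a, b):
            if node not in parent:
                parent[node], weight[node] = node, 1.0
        ra, wa = find(a)
        rb, wb = find(b)
        if ra != rb:  # a / b = v, so value(ra) / value(rb) = v * wb / wa
            parent[ra] = rb
            weight[ra] = v * wb / wa

    out = []
    for a, b in queries:
        if a not in parent or b not in parent:
            out.append(-1.0)  # unknown variable
            continue
        ra, wa = find(a)
        rb, wb = find(b)
        out.append(wa / wb if ra == rb else -1.0)
    return out
```

The only changes from plain DSU are the extra weight updates in `find` and `union`; connectivity logic is untouched, which is a good line to say out loud in an interview.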
DSU versus DFS tradeoff
For a static graph with one component-count query, DFS is often just as clear. Union-Find shines when edges arrive incrementally, when you need to detect the exact redundant edge, or when many connectivity queries follow a batch of unions. In an interview, it is fine to say, “DFS would work, but DSU gives simpler incremental merging and a direct cycle signal through failed union.” That kind of tradeoff answer sounds more senior than forcing DSU everywhere.
Testing edge cases
Test duplicate edges, self-loops, disconnected nodes, and one-node graphs. A self-loop should usually make union(a, a) return false, which may indicate a cycle. Duplicate edges should not decrement the component count twice. If labels are strings, test two identifiers that appear only in queries and never in union input; you may need to initialize them lazily or return false because they are unknown.
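Those checks can be written as quick assertions against the DSU template; it is re-declared here verbatim so the snippet runs standalone:

```python
# Edge-case checks for the DSU template: self-loops, duplicate edges,
# isolated nodes, and a one-node graph.
class DSU:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n
        self.components = n

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])
        return self.parent[x]

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        self.components -= 1
        return True

dsu = DSU(3)
assert not dsu.union(1, 1)      # self-loop: both endpoints share a root
assert dsu.union(0, 1)
assert not dsu.union(0, 1)      # duplicate edge: no second decrement
assert dsu.components == 2      # node 2 is still isolated
assert DSU(1).components == 1   # one-node graph works out of the box
print("edge cases ok")
```
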
Related guides
- Graph Algorithms LeetCode Pattern Guide — DFS, BFS, Dijkstra, and Union-Find — A practical graph algorithms LeetCode pattern guide covering DFS, BFS, Dijkstra, union-find, topological sort, grid graphs, and interview explanation patterns. Use it to identify the graph shape before choosing the algorithm.
- Backtracking LeetCode Pattern Guide — Permutations, Combinations, and Subsets — A practical backtracking LeetCode pattern guide with templates, decision rules, duplicate-handling tricks, pruning advice, and interview talk tracks for permutations, combinations, subsets, and constraint problems.
- BFS LeetCode Pattern Guide — Shortest-Path, Level-Order, and Multi-Source — A BFS LeetCode pattern guide for recognizing queue-based problems, implementing level-order traversal, solving unweighted shortest paths, using multi-source BFS, and avoiding common visited-state bugs.
- Binary Search LeetCode Pattern Guide — Bounds, Rotated Arrays, and Search-on-Answer — A binary search LeetCode pattern guide for exact search, lower/upper bounds, rotated arrays, and search-on-answer problems. Includes invariants, templates, traps, and interview-ready explanations.
- Bit Manipulation LeetCode Pattern Guide — XOR Tricks, Masks, and Counting Bits — A practical bit manipulation LeetCode pattern guide covering XOR tricks, masks, counting bits, subset enumeration, flags, and the interview pitfalls around signed integers.
