Disjoint sets are a data structure that extends the functionality of sets by maintaining multiple disjoint subsets of elements, i.e., a disjoint set with two subsets have:
These find considerable use in graph problems, especially in applying the basic union and find operations (union-find algorithm).
Union-find maintains a few structures:
- The array
link[]
contains for each element, the next element in the chain. At the beginning of the chain, thelink[i]
containsi
.- We initialise the
i
th element in the array with valuei
.
- We initialise the
- The array
size[]
contains the size of each subset, given thei
-th element.
There are also two main functions:
- The
find()
algorithm returns the representative for an elementx
, which can be found by following the chain that begins atx
.- Note that two elements are only in the same group if they have the same representatives, i.e., their
find()
results are the same. - Path compression is a simple optimisation. Instead of recursively finding the representative element of the set (with ), we can point the parent directly to the representative element, such that it’ll have (the inverse Ackermann function).
- Note that two elements are only in the same group if they have the same representatives, i.e., their
- The
unite()
function joins the initially disjoint sets that contains elements and . It finds the representatives of the subsets, then connects the smaller set to the larger set.
Competitive programming
We can use union find primarily for finding the number of graphs contained in an input. In C++, we have: