Union find

并查集:用于处理一些不交集(Disjoint sets)的合并及查询问题。

在求最小生成树的 Kruskal 算法 中就用到了并查集。

算法

添加

新添加一个元素时,它的代表是它自己。特别地,初始化 union find 时可用

repres = [i for i in range(n)]

查询

每个集合的代表是集合的根节点。在查询时进行路径压缩。

def find_repr(x):
    if repres[x] == x:
        return x
    else:
        repres[x] = find_repre(repres[x])
        return repres[x]

简化为

def find_repr(x):
    if repres[x] != x:
        repres[x] = find_repre(repres[x])
    return repres[x]

合并

如果两个节点的代表不同,则将它们的代表相连。注意最后一行中是对两节点的代表(ri 和 rj)进行操作,而不是 i 和 j。

def merge(i, j):
    ri = find_repr(i)
    rj = find_repr(j)
    if ri != rj:
        repres[ri] = rj

时间复杂度

单次操作平均时间复杂度 O(α(n))(反阿克曼函数),接近常数时间。

相关题目

LeetCode 547. Number of Provinces

Python 解法:

def findCircleNum(self, isConnected: List[List[int]]) -> int:
    '''
    Union Find solution. Begins by seting the number of provinces to the
    number of cities, then union each pair of the connected cities. If they
    belong to different sets before the union, reduce the number of
    provinces by 1 (unioned two cities into one province).
    '''
    city_num = len(isConnected)
    representative = [i for i in range(city_num)]
                                                                            
    def find_repr(c):
        if representative[c] == c:
            return c
        else:
            representative[c] = find_repr(representative[c])
            return representative[c]
                                                                            
    ret = city_num
    for i in range(city_num):
        for j in range(i + 1, city_num):
            if not isConnected[i][j]:
                continue
            if (ri := find_repr(i)) != (rj := find_repr(j)):
                # Connected cities are in different sets. Union them and
                # reduce the number of provinces
                representative[ri] = rj
                ret -= 1
    return ret

另见