Good Style in Python Objects

春和景丽 2021-01-19 23:42

Most of my programming prior to Python was in C++ or Matlab. I don't have a degree in CS (almost completed a PhD in physics), but have done some courses and a good amount o

5 Answers
  •  野性不改
    2021-01-20 00:34

    One option is to use dictionaries to store the information you need about a data item, rather than attributes on the item directly. For instance, rather than referring to d.size you could refer to size[d] (where size is a dict instance). This requires that your data items be hashable, but they don't need to allow attributes to be assigned on them.
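
To make the idea concrete, here is a minimal sketch. The names `data` and `size` are only illustrative and are not taken from your code. Built-in values such as ints and tuples reject attribute assignment, but they are hashable, so a dict keyed by the item can hold the same information:

```python
# Illustrative names only: `data` and `size` are assumptions for this sketch.
data = [(0, 1), (2, 3), 7]

# Built-in objects such as ints and tuples do not accept new attributes.
try:
    data[0].size = 1
except AttributeError:
    attribute_assignment_failed = True
else:
    attribute_assignment_failed = False

# A dict keyed by the item stores the same information externally.
size = {d: 1 for d in data}
size[(0, 1)] += 1  # plays the role of `d.size += 1`
```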

    Here's a straightforward translation of your current code to use this style:

    class UnionFind:
        def __init__(self,data):
            self.data = data
            self.size = {d:1 for d in data}
            self.leader = {d:d for d in data}
            self.next = {d:None for d in data}
            self.last = {d:d for d in data}
    
        def find(self,element):
            return self.leader[element]
    
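        def union(self,leader1,leader2):
            if leader1 == leader2: # already one set; merging it with itself would corrupt the next/last chain
                return
            if self.size[leader1] >= self.size[leader2]: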
                newleader = leader1
                oldleader = leader2
            else:
                newleader = leader2
                oldleader = leader1
    
            self.size[newleader] = self.size[leader1] + self.size[leader2]
    
            d = oldleader
            while d is not None:
                self.leader[d] = newleader
                d = self.next[d]
    
            self.next[self.last[newleader]] = oldleader
            self.last[newleader] = self.last[oldleader]
    

    A minimal test case:

    >>> uf = UnionFind(list(range(100)))
    >>> uf.find(10)
    10
    >>> uf.find(20)
    20
    >>> uf.union(10,20)
    >>> uf.find(10)
    10
    >>> uf.find(20)
    10
    

    Beyond this, you could also consider changing your implementation a bit to require less initialization. Here's a version that doesn't do any initialization (it doesn't even need to know the set of data it's going to work on). It uses path compression and union-by-rank rather than always maintaining an up-to-date leader value for all members of a set. With both techniques, each operation takes nearly constant amortized time (bounded by the inverse Ackermann function), whereas your current union has to relabel every member of the smaller set. So it should be asymptotically faster than your current code, especially if you're doing a lot of unions:

    class UnionFind:
        def __init__(self):
            self.rank = {}
            self.parent = {}
    
        def find(self, element):
            if element not in self.parent: # leader elements are not in `parent` dict
                return element
            leader = self.find(self.parent[element]) # search recursively
            self.parent[element] = leader # compress path by saving leader as parent
            return leader
    
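        def union(self, leader1, leader2):
            if leader1 == leader2: # a leader must never become its own parent, or find() would recurse forever
                return
            rank1 = self.rank.get(leader1,1)
            rank2 = self.rank.get(leader2,1)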
    
            if rank1 > rank2: # union by rank
                self.parent[leader2] = leader1
            elif rank2 > rank1:
                self.parent[leader1] = leader2
            else: # ranks are equal
                self.parent[leader2] = leader1 # favor leader1 arbitrarily
                self.rank[leader1] = rank1+1 # increment rank
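
If you would rather not rely on recursion, or want to call union on arbitrary elements instead of leaders, one possible variant looks like the sketch below. It keeps the same `rank` and `parent` dicts, but the class name `IterativeUnionFind`, the two-pass loop in `find`, and the return value of `union` are my own assumptions for illustration, not something your code requires:

```python
class IterativeUnionFind:
    """Hypothetical variant: iterative find, and union accepts any two elements."""

    def __init__(self):
        self.rank = {}
        self.parent = {}

    def find(self, element):
        # First pass: walk up to the leader (an element with no `parent` entry).
        leader = element
        while leader in self.parent:
            leader = self.parent[leader]
        # Second pass: point every element on the path directly at the leader.
        while element != leader:
            next_element = self.parent[element]
            self.parent[element] = leader
            element = next_element
        return leader

    def union(self, a, b):
        # Look up the leaders here so callers can pass any elements.
        leader1, leader2 = self.find(a), self.find(b)
        if leader1 == leader2:
            return leader1  # already in the same set
        rank1 = self.rank.get(leader1, 1)
        rank2 = self.rank.get(leader2, 1)
        if rank1 < rank2:
            # Make leader1 the higher-ranked leader.
            leader1, leader2 = leader2, leader1
        self.parent[leader2] = leader1
        if rank1 == rank2:
            self.rank[leader1] = rank1 + 1  # equal ranks: the merged tree grows
        return leader1


uf = IterativeUnionFind()
uf.union(10, 20)
uf.union(20, 30)  # 20 is not a leader here; union finds its leader first
same_set = uf.find(10) == uf.find(30)
separate = uf.find(10) != uf.find(40)
```

Because `find` uses loops, it will not hit Python's recursion limit even if a long parent chain exists, and repeated or non-leader arguments to `union` are harmless.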
    
