Is Python's uuid1 sequential, like timestamps?

孤城傲影 2020-12-19 00:34

The Python docs state that uuid1 uses the current time to form the UUID value, but I could not find a reference that guarantees uuid1 is sequential.

>>> impor         


        
5 answers
  •  轻奢々 (original poster)
    2020-12-19 00:55

    Calling uuid.uuid1() without arguments gives non-sequential results (see the answer by @basil-bourque), but it can easily be made sequential by passing the clock_seq or node argument. In that case uuid1 uses the Python implementation, which guarantees that the timestamp part of the UUID is unique and sequential within the current process:

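    import time
    from random import getrandbits
    from uuid import uuid1, getnode

    # Pick both values once per process and reuse them for every UUID.
    _my_clock_seq = getrandbits(14)  # clock_seq is a 14-bit field
    _my_node = getnode()


    def sequential_uuid(node=None):
        # Passing clock_seq makes uuid1 use the Python implementation.
        return uuid1(node=node, clock_seq=_my_clock_seq)


    def alt_sequential_uuid(clock_seq=None):
        # Same effect, but fixing node instead of clock_seq.
        return uuid1(node=_my_node, clock_seq=clock_seq)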
    
    
    
    if __name__ == '__main__':
        from itertools import count

        old_n = uuid1()  # "Native"
        old_s = sequential_uuid()  # Sequential

        # Index of the first out-of-order pair from plain uuid1(), if any.
        native_conflict_index = None

        t_0 = time.time()

        for x in count():
            new_n = uuid1()
            new_s = sequential_uuid()

            # Use "is None" so that a conflict at index 0 is still recorded.
            if old_n > new_n and native_conflict_index is None:
                native_conflict_index = x

            if old_s >= new_s:
                print("Oops: non-sequential results for `sequential_uuid()`")
                break

            # Stop after enough tries and 30 seconds, or once we have run
            # twice as long as it took plain uuid1() to go out of order.
            if (x >= 10 * 0x3fff and time.time() - t_0 > 30) or (
                native_conflict_index is not None and x > 2 * native_conflict_index
            ):
                print('No issues for `sequential_uuid()`')
                break

            old_n = new_n
            old_s = new_s

        print(f'Conflicts for `uuid.uuid1()`: {native_conflict_index is not None}')
        print(f"Tries: {x}")
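
One way to check the timestamp claim directly is to compare the `time` attribute of each UUID, which exposes the 60-bit timestamp. Comparing whole UUID objects uses their 128-bit integer value, whose most significant bits are only the low 32 bits of the timestamp, so the two orderings are not the same thing. The sketch below is a minimal check under the assumption that CPython's Python code path is used whenever `clock_seq` is passed; the names `CLOCK_SEQ` and `timestamps_strictly_increase` are made up for this example.

```python
from random import getrandbits
from uuid import uuid1

# Passing clock_seq is what routes uuid1() away from the system generator.
CLOCK_SEQ = getrandbits(14)


def timestamps_strictly_increase(n=10_000):
    """Return True if n consecutive UUIDs have strictly increasing .time."""
    previous = uuid1(clock_seq=CLOCK_SEQ).time
    for _ in range(n):
        current = uuid1(clock_seq=CLOCK_SEQ).time
        if current <= previous:
            return False
        previous = current
    return True


if __name__ == "__main__":
    print(timestamps_strictly_increase())
```

This only covers a single thread in a single process, which is the scope of the claim above.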
    
    

    Issues with multiple processes

    However, if you run several parallel processes on the same machine:

    • node, which defaults to uuid.getnode(), will be the same for all the processes;
    • clock_seq has a small chance of being the same in two processes (1 in 16384 for any given pair).
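
To put a number on the clock_seq risk, the sketch below computes the chance that at least two processes end up with the same value. It assumes each process picks its clock_seq independently and uniformly from the 14-bit range (as `getrandbits(14)` does); the function name is hypothetical.

```python
CLOCK_SEQ_VALUES = 2 ** 14  # 16384 possible 14-bit clock_seq values


def clock_seq_collision_probability(processes):
    """Chance that at least two of `processes` random clock_seq picks coincide."""
    p_all_distinct = 1.0
    for i in range(processes):
        # The i-th process must avoid the i values already taken.
        p_all_distinct *= (CLOCK_SEQ_VALUES - i) / CLOCK_SEQ_VALUES
    return 1.0 - p_all_distinct


if __name__ == "__main__":
    for n in (2, 10, 100):
        print(n, clock_seq_collision_probability(n))
```

For two processes this reduces to exactly 1/16384, and it grows quickly with the number of processes, as in the birthday problem.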

    That might lead to conflicts. This is a general concern when using uuid.uuid1 in parallel processes on the same machine, unless you have access to SafeUUID from Python 3.7.

    If you also make sure to set node to a unique value for each parallel process that runs this code, then conflicts should not happen.

    Even if you use SafeUUID and set a unique node, IDs generated in different processes can still be non-sequential.

    If some lock-related overhead is acceptable, you can store clock_seq in external atomic storage (for example, a "locked" file) and increment it on each call. This lets all parallel processes share the same node value and also makes the IDs sequential. When all parallel processes are subprocesses created with multiprocessing, clock_seq can be "shared" using multiprocessing.Value.
