Python Pool apply_async and map_async do not block on full queue

被撕碎了的回忆 2020-12-29 11:40

I am fairly new to Python. I am using the multiprocessing module to read lines of text from stdin, convert them in some way, and write them to a database. Here's a

4 answers
  •  死守一世寂寞
    2020-12-29 12:27

    Just in case someone ends up here, this is how I solved the problem: I stopped using multiprocessing.Pool. Here is how I do it now:

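    import multiprocessing
    import sys

    # number of concurrent processes that insert data into the database
    processes = multiprocessing.cpu_count() * 2

    # bounded batch queue: put() blocks while the queue is full
    queue = multiprocessing.Queue(processes * 2)

    # start the worker processes (insert is the worker function described below)
    workers = [multiprocessing.Process(target=insert, args=(queue,))
               for _ in range(processes)]
    for worker in workers:
        worker.start()

    # fill the queue with batches of 10000 lines
    batch = []
    for i, content in enumerate(sys.stdin):
        batch.append(content)
        if len(batch) >= 10000:
            queue.put((batch, i + 1))
            batch = []
    if batch:
        queue.put((batch, i + 1))

    # stop the workers with one poison pill each
    for _ in range(processes):
        queue.put((None, None))

    # wait until every worker has drained its share of the queue
    for worker in workers:
        worker.join()

    print("all done.")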
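What makes this approach throttle the reader is the size limit passed to multiprocessing.Queue: once the queue holds that many batches, put() waits for a worker to take one. The following is a minimal sketch of that behaviour, not part of the original answer. The helper name put_or_report and the 0.1 second timeout are made up for illustration; the timeout is there only so the demo returns instead of blocking forever.

```python
import multiprocessing
import queue  # only needed for the queue.Full exception class


def put_or_report(q, item, timeout=0.1):
    """Hypothetical helper: try to enqueue item.

    Returns False if the queue stayed full for `timeout` seconds.
    """
    try:
        q.put(item, timeout=timeout)
        return True
    except queue.Full:
        return False


if __name__ == "__main__":
    q = multiprocessing.Queue(2)          # room for two batches
    print(put_or_report(q, "batch-1"))    # True
    print(put_or_report(q, "batch-2"))    # True
    print(put_or_report(q, "batch-3"))    # False: full, and nobody is consuming
```

Without the timeout argument, the third put() would simply block until a consumer calls get(), which is the behaviour the answer relies on.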
    

    In the insert function, the processing of each batch is wrapped in a loop that pulls from the queue until it receives the poison pill:

    def insert(queue):
        while True:
            batch, end = queue.get()
            if batch is None and end is None:
                break  # poison pill: no more work for this worker
            # [process the batch]
        print("worker done.")
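For anyone who wants to try the whole pattern without a database, here is a small self-contained sketch that combines the two fragments. It is an illustration under assumptions, not the answer's actual code: the names batches, run, POISON_PILL and the extra results queue are hypothetical, and counting lines stands in for the real insert.

```python
import multiprocessing

POISON_PILL = (None, None)  # same sentinel shape as in the answer


def batches(lines, size):
    """Hypothetical helper: yield (batch, lines_read_so_far) tuples."""
    batch = []
    count = 0
    for count, line in enumerate(lines, start=1):
        batch.append(line)
        if len(batch) >= size:
            yield batch, count
            batch = []
    if batch:
        yield batch, count


def insert(queue, results):
    """Consume batches until the poison pill arrives, then report a count."""
    handled = 0
    while True:
        batch, end = queue.get()
        if batch is None and end is None:
            break  # poison pill: no more work for this worker
        handled += len(batch)  # stand-in for the real database insert
    results.put(handled)


def run(lines, processes=2, batch_size=3):
    """Feed lines to the workers and return the total number handled."""
    queue = multiprocessing.Queue(processes * 2)  # bounded: put() blocks when full
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=insert, args=(queue, results))
        for _ in range(processes)
    ]
    for worker in workers:
        worker.start()
    for item in batches(lines, batch_size):
        queue.put(item)
    for _ in workers:
        queue.put(POISON_PILL)  # one pill per worker
    # Read one result per worker before joining, so that no worker is
    # left waiting to flush its result.
    total = sum(results.get() for _ in workers)
    for worker in workers:
        worker.join()
    return total


if __name__ == "__main__":
    print(run(["line %d\n" % n for n in range(10)]))  # prints 10
```

One pill per worker matters: each worker stops after consuming exactly one pill, so sending fewer would leave some workers blocked in get() forever.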
    
