Writing to a file with multiprocessing

孤街浪徒 2020-12-09 04:34

I'm having the following problem in Python.

I need to do some calculations in parallel, and their results need to be written to a file sequentially. So I created a fu

3 Answers
  • 2020-12-09 05:03

    You really should use two queues and three separate kinds of processing.

    1. Put stuff into Queue #1.

    2. Get stuff out of Queue #1 and do calculations, putting stuff in Queue #2. You can have many of these, since they get from one queue and put into another queue safely.

    3. Get stuff out of Queue #2 and write it to a file. You must have exactly one of these and no more: it "owns" the file, guarantees atomic access, and ensures that the file is written cleanly and consistently. A minimal sketch of this layout follows.
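
    A minimal sketch of that layout, using multiprocessing.Queue for both queues, might look like the following. The SENTINEL marker, the placeholder calculation, and the worker count are illustrative assumptions, not part of the answer above.

    import multiprocessing

    SENTINEL = None  # hypothetical stop marker passed through the queues

    def producer(task_queue, items, n_workers):
        # Stage 1: put the work items into Queue #1.
        for item in items:
            task_queue.put(item)
        for _ in range(n_workers):
            task_queue.put(SENTINEL)  # one stop marker per calculation worker

    def calculator(task_queue, result_queue):
        # Stage 2: get from Queue #1, do the calculation, put into Queue #2.
        while True:
            item = task_queue.get()
            if item is SENTINEL:
                break
            result_queue.put((item, item * item))  # placeholder calculation

    def writer(result_queue, path):
        # Stage 3: the only process that touches the file.
        with open(path, 'w') as f:
            while True:
                result = result_queue.get()
                if result is SENTINEL:
                    break
                f.write('%s: %s\n' % result)

    if __name__ == '__main__':
        task_queue = multiprocessing.Queue()
        result_queue = multiprocessing.Queue()
        n_workers = 4  # assumed worker count

        workers = [multiprocessing.Process(target=calculator,
                                           args=(task_queue, result_queue))
                   for _ in range(n_workers)]
        writer_proc = multiprocessing.Process(target=writer,
                                              args=(result_queue, 'results.txt'))

        for w in workers:
            w.start()
        writer_proc.start()

        producer(task_queue, range(100), n_workers)

        for w in workers:
            w.join()
        result_queue.put(SENTINEL)  # all calculators are done; tell the writer to stop
        writer_proc.join()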

  • 2020-12-09 05:23

    If anyone is looking for a simple way to do the same thing, this may help. I don't see any disadvantages to doing it this way; if there are, please let me know.

    import multiprocessing
    import re

    def mp_worker(item):
        # Do something with the line; here, count its words as a placeholder calculation
        count = len(re.findall(r'\w+', item))
        return item, count

    def mp_handler():
        cpus = multiprocessing.cpu_count()
        p = multiprocessing.Pool(cpus)
        # The next two lines populate listX with the non-empty lines of the input file.
        # Any other way of building listX works, as long as it is passed to imap below.
        with open('ExampleFile.txt') as f:
            listX = [line for line in (l.strip() for l in f) if line]
        with open('results.txt', 'w') as f:
            # imap yields the (item, count) tuples from the workers
            for result in p.imap(mp_worker, listX):
                f.write('%s: %d\n' % result)
        p.close()
        p.join()

    if __name__ == '__main__':
        mp_handler()
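
    Note that Pool.imap hands results back in the same order as the input iterable, which is what keeps results.txt sequential; if the output order did not matter, Pool.imap_unordered would yield each result as soon as it finishes.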
    

    Source: Python: Writing to a single file with queue while using multiprocessing Pool

  • 2020-12-09 05:25

    There is a mistake in the write worker code: if block is false, the worker will never get any data. It should be as follows:

    par, res = queue.get(block=True)
    

    You can check it by adding the line

    print("QSize", queueOut.qsize())
    

    after the queueOut.put((par, res)) call.

    With block=False you would see the queue length grow and grow until the queue fills up, whereas with block=True you always get "1".
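
    For reference, a blocking writer loop along these lines might look like the sketch below; the write_worker name, the None sentinel, and the output file name are assumptions standing in for the asker's original code.

    def write_worker(queueOut, path):
        # Single writer process: a blocking get() waits for the next result
        # instead of spinning, and a None sentinel tells it to stop.
        with open(path, 'w') as f:
            while True:
                item = queueOut.get(block=True)
                if item is None:
                    break
                par, res = item
                f.write('%s: %s\n' % (par, res))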
