I have a script that opens each file from a list and then does something to the text within that file. I'm using Python multiprocessing and Pool to try to parallelize this.
1. apply_async farms out one task to the pool. You would need to call apply_async many times to exercise more processors.
2. Don't let both processes try to write to the same list, results. Since the pool workers are separate processes, the two won't be writing to the same list. One way to work around this is to use an output Queue. You could set it up yourself (a do-it-yourself sketch follows the example below), or use apply_async's callback to set up the Queue for you. apply_async will call the callback once the function completes.
3. You could use map_async instead of apply_async, but then you'd get a list of lists, which you'd then have to flatten (a sketch of that variant also follows below).

So, perhaps try instead something like:
import os
import multiprocessing as mp

results = []

def testFunc(file):
    result = []
    print("Working in Process #%d" % os.getpid())
    # This is just an illustration of some logic. This is not what I'm
    # actually doing.
    with open(file, 'r') as f:
        for line in f:
            if 'dog' in line:
                result.append(line)
    return result

def collect_results(result):
    # Runs in the main process once a worker's testFunc returns.
    results.extend(result)

if __name__ == "__main__":
    p = mp.Pool(processes=2)
    files = ['/path/to/file1.txt', '/path/to/file2.txt']
    for f in files:
        p.apply_async(testFunc, args=(f,), callback=collect_results)
    p.close()
    p.join()
    print(results)
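
On point 2, if you would rather set up the output Queue yourself instead of leaning on the callback, here is a minimal sketch using plain Process workers. worker is a hypothetical stand-in for testFunc that puts its per-file result list on a shared queue; the queue is drained before joining so a large payload can't block a child on exit:

import multiprocessing as mp

def worker(file, out_q):
    # Collect matching lines for this file and hand them to the
    # main process via the shared queue.
    result = []
    with open(file, 'r') as f:
        for line in f:
            if 'dog' in line:
                result.append(line)
    out_q.put(result)

if __name__ == "__main__":
    out_q = mp.Queue()
    files = ['/path/to/file1.txt', '/path/to/file2.txt']
    procs = [mp.Process(target=worker, args=(f, out_q)) for f in files]
    for proc in procs:
        proc.start()
    # One result list arrives per file; arrival order is not guaranteed.
    results = []
    for _ in files:
        results.extend(out_q.get())
    for proc in procs:
        proc.join()
    print(results)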
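
On point 3, here is a minimal sketch of the map_async variant, reusing the testFunc defined above (the itertools flattening step is my addition). map_async submits the whole iterable at once, and calling .get() on its result blocks until every file is processed, returning one list per file, i.e. a list of lists:

import itertools
import multiprocessing as mp

if __name__ == "__main__":
    p = mp.Pool(processes=2)
    files = ['/path/to/file1.txt', '/path/to/file2.txt']
    # .get() returns one list per file -- a list of lists.
    nested = p.map_async(testFunc, files).get()
    p.close()
    p.join()
    # Flatten the per-file lists into one flat list of matching lines.
    results = list(itertools.chain.from_iterable(nested))
    print(results)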