python-multiprocessing

Python multiprocessing - How can I split workload to get speed improvement?

泪湿孤枕 submitted on 2019-12-01 10:52:48
I am writing a simple piece of code that crops images and saves them. The problem is that there are about 150,000+ images and I want to improve the speed. So at first I wrote the code with a simple for loop, like the following:

    import cv2
    import numpy
    import sys

    textfile = sys.argv[1]
    file_list = open(textfile)
    files = file_list.read().split('\n')
    idx = 0
    for eachfile in files:
        image = cv2.imread(eachfile)
        idx += 1
        if image is None:
            pass
        outName = eachfile.replace('/data', '/changed_data')
        if image.shape[0] == 256:
            image1 = image[120:170, 120:170]
        elif image.shape[0] == 50:
            image1 = image
        cv2.imwrite(outName, image1)
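
A common way to spread this kind of embarrassingly parallel work over several cores is multiprocessing.Pool.map over a per-file worker function. A minimal sketch, reusing the crop logic above (the crop_and_save helper name is illustrative, not from the original post):

    import sys
    import cv2
    from multiprocessing import Pool

    def crop_and_save(path):
        # Read one image, crop it if it is the large size, and write it to the mirrored output path.
        image = cv2.imread(path)
        if image is None:
            return
        out_name = path.replace('/data', '/changed_data')
        if image.shape[0] == 256:
            image = image[120:170, 120:170]
        cv2.imwrite(out_name, image)

    if __name__ == '__main__':
        with open(sys.argv[1]) as f:
            files = [line for line in f.read().split('\n') if line]
        with Pool() as pool:  # defaults to one worker per CPU core
            pool.map(crop_and_save, files, chunksize=100)

Since the task is also disk- and I/O-heavy, the speed-up may flatten out well before the number of worker processes reaches the core count.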

multiprocessing.Queue as arg to pool worker aborts execution of worker

安稳与你 submitted on 2019-12-01 09:47:18
I'm actually finding it hard to believe that I've run into the issue I have; it seems like it would be a big bug in the Python multiprocessing module... Anyway, the problem I've run into is that whenever I pass a multiprocessing.Queue to a multiprocessing.Pool worker as an argument, the pool worker never executes its code. I've been able to reproduce this bug even with a very simple test that is a slightly modified version of example code found in the Python docs. Here is the original version of the example code for queues:

    from multiprocessing import Process, Queue

    def f(q):
        q.put([42, None,
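
A pattern that is often used instead of handing a plain multiprocessing.Queue to Pool workers is a Manager queue, whose proxy object can be pickled as a normal argument. A minimal sketch (the worker name is illustrative):

    from multiprocessing import Manager, Pool

    def worker(q):
        # A Manager queue proxy can be pickled and shipped to Pool workers as a plain argument.
        q.put('hello from worker')

    if __name__ == '__main__':
        with Manager() as manager:
            q = manager.Queue()
            with Pool(2) as pool:
                pool.apply_async(worker, (q,)).get()  # .get() re-raises any worker exception
            print(q.get())

Calling .get() (or .exception()) on the async result is also the quickest way to surface errors that would otherwise make a Pool worker fail silently.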

python multiprocessing .join() deadlock depends on worker function

橙三吉。 submitted on 2019-12-01 07:54:33
Question: I am using the multiprocessing Python library to spawn 4 Process() objects to parallelize a CPU-intensive task. The task (inspiration and code from this great article) is to compute the prime factors of every integer in a list. main.py:

    import random
    import multiprocessing
    import sys

    num_inputs = 4000
    num_procs = 4
    proc_inputs = num_inputs/num_procs
    input_list = [int(1000*random.random()) for i in xrange(num_inputs)]
    output_queue = multiprocessing.Queue()
    procs = []
    for p_i in xrange(num
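
A detail worth keeping in mind with this layout: per the multiprocessing docs, a process that has put items on a Queue waits before terminating until the buffered items have been flushed, so joining the workers before draining the queue can deadlock. A minimal sketch of the usual ordering, with illustrative names rather than the code from the question:

    import multiprocessing

    def worker(nums, out_q):
        out_q.put([n * n for n in nums])  # the child cannot exit until this is flushed and read

    if __name__ == '__main__':
        out_q = multiprocessing.Queue()
        procs = [multiprocessing.Process(target=worker, args=(range(i, i + 1000), out_q))
                 for i in range(0, 4000, 1000)]
        for p in procs:
            p.start()
        results = [out_q.get() for _ in procs]  # drain the queue first...
        for p in procs:
            p.join()                            # ...then join, avoiding the deadlock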

Multiprocessing works in Ubuntu, doesn't in Windows

ぐ巨炮叔叔 submitted on 2019-12-01 06:59:21
I am trying to use this example as a template for a queuing system in my CherryPy app. I was able to convert it from Python 2 to Python 3 (changing from Queue import Empty into from queue import Empty) and to run it on Ubuntu. But when I execute it on Windows I get the following error:

    F:\workspace\test>python test.py
    Traceback (most recent call last):
      File "test.py", line 112, in <module>
        broker.start()
      File "C:\Anaconda3\lib\multiprocessing\process.py", line 105, in start
        self._popen = self._Popen(self)
      File "C:\Anaconda3\lib\multiprocessing\context.py", line 212, in _Popen
        return
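
On Windows, multiprocessing has to use the spawn start method, so each child re-imports the main module instead of inheriting the parent's state; any code that creates processes therefore has to live behind an if __name__ == '__main__' guard, and the target must be a picklable, module-level callable. A minimal sketch of that structure, with illustrative names rather than the code from the linked example:

    import multiprocessing

    def broker_loop(task_queue):
        # Module-level function, so Windows' spawn start method can pickle a reference to it.
        while True:
            task = task_queue.get()
            if task is None:
                break
            print('processing', task)

    if __name__ == '__main__':   # required on Windows: children re-import this module
        queue = multiprocessing.Queue()
        broker = multiprocessing.Process(target=broker_loop, args=(queue,))
        broker.start()
        queue.put('job-1')
        queue.put(None)
        broker.join()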

Why do concurrent.futures.ProcessPoolExecutor and multiprocessing.pool.Pool fail with super in Python?

生来就可爱ヽ(ⅴ<●) submitted on 2019-12-01 05:59:09
Why does the following Python code using the concurrent.futures module hang forever?

    import concurrent.futures

    class A:
        def f(self):
            print("called")

    class B(A):
        def f(self):
            executor = concurrent.futures.ProcessPoolExecutor(max_workers=2)
            executor.submit(super().f)

    if __name__ == "__main__":
        B().f()

The call raises an invisible exception [Errno 24] Too many open files (to see it, replace the line executor.submit(super().f) with print(executor.submit(super().f).exception())). However, replacing ProcessPoolExecutor with ThreadPoolExecutor prints "called" as expected. Why does the following
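
A workaround that is sometimes suggested for this pattern is to sidestep pickling the bound super().f altogether and submit the parent class's function with the instance as an explicit argument. A minimal sketch, assuming B carries no unpicklable state:

    import concurrent.futures

    class A:
        def f(self):
            print("called")

    class B(A):
        def f(self):
            executor = concurrent.futures.ProcessPoolExecutor(max_workers=2)
            # Submit the parent's plain function with self as an argument instead of
            # the bound super().f, so no super-bound method has to be pickled.
            future = executor.submit(A.f, self)
            future.result()  # surfaces any exception raised while submitting or running

    if __name__ == "__main__":
        B().f()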

Why is my Python app stalled with 'system' / kernel CPU time

天涯浪子 submitted on 2019-12-01 05:16:24
First off, I wasn't sure if I should post this as an Ubuntu question or here, but I'm guessing it's more of a Python question than an OS one. My Python application is running on top of Ubuntu on a 64-core AMD server. It pulls images from 5 GigE cameras over the network by calling out to a .so through ctypes and then processes them. I am seeing frequent pauses in my application, causing frames from the cameras to be dropped by the external camera library. To debug this I've used the popular psutil Python package, with which I log CPU stats every 0.2 seconds in a separate thread. I sleep for 0.2
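
For reference, a lightweight way to capture this kind of trace is psutil.cpu_times_percent() sampled from a daemon thread, which makes stalls show up as spikes in 'system' (kernel) time while 'user' time stays low. A minimal sketch (the 0.2 s interval mirrors the one described above):

    import threading
    import time
    import psutil

    def log_cpu_stats(interval=0.2):
        # Periodically log system-wide CPU times; kernel-side stalls appear as
        # a high 'system' percentage relative to 'user'.
        while True:
            t = psutil.cpu_times_percent(interval=interval)
            print(f"user={t.user:5.1f}%  system={t.system:5.1f}%  idle={t.idle:5.1f}%")

    if __name__ == '__main__':
        threading.Thread(target=log_cpu_stats, daemon=True).start()
        time.sleep(2)  # the real application work would run here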

How to run several Keras neural networks in parallel

萝らか妹 submitted on 2019-12-01 03:58:26
I'm trying to use Keras to run a reinforcement learning algorithm. In this algorithm I'm training a neural network. What's different from other learning problems is that I need to use the neural network itself to generate training data, and to repeat this after it updates. I run into a problem when I try to generate training data in parallel. The problem is that I can't tell Theano to use the GPU while training, because it will also use the GPU when generating training data, which causes problems when it is invoked by multiple processes. What's more, Theano won't run in multi-thread mode even when I write THEANO
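
One pattern that comes up for this situation is to keep GPU training in the main process and pin the data-generating worker processes to the CPU by setting Theano's device flag before Theano is imported in each child. This only helps if the workers import Theano themselves (for example under the spawn start method), which is an assumption of the sketch below:

    import os
    import multiprocessing

    def cpu_worker(worker_id):
        # Force this child onto the CPU *before* Theano is imported, so several
        # data-generation workers never compete with training for the GPU.
        os.environ['THEANO_FLAGS'] = 'device=cpu,floatX=float32'
        import theano  # noqa: F401  -- the flag is only read on first import
        return 'worker %d running on CPU' % worker_id

    if __name__ == '__main__':
        multiprocessing.set_start_method('spawn')  # children start fresh, so the flag applies
        with multiprocessing.Pool(4) as pool:
            print(pool.map(cpu_worker, range(4)))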

Workaround for using __name__=='__main__' in Python multiprocessing

[亡魂溺海] submitted on 2019-12-01 02:36:07
Question: As we all know, we need to protect the main() when running code with multiprocessing in Python by using if __name__ == '__main__'. I understand that this is necessary in some cases to give access to functions defined in the main module, but I do not understand why it is necessary in this case:

file2.py:

    import numpy as np
    from multiprocessing import Pool

    class Something(object):
        def get_image(self):
            return np.random.rand(64,64)

        def mp(self):
            image = self.get_image()
            p = Pool(2)
            res1 = p.apply_async(np
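
The usual explanation is that, on spawn-based platforms (Windows, and macOS on recent Python versions), every worker re-imports the parent's main module to rebuild its state, so it is the entry-point script, not the module that merely defines the class, that needs the guard to stop the re-import from creating processes again. A rough sketch of that split, with illustrative file names and a stand-in work function:

    # file2.py -- safe to import from workers: nothing at module level starts a process
    import numpy as np
    from multiprocessing import Pool

    class Something(object):
        def get_image(self):
            return np.random.rand(64, 64)

        def mp(self):
            p = Pool(2)
            # np.linalg.norm stands in for whatever per-image work is farmed out
            res = p.apply_async(np.linalg.norm, (self.get_image(),))
            p.close()
            p.join()
            return res.get()

    # file1.py -- the entry point carries the guard, because spawn re-imports it in every child
    if __name__ == '__main__':
        print(Something().mp())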

Safe to call multiprocessing from a thread in Python?

南笙酒味 submitted on 2019-12-01 02:24:24
Question: According to https://github.com/joblib/joblib/issues/180 and "Is there a safe way to create a subprocess from a thread in python?", the Python multiprocessing module does not allow use from within threads. Is this true? My understanding is that it is fine to fork from threads as long as you aren't holding a threading.Lock when you do so (in the current thread? anywhere in the process?). However, Python's documentation is silent on whether threading.Lock objects are safely shared after a fork.
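
One mitigation that comes up for this situation is to avoid fork() entirely when processes are created from a thread, by using a 'spawn' (or 'forkserver') context, so a child can never inherit a lock that happened to be held by another thread at fork time. A minimal sketch, with illustrative names:

    import multiprocessing
    import threading

    def square(x):
        return x * x

    def launch_pool_from_thread(results):
        # A spawn context starts children from a fresh interpreter instead of a
        # fork that could inherit locks held by other threads in this process.
        ctx = multiprocessing.get_context('spawn')
        with ctx.Pool(2) as pool:
            results.extend(pool.map(square, range(5)))

    if __name__ == '__main__':
        results = []
        t = threading.Thread(target=launch_pool_from_thread, args=(results,))
        t.start()
        t.join()
        print(results)  # [0, 1, 4, 9, 16]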