python-multiprocessing

Does multiprocessing copy the object in this scenario?

∥☆過路亽.° submitted on 2019-12-06 09:28:00
Question:

import multiprocessing
import numpy as np
import multiprocessing as mp
import ctypes

class Test():
    def __init__(self):
        shared_array_base = multiprocessing.Array(ctypes.c_double, 100, lock=False)
        self.a = shared_array = np.ctypeslib.as_array(shared_array_base)

    def my_fun(self, i):
        self.a[i] = 1

if __name__ == "__main__":
    num_cores = multiprocessing.cpu_count()
    t = Test()

    def my_fun_wrapper(i):
        t.my_fun(i)

    with mp.Pool(num_cores) as p:
        p.map(my_fun_wrapper, np.arange(100))

    print(t.a)

In the code
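The question text is cut off above. Independently of what the truncated discussion concludes about copying, a sketch of a more explicit variant of this pattern is shown below: the shared Array is handed to each worker through the Pool initializer rather than captured from the enclosing scope. This assumes a fork-based start method (e.g. Linux); the helper names are illustrative only.

# Sketch: pass the shared buffer to workers explicitly via the Pool initializer.
import ctypes
import multiprocessing as mp
import numpy as np

_a = None  # per-worker view of the shared buffer

def init_worker(shared_base):
    global _a
    _a = np.ctypeslib.as_array(shared_base)

def my_fun(i):
    _a[i] = 1  # writes land in shared memory, so the parent sees them

if __name__ == "__main__":
    base = mp.Array(ctypes.c_double, 100, lock=False)
    with mp.Pool(mp.cpu_count(), initializer=init_worker, initargs=(base,)) as p:
        p.map(my_fun, range(100))
    print(np.ctypeslib.as_array(base))  # all ones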

CuPy raises an error in multiprocessing pool if the GPU has already been used

佐手、 submitted on 2019-12-06 09:06:30
I tried to use cupy in two parts of my program, one of them being parallelized with a pool. I managed to reproduce it with a simple example:

import cupy
import numpy as np
from multiprocessing import pool

def f(x):
    return cupy.asnumpy(2 * cupy.array(x))

input = np.array([1, 2, 3, 4])
print(cupy.asnumpy(cupy.array(input)))
print(np.array(list(map(f, input))))

p = pool.Pool(4)
output = p.map(f, input)
p.close()
p.join()
print(output)

The output is the following:

[1 2 3 4]
[2 4 6 8]
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in
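The traceback is cut off, but the failure pattern (CuPy works in the parent, then breaks inside pool workers) is typical of CUDA state being inherited across fork(). One commonly suggested workaround, shown here only as a sketch and not taken from the missing answer, is to create the pool with the 'spawn' start method so the children start with a clean CUDA context:

import multiprocessing as mp
import numpy as np
import cupy

def f(x):
    # CUDA is initialized fresh inside each spawned worker.
    return cupy.asnumpy(2 * cupy.array(x))

if __name__ == '__main__':
    data = np.array([1, 2, 3, 4])
    print(cupy.asnumpy(cupy.array(data)))   # use the GPU in the parent first
    ctx = mp.get_context('spawn')           # children do not inherit CUDA state
    with ctx.Pool(4) as p:
        print(p.map(f, data))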

Sharing a multiprocessing synchronization primitive across processes

情到浓时终转凉″ submitted on 2019-12-06 05:19:16
Question: (Python 3.4, Linux). I have a main process 'P', which forks 8 processes ('C1' through 'C8'). I want to create a multiprocessing.Barrier that ensures all 8 child processes are in sync at a certain point. Everything works fine if I define the synchronization primitive in the parent process, so that when I fork the child processes it is properly inherited:

import multiprocessing as mp

barrier = mp.Barrier(8)

def f():
    # do something
    barrier.wait()
    # do more stuff

def main():
    for i in range(8):
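The question is truncated above. For reference, a sketch of two common ways to hand a Barrier to child processes when it cannot simply live at module level; this is not necessarily what the truncated question goes on to ask about:

import multiprocessing as mp

def f(barrier):
    # do something
    barrier.wait()
    # do more stuff

if __name__ == '__main__':
    # Option 1: create the Barrier in the parent and pass it explicitly;
    # it is shared with the children at process creation.
    barrier = mp.Barrier(8)
    procs = [mp.Process(target=f, args=(barrier,)) for _ in range(8)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    # Option 2: a manager-backed Barrier proxy, which is picklable and can
    # therefore also be sent to already-running workers (e.g. pool workers).
    with mp.Manager() as manager:
        barrier = manager.Barrier(8)
        procs = [mp.Process(target=f, args=(barrier,)) for _ in range(8)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()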

Multiprocessing - using the Manager's Namespace to save memory

岁酱吖の submitted on 2019-12-06 04:51:54
I have several processes, each completing tasks which require a single large numpy array; the array is only being read (the workers are searching it for appropriate values). If each process loads the data I receive a memory error. I am therefore trying to minimise the memory usage by using a Manager to share the same array between the processes. However, I still receive a memory error. I can load the array once in the main process, but the moment I try to make it an attribute of the manager namespace I receive a memory error. I assumed the Managers acted like pointers and allowed separate
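The question is cut off above. As a point of reference (this is not the accepted answer, which is not shown here): a manager namespace pickles the array and copies it on every access, so it does not save memory. A minimal sketch of the usual alternative, putting the data in shared memory once and re-wrapping it in each worker, assuming a fork-based start method (Linux) and illustrative helper names such as make_shared_array and search:

import ctypes
import multiprocessing as mp
import numpy as np

def make_shared_array(data):
    # Copy the data into an anonymous shared-memory block once, in the parent.
    shared = mp.RawArray(ctypes.c_double, data.size)
    wrapped = np.frombuffer(shared, dtype=np.float64).reshape(data.shape)
    np.copyto(wrapped, data)
    return shared, data.shape

_array = None  # set per worker by the initializer

def init_worker(shared, shape):
    global _array
    # Re-wrap the shared buffer in each worker; no extra copy is made.
    _array = np.frombuffer(shared, dtype=np.float64).reshape(shape)

def search(value):
    # Read-only lookup against the shared array.
    return int(np.searchsorted(_array, value))

if __name__ == "__main__":
    data = np.sort(np.random.rand(1_000_000))
    shared, shape = make_shared_array(data)
    with mp.Pool(4, initializer=init_worker, initargs=(shared, shape)) as pool:
        print(pool.map(search, [0.1, 0.5, 0.9]))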

Optimizing multiprocessing.Pool with expensive initialization

泄露秘密 submitted on 2019-12-06 04:26:51
Question: Here is a complete, simple working example:

import multiprocessing as mp
import time
import random

class Foo:
    def __init__(self):
        # some expensive set up function in the real code
        self.x = 2
        print('initializing')

    def run(self, y):
        time.sleep(random.random() / 10.)
        return self.x + y

def f(y):
    foo = Foo()
    return foo.run(y)

def main():
    pool = mp.Pool(4)
    for result in pool.map(f, range(10)):
        print(result)
    pool.close()
    pool.join()

if __name__ == '__main__':
    main()

How can I modify it so Foo is only
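The question is truncated, so what follows is a general sketch rather than the accepted answer: a common way to avoid re-running the expensive setup for every task is to construct Foo once per worker through Pool's initializer argument.

import multiprocessing as mp
import random
import time

class Foo:
    def __init__(self):
        self.x = 2
        print('initializing')  # runs once per worker, not once per task

    def run(self, y):
        time.sleep(random.random() / 10.)
        return self.x + y

_foo = None  # per-worker instance

def init_worker():
    global _foo
    _foo = Foo()

def f(y):
    return _foo.run(y)

if __name__ == '__main__':
    with mp.Pool(4, initializer=init_worker) as pool:
        print(pool.map(f, range(10)))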

Using keras.utils.Sequence multiprocessing and data base - when to connect?

强颜欢笑 submitted on 2019-12-06 02:54:53
I'm training a neural network with Keras on the TensorFlow backend. The data set does not fit in RAM, so I store it in a Mongo database and retrieve batches using a subclass of keras.utils.Sequence. Everything works fine if I run model.fit_generator() with use_multiprocessing=False. When I turn on multiprocessing, I get errors either during spawning of the workers or in the connection to the database. If I create the connection in __init__, I get an exception whose text says something about errors in pickling lock objects. Sorry, I don't remember exactly. But the training does not even start.
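The answer is not shown here, but a sketch of the pattern usually recommended for this situation is to open the MongoDB connection lazily inside each worker process instead of in __init__, so that no client or lock object ever has to be pickled. The database, collection, and field names below are made up for illustration:

import numpy as np
from pymongo import MongoClient
from keras.utils import Sequence

class MongoSequence(Sequence):
    def __init__(self, batch_size, n_samples):
        self.batch_size = batch_size
        self.n_samples = n_samples
        self._client = None  # created lazily, once per worker process

    def _collection(self):
        if self._client is None:
            self._client = MongoClient('localhost', 27017)
        return self._client['train_db']['samples']

    def __len__(self):
        return self.n_samples // self.batch_size

    def __getitem__(self, idx):
        docs = list(self._collection()
                    .find()
                    .skip(idx * self.batch_size)
                    .limit(self.batch_size))
        x = np.array([d['features'] for d in docs])
        y = np.array([d['label'] for d in docs])
        return x, y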

What's the point of multithreading in Python if the GIL exists?

假装没事ソ submitted on 2019-12-05 23:50:18
Question: From what I understand, the GIL makes it impossible to have threads that each harness a core individually. This is a basic question, but what, then, is the point of the threading library? It seems useless if threaded code runs at the same speed as a normal program.

Answer 1: In some cases an application may not fully utilize even one core, and using threads (or processes) may help to do that. Think of a typical web application. It receives requests from clients, does some queries to the database
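The answer is cut off, but its point is that threads still help when the program spends most of its time waiting on I/O, because the GIL is released during blocking calls. A small illustrative sketch, simulating I/O with time.sleep (which also releases the GIL):

import time
from concurrent.futures import ThreadPoolExecutor

def fetch(i):
    time.sleep(0.5)  # stands in for a network or database call that releases the GIL
    return i

start = time.perf_counter()
[fetch(i) for i in range(8)]
print(f"sequential: {time.perf_counter() - start:.2f}s")  # roughly 4 s

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as pool:
    list(pool.map(fetch, range(8)))
print(f"threaded:   {time.perf_counter() - start:.2f}s")  # roughly 0.5 s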

Why is more than one worker used in `multiprocessing.Pool().apply_async()`?

被刻印的时光 ゝ submitted on 2019-12-05 23:11:17
Problem

From the multiprocessing.Pool docs:

apply_async(func ...): A variant of the apply() method which returns a result object. ...

Reading further ...

apply(func[, args[, kwds]]): Call func with arguments args and keyword arguments kwds. It blocks until the result is ready. Given this blocks, apply_async() is better suited for performing work in parallel. Additionally, func is only executed in one of the workers of the pool.

The last line (bold in the original docs) suggests only one worker from a pool is used. I find this is only true under certain conditions.

Given

Here is code that executes Pool.apply_async
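The question's own code is truncated above. As a minimal sketch of the distinction the docs are drawing: each individual apply_async() call runs func in exactly one worker, but many calls are distributed over the whole pool, which is why more than one worker ends up busy.

import multiprocessing as mp
import os
import time

def work(i):
    time.sleep(0.1)  # keep workers busy so tasks spread out
    return i, os.getpid()

if __name__ == '__main__':
    with mp.Pool(4) as pool:
        results = [pool.apply_async(work, (i,)) for i in range(8)]
        for r in results:
            i, pid = r.get()
            print(f"task {i} ran in worker {pid}")
    # Each individual task reports exactly one PID, but the eight tasks
    # together typically report several different PIDs.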

Pool workers do not complete all tasks

三世轮回 submitted on 2019-12-05 20:51:42
I have a relatively simple Python multiprocessing script that sets up a pool of workers that append output to a pandas DataFrame by way of a custom manager. What I am finding is that when I call close()/join() on the pool, not all the tasks submitted via apply_async are completed. Here's a simplified example that submits 1000 jobs but only half complete, causing an assertion error. Have I overlooked something very simple, or is this perhaps a bug?

from pandas import DataFrame
from multiprocessing.managers import BaseManager
from multiprocessing import Pool

class DataFrameResults:
    def __init__(self):
        self.results =
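The example is cut off above, and the accepted answer is not shown. As a general diagnostic sketch (not the question's code): keeping every AsyncResult and calling .get() on it before shutting the pool down makes the parent wait for each task and re-raises any worker exception instead of letting tasks disappear silently.

from multiprocessing import Pool

def process(job):
    return job * 2

if __name__ == '__main__':
    with Pool(4) as pool:
        async_results = [pool.apply_async(process, (job,)) for job in range(1000)]
        results = [r.get() for r in async_results]  # blocks until each task is done
    assert len(results) == 1000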

Scraping concurrently with Selenium in Python

眉间皱痕 submitted on 2019-12-05 18:40:14
I am trying to scrape concurrently with the selenium and multiprocessing modules. Below is roughly my approach:

- create a queue with a number of webdriver instances equal to the number of workers
- create a pool of workers
- each worker pulls a webdriver instance from the queue
- when the function terminates, the webdriver instance is put back on the queue

Here is the code:

#!/usr/bin/env python
# encoding: utf-8

import time
import codecs
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities
from multiprocessing import Pool
from Queue import Queue

def download_and_save
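The code is cut off above. For reference, a pattern that often comes up for this kind of task, distinct from the queue-based approach described in the question, is to give each pool worker its own webdriver via Pool's initializer, since a plain Queue object cannot be shared directly with pool workers. The URLs and helper names below are illustrative only.

import multiprocessing as mp
from selenium import webdriver

_driver = None  # one webdriver per worker process

def init_worker():
    global _driver
    _driver = webdriver.Firefox()

def download_and_save(url):
    _driver.get(url)
    return url, len(_driver.page_source)

if __name__ == '__main__':
    urls = ['https://example.com/page/%d' % i for i in range(8)]
    with mp.Pool(2, initializer=init_worker) as pool:
        for url, size in pool.map(download_and_save, urls):
            print(url, size)
    # Note: drivers started in workers are not closed here; real code would
    # also arrange for driver.quit() when the pool shuts down.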