python-multiprocessing

How to run several Keras neural networks in parallel

Submitted by 此生再无相见时 on 2019-12-19 05:46:08
Question: I'm trying to use Keras to run a reinforcement learning algorithm. In this algorithm, I'm training a neural network. What's different from other learning problems is that I need to use the neural network itself to generate training data, and repeat this after it updates. I run into a problem when I try to generate training data in parallel: I can't tell Theano to use the GPU for training, because it will also use the GPU when generating training data and cause problems if […]
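A minimal sketch of one common workaround, not taken from the question itself: generate data in CPU-only worker processes and keep the GPU in the parent. The THEANO_FLAGS value, the worker body, and the batch shape are assumptions.

import os
import multiprocessing as mp

def generate_batch(seed):
    # Assumption: the workers only need the CPU. Theano reads THEANO_FLAGS at
    # import time, so set it before anything imports theano/keras in this child.
    os.environ["THEANO_FLAGS"] = "device=cpu,floatX=float32"
    import numpy as np
    rng = np.random.RandomState(seed)
    return rng.rand(64, 4)  # placeholder for "use the network to generate data"

if __name__ == "__main__":
    mp.set_start_method("spawn")  # children start clean, without an inherited GPU context
    with mp.Pool(processes=4) as pool:
        batches = pool.map(generate_batch, range(8))
    # The parent keeps the GPU: import keras here and train on `batches`.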

Python Multiprocessing of NLTK word_tokenizer - function never completes

Submitted by 萝らか妹 on 2019-12-19 04:38:07
Question: I'm performing natural language processing using NLTK on some fairly large datasets and would like to take advantage of all my processor cores. The multiprocessing module seems to be what I'm after, and when I run the following test code I see all cores being utilized, but the code never completes. Executing the same task without multiprocessing finishes in approximately one minute. Python 2.7.11 on Debian.

from nltk.tokenize import word_tokenize
import io
import time
import […]
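The excerpt is cut off, but hangs in code like this usually come from join()ing workers that are still blocked on an undrained Queue. A minimal sketch (Python 3 syntax) of a Pool-based variant that avoids hand-managed queues; the corpus filename is a placeholder and the 'punkt' models must already be downloaded.

import multiprocessing as mp
from nltk.tokenize import word_tokenize  # requires nltk.download('punkt')

def tokenize(line):
    return word_tokenize(line)

if __name__ == "__main__":
    with open("corpus.txt", encoding="utf-8") as fh:  # placeholder filename
        lines = fh.readlines()
    with mp.Pool(processes=mp.cpu_count()) as pool:
        tokenized = pool.map(tokenize, lines, chunksize=1000)
    print(sum(len(tokens) for tokens in tokenized), "tokens")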

Running multiple tensorflow sessions concurrently

Submitted by 自古美人都是妖i on 2019-12-18 12:54:45
Question: I am trying to run several sessions of TensorFlow concurrently on a CentOS 7 machine with 64 CPUs. My colleague reports that he can use the following two blocks of code to produce a parallel speedup on his machine using 4 cores:

mnist.py

import numpy as np
import input_data
from PIL import Image
import tensorflow as tf
import time

def main(randint):
    print 'Set new seed:', randint
    np.random.seed(randint)
    tf.set_random_seed(randint)
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
[…]
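A minimal sketch, in the same TF 1.x style as the question, of giving each process its own graph and session and capping the threads each session may use so concurrent runs do not contend for the same cores. The thread counts and the missing model body are assumptions.

import multiprocessing as mp

def run_one(seed):
    import tensorflow as tf  # imported inside the child so no session state is shared
    config = tf.ConfigProto(intra_op_parallelism_threads=16,
                            inter_op_parallelism_threads=1)
    with tf.Graph().as_default(), tf.Session(config=config) as sess:
        tf.set_random_seed(seed)
        # ...build and train the MNIST model here, as in mnist.py...
        pass

if __name__ == "__main__":
    procs = [mp.Process(target=run_one, args=(s,)) for s in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()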

Python multiprocessing: why are large chunksizes slower?

Submitted by 淺唱寂寞╮ on 2019-12-18 12:34:23
Question: I've been profiling some code using Python's multiprocessing module (the 'job' function just squares the number).

data = range(100000000)
n = 4
time1 = time.time()
processes = multiprocessing.Pool(processes=n)
results_list = processes.map(func=job, iterable=data, chunksize=10000)
processes.close()
time2 = time.time()
print(time2-time1)
print(results_list[0:10])

One thing I found odd is that the optimal chunksize appears to be around 10k elements - this took 16 seconds on my computer. If I […]
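A minimal sketch for measuring the trade-off directly (the worker count, the smaller input size, and the chunksize values are arbitrary choices): a tiny chunksize pays inter-process overhead on every chunk, while a very large chunksize balances load poorly and ships large result lists back in one piece.

import time
import multiprocessing

def job(x):
    return x * x

if __name__ == "__main__":
    data = range(10000000)  # smaller than the question's 1e8 so each run stays quick
    with multiprocessing.Pool(processes=4) as pool:
        for chunksize in (1000, 10000, 1000000):
            start = time.time()
            results = pool.map(job, data, chunksize=chunksize)
            print("chunksize=%d: %.2f s" % (chunksize, time.time() - start))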

Process vs. Thread with regards to using Queue()/deque() and class variable for communication and “poison pill”

Submitted by 佐手、 on 2019-12-18 05:22:08
Question: I would like to create either a Thread or a Process which runs forever in a while True loop. I need to send data to and receive data from the worker in the form of queues, either a multiprocessing.Queue() or a collections.deque(). I prefer to use collections.deque() as it is significantly faster. I also need to be able to kill the worker eventually (as it runs in a while True loop). Here is some test code I've put together to try to understand the differences between Threads, Processes, Queues, and […]
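One point worth stating up front: a collections.deque is only shared between threads in the same process; two Processes each get their own copy, so cross-process communication needs a multiprocessing.Queue (or a Pipe). A minimal sketch of the poison-pill pattern with a Process and two queues; the doubling "work" is a placeholder.

import multiprocessing as mp

STOP = None  # the "poison pill" sentinel

def worker(inbox, outbox):
    while True:
        item = inbox.get()    # blocks instead of spinning
        if item is STOP:
            break
        outbox.put(item * 2)  # placeholder for the real work

if __name__ == "__main__":
    inbox, outbox = mp.Queue(), mp.Queue()
    proc = mp.Process(target=worker, args=(inbox, outbox))
    proc.start()
    for i in range(5):
        inbox.put(i)
    inbox.put(STOP)                              # tells the worker to exit its loop
    results = [outbox.get() for _ in range(5)]   # drain the queue before join()
    proc.join()
    print(results)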

Python multiprocessing.cpu_count() returns '1' on 4-core Nvidia Jetson TK1

Submitted by 喜你入骨 on 2019-12-18 04:40:22
Question: Can anyone tell me why Python's multiprocessing.cpu_count() function would return 1 when called on a Jetson TK1 with four ARMv7 processors?

>>> import multiprocessing
>>> multiprocessing.cpu_count()
1

The Jetson TK1 board is more or less straight out of the box, and no one has messed with cpusets. From within the same Python shell I can print the contents of /proc/self/status and it tells me that the process should have access to all four cores:

>>> print open('/proc/self/status').read()
[…]
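A minimal sketch (standard Linux sysfs paths) for checking whether the kernel simply has the other cores powered down: on Linux this cpu_count() reflects the processors that are online at that moment, and the TK1 can hot-plug cores offline when idle, so "present" and "online" may disagree.

import multiprocessing

print("cpu_count():", multiprocessing.cpu_count())
with open("/sys/devices/system/cpu/present") as fh:
    print("present:", fh.read().strip())  # e.g. 0-3
with open("/sys/devices/system/cpu/online") as fh:
    print("online:", fh.read().strip())   # may be just 0 while the board is idle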

Keras + Tensorflow: Prediction on multiple gpus

Submitted by 谁都会走 on 2019-12-18 04:37:08
Question: I'm using Keras with TensorFlow as the backend. I have one compiled/trained model. My prediction loop is slow, so I would like to find a way to parallelize the predict_proba calls to speed things up. I would like to take a list of batches (of data) and then, per available GPU, run model.predict_proba() over a subset of those batches. Essentially:

data = [ batch_0, batch_1, ... , batch_N ]

on gpu_0 => return predict_proba(batch_0)
on gpu_1 => return predict_proba(batch_1)
...
on gpu_N => return […]
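A minimal sketch of the one-process-per-GPU approach: each worker pins itself to a single GPU via CUDA_VISIBLE_DEVICES before importing Keras, loads its own copy of the model, and predicts on its share of the batches. The model path, the GPU count, and the random placeholder data are assumptions.

import os
import multiprocessing as mp

def predict_on_gpu(args):
    gpu_id, batch = args
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)  # pin before TensorFlow is imported
    from keras.models import load_model
    model = load_model("model.h5")                    # placeholder path; one copy per worker
    return model.predict_proba(batch)

if __name__ == "__main__":
    import numpy as np
    mp.set_start_method("spawn")                      # children must not inherit a GPU context
    n_gpus = 2                                        # assumption
    batches = [np.random.rand(256, 10) for _ in range(n_gpus)]  # placeholder batches
    with mp.Pool(processes=n_gpus) as pool:
        predictions = pool.map(predict_on_gpu, list(enumerate(batches)))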

How to terminate long-running computation (CPU bound task) in Python using asyncio and concurrent.futures.ProcessPoolExecutor?

Submitted by China☆狼群 on 2019-12-18 04:27:06
Question: Similar question (but the answer does not work for me): How to cancel long-running subprocesses running using concurrent.futures.ProcessPoolExecutor? Unlike the question linked above and the solution provided, in my case the computation itself is rather long (CPU bound) and cannot be run in a loop to check whether some event has happened. Reduced version of the code below:

import asyncio
import concurrent.futures as futures
import time

class Simulator:
    def __init__(self):
        self._loop = None
        self._lmz […]
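The Simulator class is cut off above, but the core constraint is that a future already running inside a ProcessPoolExecutor cannot be cancelled. A minimal sketch of the usual escape hatch: run the CPU-bound work in a plain multiprocessing.Process that can be terminate()d, and let asyncio only wait on it. The busy-loop stand-in and the timeout values are assumptions.

import asyncio
import multiprocessing as mp
import time

def long_computation(seconds):
    # Stand-in for the CPU-bound work: it cannot poll a flag, it just burns CPU.
    t0 = time.time()
    while time.time() - t0 < seconds:
        pass

async def run_with_timeout(seconds, timeout):
    proc = mp.Process(target=long_computation, args=(seconds,))
    proc.start()
    loop = asyncio.get_running_loop()
    try:
        # join() in the default thread executor so the event loop stays responsive
        await asyncio.wait_for(loop.run_in_executor(None, proc.join), timeout)
    except asyncio.TimeoutError:
        proc.terminate()  # kill the worker; this is what cancel() cannot do
        proc.join()
        print("computation killed after %s s" % timeout)

if __name__ == "__main__":
    asyncio.run(run_with_timeout(10.0, 2.0))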

Why is multiprocessing Pool slower than a for loop?

Submitted by 喜你入骨 on 2019-12-18 04:25:06
Question:

from multiprocessing import Pool

def op1(data):
    return [data[elem] + 1 for elem in range(len(data))]

data = [[elem for elem in range(20)] for elem in range(500000)]

import time
start_time = time.time()
re = []
for data_ in data:
    re.append(op1(data_))
print('--- %s seconds ---' % (time.time() - start_time))

start_time = time.time()
pool = Pool(processes=4)
data = pool.map(op1, data)
print('--- %s seconds ---' % (time.time() - start_time))

I get a much slower run time with pool than I get with […]
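The excerpt is cut off, but the usual explanation is that op1 does only microseconds of work per list, so pickling 500000 inputs to the workers and 500000 result lists back costs more than the parallelism saves. A minimal sketch (op2 is an invented, deliberately heavier stand-in, and the input is smaller so both timings stay short) showing the crossover once each task carries real work:

from multiprocessing import Pool
import time

def op1(data):
    return [x + 1 for x in data]

def op2(data):
    # invented stand-in: enough arithmetic per call that IPC stops dominating
    return [sum(i * i for i in range(200)) + x for x in data]

if __name__ == "__main__":
    data = [list(range(20)) for _ in range(50000)]  # smaller than the question's 500000
    for fn in (op1, op2):
        start = time.time()
        serial = [fn(d) for d in data]
        t_serial = time.time() - start
        with Pool(processes=4) as pool:
            start = time.time()
            parallel = pool.map(fn, data)
            t_pool = time.time() - start
        print("%s: serial %.2fs, pool %.2fs" % (fn.__name__, t_serial, t_pool))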