python-multiprocessing

How to run several Keras neural networks in parallel

Submitted by 此生再无相见时 on 2019-12-19 05:46:08
Question: I'm trying to use Keras to run a reinforcement learning algorithm. In this algorithm, I'm training a neural network. What's different from other learning problems is that I need to use the neural network itself to generate training data, and repeat this after it updates. I run into a problem when I try to generate training data in parallel: I can't tell Theano to use the GPU for training, because it will also use the GPU when generating training data and cause problems if […]
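A minimal sketch of one common workaround, not taken from the question itself: generate data in CPU-only worker processes and keep the GPU in the parent. The THEANO_FLAGS value, the worker body, and the batch shape are assumptions.

import os
import multiprocessing as mp

def generate_batch(seed):
    # Assumption: the workers only need the CPU. Theano reads THEANO_FLAGS at
    # import time, so set it before anything imports theano/keras in this child.
    os.environ["THEANO_FLAGS"] = "device=cpu,floatX=float32"
    import numpy as np
    rng = np.random.RandomState(seed)
    return rng.rand(64, 4)  # placeholder for "use the network to generate data"

if __name__ == "__main__":
    mp.set_start_method("spawn")  # children start clean, without an inherited GPU context
    with mp.Pool(processes=4) as pool:
        batches = pool.map(generate_batch, range(8))
    # The parent keeps the GPU: import keras here and train on `batches`.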

Python Multiprocessing of NLTK word_tokenizer - function never completes

Submitted by 萝らか妹 on 2019-12-19 04:38:07
Question: I'm performing natural language processing using NLTK on some fairly large datasets and would like to take advantage of all my processor cores. The multiprocessing module seems to be what I'm after, and when I run the following test code I see all cores being utilized, but the code never completes. Executing the same task without multiprocessing finishes in approximately one minute. Python 2.7.11 on Debian.

from nltk.tokenize import word_tokenize
import io
import time
import […]
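The excerpt is cut off, but hangs in code like this usually come from join()ing workers that are still blocked on an undrained Queue. A minimal sketch (Python 3 syntax) of a Pool-based variant that avoids hand-managed queues; the corpus filename is a placeholder and the 'punkt' models must already be downloaded.

import multiprocessing as mp
from nltk.tokenize import word_tokenize  # requires nltk.download('punkt')

def tokenize(line):
    return word_tokenize(line)

if __name__ == "__main__":
    with open("corpus.txt", encoding="utf-8") as fh:  # placeholder filename
        lines = fh.readlines()
    with mp.Pool(processes=mp.cpu_count()) as pool:
        tokenized = pool.map(tokenize, lines, chunksize=1000)
    print(sum(len(tokens) for tokens in tokenized), "tokens")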

Running multiple tensorflow sessions concurrently

Submitted by 自古美人都是妖i on 2019-12-18 12:54:45
Question: I am trying to run several sessions of TensorFlow concurrently on a CentOS 7 machine with 64 CPUs. My colleague reports that he can use the following two blocks of code to produce a parallel speedup on his machine using 4 cores:

mnist.py

import numpy as np
import input_data
from PIL import Image
import tensorflow as tf
import time

def main(randint):
    print 'Set new seed:', randint
    np.random.seed(randint)
    tf.set_random_seed(randint)
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
[…]
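A minimal sketch, in the same TF 1.x style as the question, of giving each process its own graph and session and capping the threads each session may use so concurrent runs do not contend for the same cores. The thread counts and the missing model body are assumptions.

import multiprocessing as mp

def run_one(seed):
    import tensorflow as tf  # imported inside the child so no session state is shared
    config = tf.ConfigProto(intra_op_parallelism_threads=16,
                            inter_op_parallelism_threads=1)
    with tf.Graph().as_default(), tf.Session(config=config) as sess:
        tf.set_random_seed(seed)
        # ...build and train the MNIST model here, as in mnist.py...
        pass

if __name__ == "__main__":
    procs = [mp.Process(target=run_one, args=(s,)) for s in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()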

Python multiprocessing: why are large chunksizes slower?

Submitted by 淺唱寂寞╮ on 2019-12-18 12:34:23
Question: I've been profiling some code using Python's multiprocessing module (the 'job' function just squares the number).

data = range(100000000)
n = 4
time1 = time.time()
processes = multiprocessing.Pool(processes=n)
results_list = processes.map(func=job, iterable=data, chunksize=10000)
processes.close()
time2 = time.time()
print(time2-time1)
print(results_list[0:10])

One thing I found odd is that the optimal chunksize appears to be around 10k elements - this took 16 seconds on my computer. If I […]
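A minimal sketch for measuring the trade-off directly (the worker count, the smaller input size, and the chunksize values are arbitrary choices): a tiny chunksize pays inter-process overhead on every chunk, while a very large chunksize balances load poorly and ships large result lists back in one piece.

import time
import multiprocessing

def job(x):
    return x * x

if __name__ == "__main__":
    data = range(10000000)  # smaller than the question's 1e8 so each run stays quick
    with multiprocessing.Pool(processes=4) as pool:
        for chunksize in (1000, 10000, 1000000):
            start = time.time()
            results = pool.map(job, data, chunksize=chunksize)
            print("chunksize=%d: %.2f s" % (chunksize, time.time() - start))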

Process vs. Thread with regards to using Queue()/deque() and class variable for communication and “poison pill”

Submitted by 佐手、 on 2019-12-18 05:22:08
Question: I would like to create either a Thread or a Process which runs forever in a while True loop. I need to send data to and receive data from the worker in the form of queues, either a multiprocessing.Queue() or a collections.deque(). I prefer to use collections.deque() as it is significantly faster. I also need to be able to kill the worker eventually (as it runs in a while True loop). Here is some test code I've put together to try to understand the differences between Threads, Processes, Queues, and […]
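One point worth stating up front: a collections.deque is only shared between threads in the same process; two Processes each get their own copy, so cross-process communication needs a multiprocessing.Queue (or a Pipe). A minimal sketch of the poison-pill pattern with a Process and two queues; the doubling "work" is a placeholder.

import multiprocessing as mp

STOP = None  # the "poison pill" sentinel

def worker(inbox, outbox):
    while True:
        item = inbox.get()    # blocks instead of spinning
        if item is STOP:
            break
        outbox.put(item * 2)  # placeholder for the real work

if __name__ == "__main__":
    inbox, outbox = mp.Queue(), mp.Queue()
    proc = mp.Process(target=worker, args=(inbox, outbox))
    proc.start()
    for i in range(5):
        inbox.put(i)
    inbox.put(STOP)                              # tells the worker to exit its loop
    results = [outbox.get() for _ in range(5)]   # drain the queue before join()
    proc.join()
    print(results)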

Python multiprocessing.cpu_count() returns '1' on 4-core Nvidia Jetson TK1

Submitted by 喜你入骨 on 2019-12-18 04:40:22
Question: Can anyone tell me why Python's multiprocessing.cpu_count() function would return 1 when called on a Jetson TK1 with four ARMv7 processors?

>>> import multiprocessing
>>> multiprocessing.cpu_count()
1

The Jetson TK1 board is more or less straight out of the box, and no one has messed with cpusets. From within the same Python shell I can print the contents of /proc/self/status and it tells me that the process should have access to all four cores:

>>> print open('/proc/self/status').read()
[…]
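A minimal sketch (standard Linux sysfs paths) for checking whether the kernel simply has the other cores powered down: on Linux this cpu_count() reflects the processors that are online at that moment, and the TK1 can hot-plug cores offline when idle, so "present" and "online" may disagree.

import multiprocessing

print("cpu_count():", multiprocessing.cpu_count())
with open("/sys/devices/system/cpu/present") as fh:
    print("present:", fh.read().strip())  # e.g. 0-3
with open("/sys/devices/system/cpu/online") as fh:
    print("online:", fh.read().strip())   # may be just 0 while the board is idle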

Keras + Tensorflow: Prediction on multiple gpus

Submitted by 谁都会走 on 2019-12-18 04:37:08
Question: I'm using Keras with TensorFlow as the backend. I have one compiled/trained model. My prediction loop is slow, so I would like to find a way to parallelize the predict_proba calls to speed things up. I would like to take a list of batches (of data) and then, per available GPU, run model.predict_proba() over a subset of those batches. Essentially:

data = [ batch_0, batch_1, ... , batch_N ]

on gpu_0 => return predict_proba(batch_0)
on gpu_1 => return predict_proba(batch_1)
...
on gpu_N => return […]
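A minimal sketch of the one-process-per-GPU approach: each worker pins itself to a single GPU via CUDA_VISIBLE_DEVICES before importing Keras, loads its own copy of the model, and predicts on its share of the batches. The model path, the GPU count, and the random placeholder data are assumptions.

import os
import multiprocessing as mp

def predict_on_gpu(args):
    gpu_id, batch = args
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)  # pin before TensorFlow is imported
    from keras.models import load_model
    model = load_model("model.h5")                    # placeholder path; one copy per worker
    return model.predict_proba(batch)

if __name__ == "__main__":
    import numpy as np
    mp.set_start_method("spawn")                      # children must not inherit a GPU context
    n_gpus = 2                                        # assumption
    batches = [np.random.rand(256, 10) for _ in range(n_gpus)]  # placeholder batches
    with mp.Pool(processes=n_gpus) as pool:
        predictions = pool.map(predict_on_gpu, list(enumerate(batches)))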

How to terminate long-running computation (CPU bound task) in Python using asyncio and concurrent.futures.ProcessPoolExecutor?

Submitted by China☆狼群 on 2019-12-18 04:27:06
Question: Similar question (but the answer does not work for me): How to cancel long-running subprocesses running using concurrent.futures.ProcessPoolExecutor? Unlike the question linked above and the solution provided, in my case the computation itself is rather long (CPU bound) and cannot be run in a loop to check whether some event has happened. Reduced version of the code below:

import asyncio
import concurrent.futures as futures
import time

class Simulator:
    def __init__(self):
        self._loop = None
        self._lmz […]
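The Simulator class is cut off above, but the core constraint is that a future already running inside a ProcessPoolExecutor cannot be cancelled. A minimal sketch of the usual escape hatch: run the CPU-bound work in a plain multiprocessing.Process that can be terminate()d, and let asyncio only wait on it. The busy-loop stand-in and the timeout values are assumptions.

import asyncio
import multiprocessing as mp
import time

def long_computation(seconds):
    # Stand-in for the CPU-bound work: it cannot poll a flag, it just burns CPU.
    t0 = time.time()
    while time.time() - t0 < seconds:
        pass

async def run_with_timeout(seconds, timeout):
    proc = mp.Process(target=long_computation, args=(seconds,))
    proc.start()
    loop = asyncio.get_running_loop()
    try:
        # join() in the default thread executor so the event loop stays responsive
        await asyncio.wait_for(loop.run_in_executor(None, proc.join), timeout)
    except asyncio.TimeoutError:
        proc.terminate()  # kill the worker; this is what cancel() cannot do
        proc.join()
        print("computation killed after %s s" % timeout)

if __name__ == "__main__":
    asyncio.run(run_with_timeout(10.0, 2.0))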

Why is multiprocessing Pool slower than a for loop?

Submitted by 喜你入骨 on 2019-12-18 04:25:06
Question:

from multiprocessing import Pool

def op1(data):
    return [data[elem] + 1 for elem in range(len(data))]

data = [[elem for elem in range(20)] for elem in range(500000)]

import time
start_time = time.time()
re = []
for data_ in data:
    re.append(op1(data_))
print('--- %s seconds ---' % (time.time() - start_time))

start_time = time.time()
pool = Pool(processes=4)
data = pool.map(op1, data)
print('--- %s seconds ---' % (time.time() - start_time))

I get a much slower run time with pool than I get with […]
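The excerpt is cut off, but the usual explanation is that op1 does only microseconds of work per list, so pickling 500000 inputs to the workers and 500000 result lists back costs more than the parallelism saves. A minimal sketch (op2 is an invented, deliberately heavier stand-in, and the input is smaller so both timings stay short) showing the crossover once each task carries real work:

from multiprocessing import Pool
import time

def op1(data):
    return [x + 1 for x in data]

def op2(data):
    # invented stand-in: enough arithmetic per call that IPC stops dominating
    return [sum(i * i for i in range(200)) + x for x in data]

if __name__ == "__main__":
    data = [list(range(20)) for _ in range(50000)]  # smaller than the question's 500000
    for fn in (op1, op2):
        start = time.time()
        serial = [fn(d) for d in data]
        t_serial = time.time() - start
        with Pool(processes=4) as pool:
            start = time.time()
            parallel = pool.map(fn, data)
            t_pool = time.time() - start
        print("%s: serial %.2fs, pool %.2fs" % (fn.__name__, t_serial, t_pool))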