python-multiprocessing

Why are no errors from multiprocessing reported in Python, and how do I switch on error reporting?

試著忘記壹切 submitted on 2019-12-08 07:22:53
Question: I set up some simple code to test problem handling with multiprocessing, but I cannot track down the bug in this code because I get no feedback from the processes. How can I receive exceptions from the subprocesses, since right now I am blind to them? How can I debug this code?

```python
# coding=utf-8
import multiprocessing
import multiprocessing.managers
import logging

def callback(result):
    print multiprocessing.current_process().name, 'callback', result

def worker(io_lock, value):
    # error
    raise RuntimeError()
    result =
```
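One common way to surface worker exceptions in the parent is a minimal sketch like the following (not the asker's code; it assumes Python 3, where `Pool.apply_async` accepts an `error_callback`):

```python
import multiprocessing

def worker(value):
    raise RuntimeError("failure for %r" % value)

def on_error(exc):
    # Runs in the parent process with the exception raised in the worker.
    print("worker failed:", exc)

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        res = pool.apply_async(worker, (1,), error_callback=on_error)
        pool.close()
        pool.join()
    # Alternatively, res.get() re-raises the worker's exception in the parent.
```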

Create new TCP connections for every HTTP request in Python

本秂侑毒 submitted on 2019-12-08 06:01:20
Question: For my college project I am trying to develop a Python-based traffic generator. I have created two CentOS machines on VMware, and I am using one as my client and one as my server machine. I have used the IP aliasing technique to increase the number of clients and servers using just a single client/server machine. Up to now I have created 50 IP aliases on my client machine and 10 IP aliases on my server machine. I am also using the multiprocessing module to generate traffic concurrently from all 50 clients to all 10 servers.
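A minimal sketch of one way to do this (details assumed, not the asker's code): `http.client` opens a fresh TCP connection per `HTTPConnection` object, and `source_address` pins each request to one of the aliased client IPs.

```python
import http.client

def fetch_once(src_ip, host, port=80, path="/"):
    # A new HTTPConnection per call means a new TCP connection per request.
    conn = http.client.HTTPConnection(host, port, source_address=(src_ip, 0))
    conn.request("GET", path, headers={"Connection": "close"})  # no keep-alive
    resp = conn.getresponse()
    body = resp.read()
    conn.close()
    return resp.status, len(body)

# Example: one request from each aliased client IP (addresses hypothetical).
if __name__ == "__main__":
    for i in range(1, 51):
        print(fetch_once("192.168.1.%d" % i, "192.168.2.1"))
```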

Does partial_fit run in parallel in sklearn.decomposition.IncrementalPCA?

南笙酒味 submitted on 2019-12-08 05:33:55
Question: I've followed Imanol Luengo's answer to build a partial fit and transform for sklearn.decomposition.IncrementalPCA. But for some reason it looks (from htop) like it uses all CPU cores at maximum. I could find neither an n_jobs parameter nor anything related to multiprocessing. My question is: if this is the default behavior of these functions, how can I set the number of CPUs, and where can I find information about it? If not, obviously I am doing something wrong in the previous sections of my code. PS:
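IncrementalPCA itself has no n_jobs; the parallelism usually comes from the multithreaded BLAS library behind NumPy. A minimal sketch of capping it with threadpoolctl (assumes that package is installed):

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA
from threadpoolctl import threadpool_limits

X = np.random.rand(1000, 50)
ipca = IncrementalPCA(n_components=10)

with threadpool_limits(limits=1):  # cap BLAS/OpenMP threads for this block
    ipca.partial_fit(X)
```

Setting OMP_NUM_THREADS=1 (or MKL_NUM_THREADS / OPENBLAS_NUM_THREADS, depending on the BLAS) before starting Python has the same effect process-wide.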

How to bind a variable to a thread with concurrent.futures.ThreadPoolExecutor or multiprocessing.pool.ThreadPool?

☆樱花仙子☆ submitted on 2019-12-08 04:09:06
Question: What I want to do is something like this:

```python
import threading
import random

class MyThread(threading.Thread):
    def __init__(self, host, port):
        threading.Thread.__init__(self)
        # self._sock = self.initsocket(host, port)
        self._id = random.randint(0, 100)

    def run(self):
        for i in range(3):
            print("current id: {}".format(self._id))

def main():
    ts = []
    for i in range(5):
        t = MyThread("localhost", 3001)
        t.start()
        ts.append(t)
    for t in ts:
        t.join()
```

I got this output:

current id: 10
current id: 10
current id: 13
current id: 43
current
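With a pool, the equivalent binding is usually done with threading.local, initialized once per worker thread. A minimal sketch (assumes Python 3.7+, where ThreadPoolExecutor gained the initializer parameter):

```python
import random
import threading
from concurrent.futures import ThreadPoolExecutor

tls = threading.local()

def init_worker():
    # Runs once in each pool thread; bind per-thread state here.
    tls.conn_id = random.randint(0, 100)

def task(i):
    return "task {} ran with id {}".format(i, tls.conn_id)

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=5, initializer=init_worker) as ex:
        for line in ex.map(task, range(10)):
            print(line)
```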

Multiprocessing - using the Manager's Namespace to save memory

匆匆过客 submitted on 2019-12-08 03:57:58
Question: I have several processes, each completing tasks which require a single large numpy array; the array is only being read (the processes are searching it for appropriate values). If each process loads the data, I receive a memory error. I am therefore trying to minimise memory usage by using a Manager to share the same array between the processes. However, I still receive a memory error. I can load the array once in the main process, however the moment I try to make it an attribute of the manager
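A Manager does not actually share memory: each access pickles the data over to the requesting process, so every reader still ends up with its own copy. A minimal sketch of true sharing with multiprocessing.shared_memory (Python 3.8+; details assumed, not the asker's code):

```python
import numpy as np
from multiprocessing import Process, shared_memory

def reader(name, shape, dtype):
    shm = shared_memory.SharedMemory(name=name)
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)  # zero-copy view
    print("max value:", arr.max())
    shm.close()

if __name__ == "__main__":
    data = np.arange(1_000_000, dtype=np.float64)
    shm = shared_memory.SharedMemory(create=True, size=data.nbytes)
    shared = np.ndarray(data.shape, dtype=data.dtype, buffer=shm.buf)
    shared[:] = data  # copy into shared memory once
    procs = [Process(target=reader, args=(shm.name, data.shape, data.dtype))
             for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    shm.close()
    shm.unlink()
```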

No output from Process using multiprocessing

我们两清 submitted on 2019-12-07 12:49:58
Question: I am a beginner in multiprocessing; can anyone tell me why this does not produce any output?

```python
import multiprocessing

def worker(num):
    """thread worker function"""
    print('Worker:', num)

if __name__ == '__main__':
    jobs = []
    for i in range(4):
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()
```

Answer 1: You're starting your Process(), but never waiting on it to complete, so your program's execution ends before the background process finishes. Try this, with a call to
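The answer is cut off at this point; presumably it adds join(). A sketch of that likely fix (not the original answer's exact code):

```python
import multiprocessing

def worker(num):
    print('Worker:', num)

if __name__ == '__main__':
    jobs = [multiprocessing.Process(target=worker, args=(i,)) for i in range(4)]
    for p in jobs:
        p.start()
    for p in jobs:
        p.join()  # wait for each child so its output appears before exit
```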

Scraping concurrently with Selenium in Python

ε祈祈猫儿з submitted on 2019-12-07 12:23:31
Question: I am trying to scrape concurrently with the selenium and multiprocessing modules. Below is roughly my approach (see the sketch after the code for one way to realise it):

- create a queue with a number of webdriver instances equal to the number of workers
- create a pool of workers
- each worker pulls a webdriver instance from the queue
- when the function terminates, the webdriver instance is put back on the queue

Here is the code:

```python
#!/usr/bin/env python
# encoding: utf-8
import time
import codecs
from selenium import webdriver
from selenium.webdriver.common.desired_capabilities
```
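A minimal sketch of the queue-of-drivers pattern described above (details assumed; it uses a thread pool rather than processes, since live webdriver instances cannot be pickled across process boundaries):

```python
import queue
from concurrent.futures import ThreadPoolExecutor
from selenium import webdriver

N_WORKERS = 4
drivers = queue.Queue()
for _ in range(N_WORKERS):
    drivers.put(webdriver.Firefox())  # one browser per worker

def scrape(url):
    d = drivers.get()        # check a driver out of the queue
    try:
        d.get(url)
        return d.title
    finally:
        drivers.put(d)       # check it back in when done

urls = ["https://example.com/page%d" % i for i in range(8)]  # placeholder URLs
with ThreadPoolExecutor(max_workers=N_WORKERS) as ex:
    titles = list(ex.map(scrape, urls))

while not drivers.empty():
    drivers.get().quit()     # shut the browsers down
```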

Why does memory consumption increase dramatically in `Pool.map()` multiprocessing?

雨燕双飞 submitted on 2019-12-07 11:41:53
Question: I am doing multiprocessing on a pandas dataframe by splitting it into several dataframes, which are stored as a list. Then, using Pool.map(), I pass the dataframes to a defined function. My input file is about 300 MB, so the small dataframes are roughly 75 MB each. But when the multiprocessing is running, memory consumption increases by 7 GB, and each local process consumes approximately 2 GB of memory. Why is this happening?

```python
def main():
    my_df = pd.read_table("my_file.txt", sep="\t")
    my_df =
```
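A hedged sketch of the setup being described (helper names assumed): each chunk is pickled in the parent and unpickled in the worker, so every process holds its own copy of its chunk plus serialization buffers, which multiplies memory use well beyond the input size.

```python
import numpy as np
import pandas as pd
from multiprocessing import Pool

def process_chunk(df):
    # Stand-in for the real per-chunk work.
    return df.sum(numeric_only=True)

if __name__ == "__main__":
    my_df = pd.DataFrame(np.random.rand(100_000, 10))
    chunks = np.array_split(my_df, 4)   # list of smaller dataframes
    with Pool(processes=4) as pool:
        results = pool.map(process_chunk, chunks)  # each chunk is pickled over
```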

Multiprocessing on a model with a data frame as input

拜拜、爱过 submitted on 2019-12-07 05:43:56
Question: I want to use multiprocessing on a model to get predictions using a data frame as input. I have the following code:

```python
def perform_model_predictions(model, dataFrame, cores=4):
    try:
        with Pool(processes=cores) as pool:
            result = pool.map(model.predict, dataFrame)
            return result
        # return model.predict(dataFrame)
    except AttributeError:
        logging.error("AttributeError occurred", exc_info=True)
```

The error I'm getting is:

raise TypeError("sparse matrix length is ambiguous; use getnnz()"
TypeError: sparse
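A hedged sketch of a likely fix (the model's API is assumed): pool.map iterates over its second argument, so passing the frame itself hands predict one element at a time, which is where the ambiguous-length error on sparse data comes from; mapping over row chunks avoids that.

```python
import logging
import numpy as np
from multiprocessing import Pool

def perform_model_predictions(model, data_frame, cores=4):
    chunks = np.array_split(data_frame, cores)   # one block of rows per worker
    try:
        with Pool(processes=cores) as pool:
            parts = pool.map(model.predict, chunks)  # model must be picklable
        return np.concatenate(parts)
    except AttributeError:
        logging.error("AttributeError occurred", exc_info=True)
```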

multiprocessing on tee'd generators

喜欢而已 submitted on 2019-12-07 04:13:46
Question: Consider the following script, in which I test two ways of performing some calculations on generators obtained by itertools.tee:

```python
#!/usr/bin/env python3
from sys import argv
from itertools import tee
from multiprocessing import Process

def my_generator():
    for i in range(5):
        print(i)
        yield i

def double(x):
    return 2 * x

def compute_double_sum(iterable):
    s = sum(map(double, iterable))
    print(s)

def square(x):
    return x * x

def compute_square_sum(iterable):
    s = sum(map(square, iterable))
    print(s)

g1
```
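The script is cut off at g1; presumably it continues by tee'ing the generator and handing one branch to each process. A hedged sketch of that continuation (assumed, and it relies on the fork start method on Unix, since generators cannot be pickled under spawn):

```python
from itertools import tee
from multiprocessing import Process

def my_generator():
    for i in range(5):
        yield i

def compute_double_sum(it):
    print(sum(2 * x for x in it))

def compute_square_sum(it):
    print(sum(x * x for x in it))

if __name__ == "__main__":
    g1, g2 = tee(my_generator())
    p1 = Process(target=compute_double_sum, args=(g1,))
    p2 = Process(target=compute_square_sum, args=(g2,))
    for p in (p1, p2):
        p.start()
    for p in (p1, p2):
        p.join()
    # Under fork, each child gets its own copy of the tee'd iterator,
    # so the two sums are computed independently.
```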