python-multiprocessing

Python multiprocessing on a generator that reads files in

Submitted by 拜拜、爱过 on 2019-12-12 08:45:18
Question: I am trying to read and process thousands of files, but unfortunately it takes about 3x as long to process a file as it does to read it in from disk, so I would like to process these files as they are read in (and while I continue to read in additional files). In a perfect world, I have a generator which reads one file at a time, and I would like to pass this generator to a pool of workers which process items from the generator as they are (slowly) generated. Here's an example: def process …
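A minimal sketch of the usual approach, assuming placeholder file paths and a stand-in processing step: hand the generator to Pool.imap so items are consumed from it while workers process earlier files in parallel.

```python
# Sketch only: file paths and the processing step are placeholders.
import glob
from multiprocessing import Pool

def read_files(paths):
    for path in paths:
        with open(path, 'rb') as fh:
            yield path, fh.read()        # files are read lazily, one at a time

def process(item):
    path, data = item
    return path, len(data)               # stand-in for the real (slow) processing

if __name__ == '__main__':
    paths = glob.glob('data/*.txt')       # hypothetical input location
    with Pool(4) as pool:
        # The generator is consumed by the pool's feeder thread while workers
        # process earlier items, so reading and processing overlap.
        for path, size in pool.imap(process, read_files(paths)):
            print(path, size)
```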

Python - How to pass global variable to multiprocessing.Process?

Submitted by 社会主义新天地 on 2019-12-12 06:09:21
Question: I need to terminate some processes after a while, so I spawn another process that sleeps for the waiting period. But the new process doesn't seem to have access to global variables from the main process. How could I solve this? Code: import os from subprocess import Popen, PIPE import time import multiprocessing log_file = open('stdout.log', 'a') log_file.flush() err_file = open('stderr.log', 'a') err_file.flush() processes = [] def processing(): print "processing" global processes global log …
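A minimal sketch of one way around this, with made-up process ids and timeout: on platforms that spawn rather than fork, a child Process does not see later changes to the parent's globals, so pass whatever the child needs explicitly via args.

```python
# Sketch only: the watchdog function, pids and timeout are illustrative.
import time
from multiprocessing import Process

def watchdog(pids, timeout):
    time.sleep(timeout)                        # wait in the child process
    print('time is up, would terminate:', pids)

if __name__ == '__main__':
    pids = [1234, 5678]                        # hypothetical process ids to watch
    p = Process(target=watchdog, args=(pids, 5.0))
    p.start()
    p.join()
```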

multiprocessing/threading: data appending & output return

Submitted by 余生颓废 on 2019-12-12 05:26:13
Question: I have a lengthy function called run below that contains a few instances of appending data. from multiprocessing import Process data = [] def run(): global data ... data.append(trace) ... if __name__ == '__main__': jobs = [] gen_count = 0 leaked_count = 0 system_count = 0 N = 100 for i in range(N): p = Process(target=run) jobs.append(p) p.start() However, with multiprocessing no data is appended. In addition, the function run returns several values that need to be added to gen_count, leaked …
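A minimal sketch of the usual fix, with stand-in data: each child process gets its own copy of the module globals, so appends made in a child never reach the parent; have the workers return their results and aggregate them in the parent, for example with a Pool.

```python
# Sketch only: run() below stands in for the real lengthy function.
from multiprocessing import Pool

def run(i):
    trace = [i, i * 2]                   # stand-in trace data
    gen, leaked, system = 1, 0, 2        # stand-in per-run counters
    return trace, gen, leaked, system

if __name__ == '__main__':
    data, gen_count, leaked_count, system_count = [], 0, 0, 0
    with Pool() as pool:
        for trace, gen, leaked, system in pool.map(run, range(100)):
            data.append(trace)           # aggregation happens in the parent
            gen_count += gen
            leaked_count += leaked
            system_count += system
    print(len(data), gen_count, leaked_count, system_count)
```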

Thread identifier in multiprocessing pool workers

Submitted by 时光怂恿深爱的人放手 on 2019-12-12 05:06:47
Question: I believed Thread.ident to be a unique identifier for threads, but now I see different worker processes in multiprocessing.pool.Pool reporting the same thread identifier via threading.current_thread().ident. How? Answer 1: Depending on the platform, the ids may or may not be unique. The important thing to note here is that the Python multiprocessing library actually uses processes instead of threads, so comparing thread ids between processes is really a platform-specific implementation …
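A small sketch that makes the answer concrete: print both the process id and the thread ident inside each pool worker; separate worker processes may report the same thread ident, so os.getpid() (or multiprocessing.current_process()) is the reliable way to tell workers apart.

```python
# Sketch only: shows pid vs. thread ident inside Pool workers.
import os
import threading
from multiprocessing import Pool

def who_am_i(_):
    return os.getpid(), threading.get_ident()

if __name__ == '__main__':
    with Pool(4) as pool:
        for pid, tid in pool.map(who_am_i, range(8)):
            print('pid=%d thread_ident=%d' % (pid, tid))
```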

multiprocessing.Array Segmentation fault

Submitted by 不打扰是莪最后的温柔 on 2019-12-12 05:06:05
Question: from multiprocessing import Value from multiprocessing.sharedctypes import RawArray from ctypes import addressof, c_char array = RawArray(c_char, b'shirt') array_length = array._length_ result_address = Value('i', 0) result_address.value = addressof(array) print((array._type_*array_length).from_address(result_address.value).value) In one process I pass result_address of type multiprocessing.Value to a child process, which then creates an array in shared memory and writes its address back to …
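A minimal sketch of the safer pattern, assuming a fixed-size buffer: a raw ctypes address is only meaningful inside the process that obtained it, so rather than shipping addresses between processes, allocate the shared array up front and pass the array object itself to the child.

```python
# Sketch only: buffer size and contents are illustrative.
from ctypes import c_char
from multiprocessing import Process
from multiprocessing.sharedctypes import RawArray

def writer(buf):
    buf.value = b'shirt'                 # the child writes into shared memory

if __name__ == '__main__':
    buf = RawArray(c_char, 5)            # allocated once, shared with the child
    p = Process(target=writer, args=(buf,))
    p.start()
    p.join()
    print(buf.raw)                       # b'shirt', written by the child
```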

Passing multiple arguments to pool.map using class function

Submitted by 倖福魔咒の on 2019-12-12 03:54:51
Question: I'm trying to thread as described in this post, and also pass multiple arguments in Python 2.7 through a work-around described here. Right now I have something like this, a function that is part of the class pair_scraper: def pool_threading(self): pool = ThreadPool(4) for username in self.username_list: master_list = pool.map(self.length_scraper2, itertools.izip(username*len(self.repo_list), itertools.repeat(self.repo_list))) def length_scraper2(self, username, repo): #code However, when I run …
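A minimal sketch of the idea with a hypothetical class and data (Python 3 shown): pool.map passes exactly one argument per call, so build the (username, repo) pairs first and let starmap unpack each pair into the two-argument method.

```python
# Sketch only: class name, data and the scraping step are placeholders.
from itertools import product
from multiprocessing.pool import ThreadPool

class PairScraper:
    def __init__(self, usernames, repos):
        self.username_list = usernames
        self.repo_list = repos

    def length_scraper2(self, username, repo):
        return len(username) + len(repo)         # stand-in for the real scraping

    def pool_threading(self):
        with ThreadPool(4) as pool:
            pairs = product(self.username_list, self.repo_list)
            return pool.starmap(self.length_scraper2, pairs)

if __name__ == '__main__':
    scraper = PairScraper(['alice', 'bob'], ['repo1', 'repo2'])
    print(scraper.pool_threading())
```

Since starmap was only added in Python 3.3, a Python 2.7 version needs the work-around the question links to: a wrapper that accepts a single tuple and unpacks it before calling the method.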

Python ThreadPool with limited task queue size

Submitted by 荒凉一梦 on 2019-12-12 03:53:09
Question: My problem is the following: I have a multiprocessing.pool.ThreadPool object with worker_count workers and a main pqueue from which I feed tasks to the pool. The flow is as follows: there is a main loop that gets an item of level level from pqueue and submits it to the pool using apply_async. When the item is processed, it generates items of level + 1. The problem is that the pool accepts all tasks and processes them in the order they were submitted. More precisely, what is happening is …
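A minimal sketch of one common workaround, with an assumed in-flight limit and a stand-in task: the pool's internal task queue is unbounded, so throttle submissions yourself with a semaphore that is acquired before apply_async and released when each task completes.

```python
# Sketch only: worker_count, the in-flight limit and work() are assumptions.
import threading
from multiprocessing.pool import ThreadPool

worker_count = 4
slots = threading.BoundedSemaphore(worker_count * 2)   # at most 8 tasks pending

def work(item):
    return item * item                                  # stand-in for the real task

def release_slot(_):
    slots.release()                                     # frees a slot when a task ends

if __name__ == '__main__':
    pool = ThreadPool(worker_count)
    for item in range(20):                              # stand-in for items from pqueue
        slots.acquire()                                 # blocks once the limit is reached
        pool.apply_async(work, (item,),
                         callback=release_slot, error_callback=release_slot)
    pool.close()
    pool.join()
```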

Parallel compute task to brute-force in python

Submitted by ﹥>﹥吖頭↗ on 2019-12-12 03:22:41
Question: /* This is not for anything illegal, just that my school only uses 7 integers, and I want to see if I can get this to work in time, as currently I would need 1.59 years to crack a password. The school has its own private server on site, for anyone concerned, and it's easily detectable. I'll do this only to my own account or my friends' with their permission. */ I just wanted to use multiprocessing or concurrent.futures to make this password cracker run in reasonable time. Here is my attempt at parallelizing it …

python - ImportError: cannot import name Pool

Submitted by 我只是一个虾纸丫 on 2019-12-12 03:13:32
Question: Code here: from multiprocessing import pool def worker(num): print 'Worker:', num return if __name__ == '__main__': jobs = [] for i in range(5): p = multiprocessing.Process(target=worker, args=(i,)) jobs.append(p) p.start() Sorry, I am new to Python. I am getting the below error whenever I try to import pool. It says something is wrong with os.chdir(wdir) but I can't figure out what. Any help? Traceback (most recent call last): File "<stdin>", line 1, in <module> File "C:\Users\z080302\Desktop …
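A minimal sketch of the script with consistent imports: the class is Pool with a capital P (the lowercase name multiprocessing.pool is the module), and multiprocessing itself must be imported before multiprocessing.Process can be referenced.

```python
# Sketch only: mirrors the quoted script with the imports straightened out.
import multiprocessing
from multiprocessing import Pool   # correct spelling of the class; unused below

def worker(num):
    print('Worker:', num)

if __name__ == '__main__':
    jobs = []
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        jobs.append(p)
        p.start()
    for p in jobs:
        p.join()
```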

Python: Multiprocessing Map takes longer to complete last few processes

Submitted by ╄→尐↘猪︶ㄣ on 2019-12-12 02:56:56
Question: In Python, I'm trying to run 150-200 processes. I have these 150 things in an array, and I've split this array up into multiple arrays of 10 elements each. Now, I run a multiprocessing map with 10 elements at a time. Once all 10 are complete, we go on to the next 10, and so on. Now, the problem: the ninth and tenth processes are almost ALWAYS slower than the rest. Is there a reason for that? Am I not doing this in the most efficient way? ** I won't be able to share the code for this. So do you …
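A minimal sketch of an alternative, with a simulated task: rather than hand-splitting the items into fixed batches of 10 and waiting on each batch (where the slowest one or two items stall the whole group), hand the full list to one Pool of 10 workers and consume results as they finish.

```python
# Sketch only: work() simulates tasks of uneven duration.
import random
import time
from multiprocessing import Pool

def work(item):
    time.sleep(random.uniform(0.1, 0.5))      # uneven task durations
    return item

if __name__ == '__main__':
    items = list(range(150))
    with Pool(10) as pool:
        # imap_unordered yields results as soon as any worker finishes,
        # so stragglers never hold up an artificial batch boundary.
        for result in pool.imap_unordered(work, items):
            print('finished', result)
```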