python-multiprocessing

Python multiprocessing - How can I split workload to get speed improvement?

Submitted by 时光怂恿深爱的人放手 on 2019-12-09 03:21:34
Question: I am writing simple code to crop images and save them. The problem is that there are about 150,000+ images and I want to improve the speed. So at first I wrote the code with a simple for loop, like the following:

    import cv2
    import numpy
    import sys

    textfile = sys.argv[1]
    file_list = open(textfile)
    files = file_list.read().split('\n')
    idx = 0
    for eachfile in files:
        image = cv2.imread(eachfile)
        idx += 1
        if image is None:
            pass
        outName = eachfile.replace('/data', '/changed_data')
        if image.shape[0]
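One straightforward way to split this workload is a multiprocessing.Pool that maps a per-file crop function over the file list, since every image is independent. A minimal sketch, assuming the same /data → /changed_data layout from the question; the crop region is a hypothetical placeholder:

    import sys
    import cv2
    from multiprocessing import Pool

    def crop_and_save(path):
        """Read one image, crop it, and write it to the mirrored output path."""
        image = cv2.imread(path)
        if image is None:                    # skip unreadable files and blank lines
            return
        out_name = path.replace('/data', '/changed_data')
        cropped = image[0:100, 0:100]        # hypothetical crop region
        cv2.imwrite(out_name, cropped)       # assumes /changed_data dirs exist

    if __name__ == '__main__':
        with open(sys.argv[1]) as f:
            files = [line.strip() for line in f if line.strip()]
        with Pool() as pool:                 # defaults to one worker per CPU core
            pool.map(crop_and_save, files, chunksize=64)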

Can't read/write to files using multithreading in python

Submitted by 元气小坏坏 on 2019-12-08 22:32:13
Question: I have an input file which contains a long list of URLs. Let's assume it is mylines.txt:

    https://yahoo.com
    https://google.com
    https://facebook.com
    https://twitter.com

What I need to do is:

1) Read a line from the input file mylines.txt.
2) Execute the myFun function, which will perform some tasks and produce one line of output. (It is more complex in my real code, but something like this in concept.)
3) Write the output to the results.txt file.

Since I have a large input, I need to
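If the per-URL work is I/O-bound, threads are a reasonable fit. A minimal sketch of one common pattern, where myFun here is a placeholder for the real work: a ThreadPoolExecutor runs myFun concurrently while only the main thread writes to results.txt, so no file lock is needed:

    from concurrent.futures import ThreadPoolExecutor

    def myFun(url):
        # Placeholder for the real per-URL work; returns one output line.
        return 'processed ' + url

    if __name__ == '__main__':
        with open('mylines.txt') as f:
            urls = [line.strip() for line in f if line.strip()]
        with ThreadPoolExecutor(max_workers=8) as pool, \
                open('results.txt', 'w') as out:
            # pool.map yields results in input order as workers finish them.
            for result in pool.map(myFun, urls):
                out.write(result + '\n')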

Python child process silently crashes when issuing an HTTP request

Submitted by 馋奶兔 on 2019-12-08 21:26:32
Question: I'm running into an issue when combining multiprocessing, requests (or urllib2) and nltk. Here is a very simple piece of code:

    >>> from multiprocessing import Process
    >>> import requests
    >>> from pprint import pprint
    >>> Process(target=lambda: pprint(requests.get('https://api.github.com'))).start()
    >>> <Response [200]>  # this is the response displayed by the call to `pprint`

A bit more detail on what this piece of code does:

- Import a few required modules
- Start a child process
- Issue an HTTP GET
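A frequently reported cause for this combination is fork safety: the default fork start method gives the child a copy of everything the parent has already initialized (nltk's imports and, on macOS, system frameworks), and some of that state is not safe to use after fork(), so the child dies silently. A sketch of the usual workaround, switching to the 'spawn' start method:

    import multiprocessing
    import requests
    from pprint import pprint

    def fetch():
        pprint(requests.get('https://api.github.com'))

    if __name__ == '__main__':
        # 'spawn' starts a fresh interpreter instead of fork()ing the parent,
        # avoiding inherited state that is not fork-safe.
        multiprocessing.set_start_method('spawn')
        p = multiprocessing.Process(target=fetch)
        p.start()
        p.join()
        print('child exit code:', p.exitcode)   # non-zero indicates a crash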

How to terminate a multiprocess in python when a given condition is met? [duplicate]

Submitted by 假如想象 on 2019-12-08 19:45:33
This question already has answers here: Terminate a Python multiprocessing program once one of its workers meets a certain condition (4 answers). Closed last year.

Let's say I have the function:

    def f():
        while True:
            x = generate_something()
            if x == condition:
                return x

    if __name__ == '__main__':
        p = Pool(4)

I want to run this function across multiple processes, and when one of the processes meets my function's condition, I want all other processes to stop.

You can use an Event and terminate() in multiprocessing, since you want to stop all processes once the condition is met in one of the child processes. Check the
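A sketch of that Event-based pattern; generate_something() and the stopping condition are replaced by hypothetical stand-ins:

    import multiprocessing
    import random

    def worker(event, results):
        # Loop until this worker finds a match or another worker signals one.
        while not event.is_set():
            x = random.randint(0, 10000)     # stand-in for generate_something()
            if x == 42:                      # stand-in for the real condition
                results.put(x)
                event.set()                  # tell every other worker to stop
                return

    if __name__ == '__main__':
        event = multiprocessing.Event()
        results = multiprocessing.Queue()
        procs = [multiprocessing.Process(target=worker, args=(event, results))
                 for _ in range(4)]
        for p in procs:
            p.start()
        print('first result:', results.get())   # blocks until some worker wins
        for p in procs:
            p.join()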

Multiprocessing and Selenium Python

Submitted by 亡梦爱人 on 2019-12-08 19:42:18
Question: I have 3 drivers (Firefox browsers) and I want them to do something on a list of websites. I have a worker defined as:

    def worker(browser, queue):
        while True:
            id_ = queue.get(True)
            obj = ReviewID(id_)
            obj.search(browser)
            if obj.exists(browser):
                print(obj.get_url(browser))
            else:
                print("Nothing")

So the worker will just access a queue that contains the ids and use the browser to do something. I want to have a pool of workers so that as soon as a worker has finished using the browser to do
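Selenium driver objects can't be pickled, so they can't be handed out through a Pool; the usual pattern is one long-lived Process per browser, each creating its own driver and draining a shared queue of ids. A minimal sketch with the browser work stubbed out (the ReviewID calls would go where the comment is):

    import multiprocessing

    def worker(name, queue):
        # In the real code, create this process's own webdriver here,
        # e.g. browser = webdriver.Firefox(); each process must build
        # its own driver because drivers can't cross process boundaries.
        while True:
            id_ = queue.get()
            if id_ is None:                  # sentinel: no more work
                break
            # obj = ReviewID(id_); obj.search(browser); ...
            print(name, 'handled', id_)

    if __name__ == '__main__':
        queue = multiprocessing.Queue()
        for id_ in range(20):                # hypothetical review ids
            queue.put(id_)
        procs = [multiprocessing.Process(target=worker,
                                         args=('firefox-%d' % i, queue))
                 for i in range(3)]
        for p in procs:
            p.start()
        for _ in procs:
            queue.put(None)                  # one sentinel per worker
        for p in procs:
            p.join()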

psycopg2 error: DatabaseError: error with no message from the libpq

Submitted by 扶醉桌前 on 2019-12-08 15:20:36
Question: I have an application that parses and loads data from csv files into a Postgres 9.3 database. In serial execution the insert statements/cursor executions work without an issue. I added celery into the mix to parse and insert the data files in parallel. Parsing works fine. However, when I go to run the insert statements I get:

    [2015-05-13 11:30:16,464: ERROR/Worker-1] ingest_task.work_it: Exception
    Traceback (most recent call last):
      File "ingest_tasks.py", line 86, in work_it
        rowcount = ingest
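A common cause of "error with no message from the libpq" is a psycopg2 connection opened before celery forks its worker processes: libpq connections are not fork-safe, and several workers end up talking over the same inherited socket. A sketch of the usual fix, opening a fresh connection inside the task (the DSN and table here are hypothetical):

    import psycopg2

    def work_it(rows):
        # Connect inside the task so each forked worker gets its own libpq
        # connection instead of sharing one inherited from the parent.
        conn = psycopg2.connect(dbname='mydb', user='me')   # hypothetical DSN
        try:
            with conn:                       # commits on success, rolls back on error
                with conn.cursor() as cur:
                    cur.executemany(
                        'INSERT INTO ingest (a, b) VALUES (%s, %s)',  # hypothetical table
                        rows,
                    )
        finally:
            conn.close()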

Python multiprocessing pool map with multiple arguments [duplicate]

Submitted by 梦想的初衷 on 2019-12-08 14:44:43
This question already has answers here: Python multiprocessing pool.map for multiple arguments (18 answers). Closed 2 years ago.

I have a function to be called from a multiprocessing pool.map with multiple arguments.

    from multiprocessing import Pool
    import time

    def printed(num, num2):
        print 'here now '
        return num

    class A(object):
        def __init__(self):
            self.pool = Pool(8)

        def callme(self):
            print self.pool.map(printed, (1, 2), (3, 4))

    if __name__ == '__main__':
        aa = A()
        aa.callme()

but it gives me the following error:

    TypeError: printed() takes exactly 2 arguments (1 given)

I have tried solutions from other
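Pool.map calls the function with a single argument taken from one iterable, so printed() receives only 1, hence the TypeError. In Python 3, Pool.starmap unpacks tuples of arguments; a minimal sketch (in Python 2, a common workaround is a one-argument wrapper that unpacks the tuple itself):

    from multiprocessing import Pool

    def printed(num, num2):
        print('here now', num, num2)
        return num

    if __name__ == '__main__':
        with Pool(8) as pool:
            # starmap unpacks each tuple into (num, num2); plain map would
            # pass the whole tuple as one argument.
            results = pool.starmap(printed, [(1, 3), (2, 4)])
        print(results)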

Can you iterate a DataFrame without copying memory?

Submitted by 早过忘川 on 2019-12-08 12:48:18
Question: I am launching a Process that fetches a couple of gigabytes of data from a database into a DataFrame with a date index. From there I create a Manager to store that data and call a function using a Pool to utilize the CPU cores. Since I have so much data, I need to use shared memory for the pooled methods to work on.

    import multiprocessing as mp
    import pandas as pd

    # Launched from __main__
    class MyProcess(mp.Process):
        def calculate():
            # Get and preprocess data
            allMyData = <fetch from Mongo>
            aggs =
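A Manager proxy pickles data on every access, so it copies rather than shares. One way to iterate without copying (a sketch, assuming Python 3.8+ and a single homogeneous dtype) is to place the DataFrame's underlying NumPy array in multiprocessing.shared_memory and rebuild a view of it in each worker:

    import multiprocessing as mp
    from multiprocessing import shared_memory

    import numpy as np
    import pandas as pd

    def worker(shm_name, shape, dtype, columns):
        # Rebuild a DataFrame over the parent's shared block; for a
        # single-dtype 2-D array pandas wraps it without copying.
        shm = shared_memory.SharedMemory(name=shm_name)
        arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
        df = pd.DataFrame(arr, columns=columns)
        total = df['a'].sum()
        del df, arr                          # drop views before closing the block
        shm.close()
        return total

    if __name__ == '__main__':
        source = pd.DataFrame(np.random.rand(1000, 2), columns=['a', 'b'])
        values = source.to_numpy()
        shm = shared_memory.SharedMemory(create=True, size=values.nbytes)
        shared = np.ndarray(values.shape, dtype=values.dtype, buffer=shm.buf)
        shared[:] = values                   # the one and only copy
        try:
            with mp.Pool(4) as pool:
                args = (shm.name, values.shape, values.dtype,
                        list(source.columns))
                print(pool.apply_async(worker, args).get())
        finally:
            del shared
            shm.close()
            shm.unlink()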

Why are no errors from multiprocessing reported in Python, and how do I switch on error reporting?

Submitted by 落爺英雄遲暮 on 2019-12-08 08:49:31
I set up some simple code to test error handling with multiprocessing, and I cannot track down the bug in this code because there is no feedback from the processes. How can I receive exceptions from the subprocesses? Right now I am blind to them. How do I debug this code?

    # coding=utf-8
    import multiprocessing
    import multiprocessing.managers
    import logging

    def callback(result):
        print multiprocessing.current_process().name, 'callback', result

    def worker(io_lock, value):
        # error
        raise RuntimeError()
        result = value + 1
        with io_lock:
            print multiprocessing.current_process().name, value, result
        return result

    def main
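The usual explanation: with Pool.apply_async, a worker's exception is captured and stored on the AsyncResult object, and nothing is reported unless the parent calls .get(), which re-raises it (Python 3 also accepts an error_callback). A minimal Python 3 sketch:

    import multiprocessing

    def worker(value):
        raise RuntimeError('boom from worker %d' % value)

    if __name__ == '__main__':
        with multiprocessing.Pool(2) as pool:
            # error_callback fires with the exception instance (Python 3 only);
            # without it, or without .get(), the failure stays invisible.
            results = [pool.apply_async(worker, (v,), error_callback=print)
                       for v in range(2)]
            for r in results:
                try:
                    r.get(timeout=10)    # re-raises the worker's exception here
                except RuntimeError as exc:
                    print('worker failed:', exc)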

Share object state across processes?

Submitted by 百般思念 on 2019-12-08 08:08:17
Question: In the code below, how do I make the Starter object able to read gen.vals? It seems like a different object gets created whose state gets updated, but Starter never knows about it. Also, how would the solution apply if self.vals were a dictionary, or any other kind of object?

    import multiprocessing
    import time

    class Generator(multiprocessing.Process):
        def __init__(self):
            self.vals = []
            super(Generator, self).__init__()

        def run(self):
            i = 0
            while True:
                time.sleep(1)
                self.vals.append(i)
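Because run() executes in a child process, self.vals.append(i) mutates the child's copy of the object; the parent's Generator never changes. The standard fix is a multiprocessing.Manager proxy, and the same pattern covers the dictionary case via manager.dict(). A minimal sketch:

    import multiprocessing
    import time

    class Generator(multiprocessing.Process):
        def __init__(self, vals):
            super(Generator, self).__init__()
            self.vals = vals                 # a manager proxy shared between processes

        def run(self):
            for i in range(5):
                time.sleep(1)
                self.vals.append(i)          # mutates the shared list via the proxy

    if __name__ == '__main__':
        manager = multiprocessing.Manager()
        vals = manager.list()                # manager.dict() works the same way
        gen = Generator(vals)
        gen.start()
        gen.join()
        print(list(vals))                    # the parent now sees [0, 1, 2, 3, 4]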