concurrent.futures

How to spawn a future only if a free worker is available

余生颓废 submitted on 2019-12-01 01:44:47

I am trying to send information extracted from the lines of a big file to a process running on some server. To speed this up, I would like to do this with several threads in parallel. Using the Python 2.7 backport of concurrent.futures I tried this:

```python
f = open("big_file")
with ThreadPoolExecutor(max_workers=4) as e:
    for line in f:
        e.submit(send_line_function, line)
f.close()
```

However, this is problematic, because all futures get submitted instantly, so that my machine runs out of memory, because the complete file gets loaded into memory. My question is whether there is an easy way to only submit a new…
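A common remedy is to throttle submission with a semaphore that is released from a done-callback, so that at most a fixed number of futures are pending at once. A minimal sketch, with assumptions: the file read is replaced by an in-memory generator, and `send_line` is a hypothetical stand-in for the asker's `send_line_function`:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

MAX_PENDING = 8            # cap on queued + running tasks (chosen arbitrarily)
slots = threading.Semaphore(MAX_PENDING)
results = []

def send_line(line):
    return line.upper()    # hypothetical stand-in for the real send_line_function

def release_slot(future):
    slots.release()        # free a slot as soon as this task finishes
    results.append(future.result())

lines = (f"line {i}\n" for i in range(1000))  # stands in for iterating the big file

with ThreadPoolExecutor(max_workers=4) as ex:
    for line in lines:
        slots.acquire()    # blocks here once MAX_PENDING tasks are in flight
        ex.submit(send_line, line).add_done_callback(release_slot)

print(len(results))
```

Because `slots.acquire()` blocks the reading loop whenever the cap is reached, only a bounded window of lines is ever held in memory at once.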

Python ThreadPoolExecutor - is the callback guaranteed to run in the same thread as submitted func?

我的未来我决定 submitted on 2019-11-30 17:52:38

In the ThreadPoolExecutor (TPE), is the callback always guaranteed to run in the same thread as the submitted function? For example, I tested this with the following code. I ran it many times, and it seemed like func and callback always ran in the same thread.

```python
import concurrent.futures
import random
import threading
import time

executor = concurrent.futures.ThreadPoolExecutor(max_workers=3)

def func(x):
    time.sleep(random.random())
    return threading.current_thread().name

def callback(future):
    time.sleep(random.random())
    x = future.result()
    cur_thread = threading.current_thread().name
    if (cur…
```
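The guarantee is weaker than that. Per the concurrent.futures documentation, a callback added to a future that is not yet done runs in the thread that completes it (which is why func and callback usually share a worker thread), but a callback added to an already finished future runs immediately in whatever thread called add_done_callback(). A small sketch of the second case:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

record = {}

def func(x):
    record["func_thread"] = threading.current_thread().name
    return x

def callback(future):
    record["cb_thread"] = threading.current_thread().name

with ThreadPoolExecutor(max_workers=1) as ex:
    fut = ex.submit(func, 42)
    fut.result()                     # block until the future is already done
    fut.add_done_callback(callback)  # done future: callback runs right here

# func ran on a worker thread; the callback ran on the main thread.
print(record["func_thread"], record["cb_thread"])
```

So the observed "same thread" behavior is the common case, not a guarantee.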

What are the advantages of concurrent.futures over multiprocessing in Python?

大城市里の小女人 submitted on 2019-11-30 13:43:23

Question: I'm writing an app in Python and I need to run some tasks simultaneously. The multiprocessing module offers the Process class, and the concurrent.futures module has the ProcessPoolExecutor class. Both seem to use multiple processes to execute their tasks, but their APIs are different. Why should I use one over the other? I know that concurrent.futures was added in Python 3, so I guess it's better?

Answer 1: The motivations for concurrent.futures are covered in the PEP. In my practical experience…

What's the difference between python's multiprocessing and concurrent.futures?

末鹿安然 submitted on 2019-11-30 11:33:28

A simple way of implementing multiprocessing in Python is:

```python
from multiprocessing import Pool

def calculate(number):
    return number

if __name__ == '__main__':
    pool = Pool()
    result = pool.map(calculate, range(4))
```

An alternative implementation based on futures is:

```python
from concurrent.futures import ProcessPoolExecutor

def calculate(number):
    return number

with ProcessPoolExecutor() as executor:
    result = executor.map(calculate, range(4))
```

Both alternatives do essentially the same thing, but one striking difference is that we don't have to guard the code with the usual if __name__ == '__main__' clause. Is…
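Beyond the guard question, one behavioral difference worth knowing is that Pool.map returns a fully materialized list, while Executor.map returns a lazy iterator that yields results as you consume it. A sketch using ThreadPoolExecutor (the Executor API is the same for the process-based variant):

```python
from concurrent.futures import ThreadPoolExecutor

def calculate(number):
    return number

with ThreadPoolExecutor() as executor:
    mapped = executor.map(calculate, range(4))
    # mapped is a generator-style iterator, not a list; results arrive lazily
    result = list(mapped)

print(result)
```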

ProcessPoolExecutor from concurrent.futures way slower than multiprocessing.Pool

血红的双手。 submitted on 2019-11-30 10:29:39

Question: I was experimenting with the shiny new concurrent.futures module introduced in Python 3.2, and I've noticed that, with almost identical code, using the pool from concurrent.futures is way slower than using multiprocessing.Pool. This is the version using multiprocessing:

```python
def hard_work(n):
    # Real hard work here
    pass

if __name__ == '__main__':
    from multiprocessing import Pool, cpu_count
    try:
        workers = cpu_count()
    except NotImplementedError:
        workers = 1
    pool = Pool(processes=workers)
    result =
```

How to detect exceptions in concurrent.futures in Python3?

时间秒杀一切 submitted on 2019-11-30 08:28:43

Question: I have just moved on to Python 3 because of its concurrent.futures module. I was wondering whether I could get it to detect errors. I want to use concurrent.futures for parallel programming; if there are more efficient modules, please let me know. I do not like multiprocessing, as it is too complicated and not much documentation is available. It would be great, however, if someone could write a "Hello World" using only functions, not classes, with multiprocessing for parallel computation, so that it is easy to…
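A future captures any exception raised in the worker and re-raises it when result() is called, so error detection is a plain try/except around that call. A minimal sketch (the failing function and its inputs are made up for illustration):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def risky(n):
    if n == 3:
        raise ValueError(f"bad input: {n}")
    return n * 2

errors, results = [], []
with ThreadPoolExecutor(max_workers=2) as ex:
    futures = [ex.submit(risky, n) for n in range(5)]
    for fut in as_completed(futures):
        try:
            results.append(fut.result())  # re-raises the worker's exception here
        except ValueError as exc:
            errors.append(str(exc))

print(sorted(results), errors)
```

The same pattern works unchanged with ProcessPoolExecutor.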

Python concurrent.futures: how to make it cancelable?

别来无恙 submitted on 2019-11-30 08:19:58

Question: Python's concurrent.futures and ProcessPoolExecutor provide a neat interface to schedule and monitor tasks. Futures even provide a .cancel() method:

cancel(): Attempt to cancel the call. If the call is currently being executed and cannot be cancelled, then the method will return False; otherwise the call will be cancelled and the method will return True.

Unfortunately, in a similar question (concerning asyncio) the answer claims that running tasks are uncancelable, using this snippet of the…
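The documented behavior can be checked directly: with a single worker pinned down, a queued (pending) future cancels, while the one already running refuses. A small sketch using an event to hold the worker:

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

release = threading.Event()

def blocker():
    release.wait()                 # hold the single worker until we say so

ex = ThreadPoolExecutor(max_workers=1)
running = ex.submit(blocker)       # starts executing immediately
queued = ex.submit(blocker)        # waits behind it in the queue
time.sleep(0.2)                    # give the first task time to start

could_cancel_running = running.cancel()  # False: already executing
could_cancel_queued = queued.cancel()    # True: never started
release.set()
ex.shutdown()

print(could_cancel_running, could_cancel_queued)
```

So cancel() only prevents a pending call from starting; it cannot interrupt one that is already in flight.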

Is concurrent.futures a medicine for the GIL?

人盡茶涼 submitted on 2019-11-30 05:19:16

I was just reading about this new implementation. I use Python 2.7, so I would have to install this module; if I use it, can I forget about the GIL on CPython?

No, concurrent.futures has almost nothing whatsoever to do with the GIL. Using processes instead of threads is medicine for the GIL. (Of course, like all medicine, it has side effects. But it works.) The futures module just gives you a simpler way to schedule and wait on tasks than using threading or multiprocessing directly. And it has the added advantage that you can swap between a thread pool and a process pool (and maybe even a greenlet loop,…

What is the difference between concurrent.futures and asyncio.futures?

非 Y 不嫁゛ submitted on 2019-11-30 04:36:38

To clarify the reason for this question: it is confusing to have two modules with the same name. What do they represent that makes them distinct? What task(s) can one solve that the other can't, and vice versa?

The asyncio documentation covers the differences:

class asyncio.Future(*, loop=None)

This class is almost compatible with concurrent.futures.Future. Differences:

- result() and exception() do not take a timeout argument and raise an exception when the future isn't done yet.
- Callbacks registered with add_done_callback() are always called via the event loop's call_soon_threadsafe().

This…
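The first listed difference can be demonstrated directly: concurrent.futures.Future.result() blocks with an optional timeout, while the asyncio variant raises immediately if the future is not yet done. A sketch:

```python
import asyncio
import concurrent.futures

cf = concurrent.futures.Future()
try:
    cf.result(timeout=0.1)       # blocks up to 0.1 s, then raises TimeoutError
except concurrent.futures.TimeoutError:
    cf_outcome = "timed out"

loop = asyncio.new_event_loop()
af = loop.create_future()
try:
    af.result()                  # no timeout parameter: raises at once if pending
except asyncio.InvalidStateError:
    af_outcome = "not done yet"
loop.close()

print(cf_outcome, af_outcome)
```

The blocking version suits threads waiting on each other; the raising version suits an event loop, which must never block.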

How to pass a function with more than one argument to python concurrent.futures.ProcessPoolExecutor.map()?

三世轮回 submitted on 2019-11-30 01:51:58

Question: I would like concurrent.futures.ProcessPoolExecutor.map() to call a function that takes two or more arguments. In the example below, I have resorted to using a lambda function and defining ref as an array of the same size as numberlist, filled with an identical value.

1st question: is there a better way of doing this? In the case where numberlist can be millions to billions of elements in size, ref would have to match numberlist's size, so this approach unnecessarily takes up precious memory,…
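Two standard alternatives avoid the ref array entirely: Executor.map accepts multiple iterables (zipped pairwise), so itertools.repeat can supply the constant lazily, or functools.partial can pin the constant up front. A sketch using a thread pool and a hypothetical two-argument function; both techniques also work with ProcessPoolExecutor, and unlike a lambda, a partial is picklable:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from itertools import repeat

def add(ref, number):       # hypothetical two-argument work function
    return ref + number

numbers = [1, 2, 3, 4]

with ThreadPoolExecutor() as ex:
    # Option 1: map over two iterables; repeat() yields the constant lazily,
    # so no ref list the size of numbers is ever materialized.
    a = list(ex.map(add, repeat(100), numbers))
    # Option 2: pin the constant with partial, then map over one iterable.
    b = list(ex.map(partial(add, 100), numbers))

print(a, a == b)
```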