Basically, I'm looking for something that offers a parallel map using python3 coroutines as the backend instead of threads or processes. I believe there should be less overhead.
You could use greenlets (lightweight threads, basically coroutines) for this, or the somewhat higher-level gevent lib built on top of them:
(from the docs)
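import gevent
from gevent import getcurrent
from gevent.pool import Group

group = Group()

def hello_from(n):
    print('Size of group %s' % len(group))
    print('Hello from Greenlet %s' % id(getcurrent()))

# map() runs hello_from in a greenlet per item and returns once all are done
group.map(hello_from, range(3))  # range, not xrange, on Python 3

def intensive(n):
    # gevent.sleep yields to the other greenlets while this one waits
    gevent.sleep(3 - n)
    return 'task', n

print('Ordered')
ogroup = Group()
# imap yields results in input order, even though task 2 finishes first
for i in ogroup.imap(intensive, range(3)):
    print(i)

print('Unordered')
igroup = Group()
# imap_unordered yields each result as soon as its greenlet finishes
for i in igroup.imap_unordered(intensive, range(3)):
    print(i)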
Yields output:
Size of group 3
Hello from Greenlet 31904464
Size of group 3
Hello from Greenlet 31904944
Size of group 3
Hello from Greenlet 31905904
Ordered
('task', 0)
('task', 1)
('task', 2)
Unordered
('task', 2)
('task', 1)
('task', 0)
The standard constraints of lightweight-vs-proper-multicore-usage apply to greenlets vs threads. That is, they're concurrent but not necessarily parallel: greenlets switch cooperatively, so one that never yields holds up all the others.
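To make the concurrent-but-not-parallel point concrete, here's a rough sketch. It uses asyncio from the standard library as a stand-in for gevent (so it runs without installing anything), on the assumption that the same cooperative-scheduling behaviour is what matters. The names cooperative, blocking and timed are made up for the illustration, and time.sleep is only standing in for CPU-bound work.

```python
import asyncio
import time

async def cooperative(n):
    # Yields to the event loop while waiting, so the other coroutines can run.
    await asyncio.sleep(0.2)
    return n

async def blocking(n):
    # time.sleep stands in for CPU-bound work (an assumption for this demo):
    # it never yields, so every other coroutine is stalled until it returns.
    time.sleep(0.2)
    return n

async def timed(worker):
    # Run three workers "at once" and measure the wall-clock time taken.
    start = time.perf_counter()
    results = await asyncio.gather(*(worker(i) for i in range(3)))
    return results, time.perf_counter() - start

_, overlapped = asyncio.run(timed(cooperative))
_, serialized = asyncio.run(timed(blocking))
print('cooperative: %.2fs, blocking: %.2fs' % (overlapped, serialized))
```

The cooperative run should take roughly as long as one wait, while the blocking run takes roughly the sum of all three. Waiting overlaps; actual work on a single core does not.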
Quick edit for people who see this in the future, since Yaroslav has done a great job of outlining some differences between Python's asyncio and gevent:
Why gevent over async/await? (these are all super subjective but have applied to me in the past)
- async/await is not portable or easily accessible (it's not just missing on 2.X; the keywords only arrived in 3.5)
- async and await have a tendency to spread and infect codebases - when someone else has encapsulated this for you, it's super duper nice in terms of development and readability/maintainability
- In addition to above, I (personally) feel like the high-level interface of gevent is very "pythonic".
- Less rope to hang yourself with. In simple examples the two seem similar, but the more you want to do with async calls, the more chance you have to fuck up something basic and create race conditions, locks, unexpected behaviors. No need to reinvent the noose imho.
- Gevent's performance scales past trivial examples and is used and tested in lots of production environments. If you don't know much about asynchronous programming, it's a good place to start.
Why asyncio and not gevent?
- If you can guarantee a version of Python and don't have access to third-party packages/pip, it gives you out-of-the-box support.
- Similar to above, if you don't want to be tied in to a project that's been slow to adopt Py3k, rolling your own small toolset is a good option.
- If you want to fine tune things, you're in charge!
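For the original question (a map backed by Python 3 coroutines), rolling your own on top of asyncio could look roughly like the sketch below. It assumes Python 3.7+ for asyncio.run. The helpers amap and amap_unordered and the limit parameter are names invented here, not part of asyncio; they loosely mirror gevent's Group.map and Group.imap_unordered from the example above.

```python
import asyncio

async def amap(func, items, limit=10):
    """Apply coroutine function `func` to each item, at most `limit` at a time.

    Results come back in input order (hypothetical helper, not an asyncio API).
    """
    semaphore = asyncio.Semaphore(limit)

    async def bounded(item):
        # The semaphore caps how many calls are in flight at once.
        async with semaphore:
            return await func(item)

    return await asyncio.gather(*(bounded(item) for item in items))

async def amap_unordered(func, items):
    """Collect results in completion order instead of input order."""
    results = []
    for next_done in asyncio.as_completed([func(item) for item in items]):
        results.append(await next_done)
    return results

async def intensive(n):
    # Same shape as the gevent example, with shorter sleeps.
    await asyncio.sleep((3 - n) * 0.05)
    return 'task', n

ordered = asyncio.run(amap(intensive, range(3)))
print(ordered)  # [('task', 0), ('task', 1), ('task', 2)]

unordered = asyncio.run(amap_unordered(intensive, range(3)))
print(unordered)  # typically task 2 first, since it sleeps the least
```

Note that func has to be a coroutine function here, which is exactly the "async spreads through your codebase" trade-off mentioned above. With gevent the mapped function stays an ordinary def.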