I'm doing data-scraping calls with urllib2, and each one takes around 1 second to complete. I was trying to test whether I could multi-thread the URL-call loop using threading.
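For context, a minimal sketch of the serial version of such a loop (the URL list is an assumption; the one-request-at-a-time structure is the point):

    import urllib2

    urls = ['http://example.com?param=%d' % i for i in range(100)]  # assumed dummy URLs

    # each urlopen call blocks for about a second, so 100 URLs take ~100 seconds
    for url in urls:
        data = urllib2.urlopen(url).read()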
To fetch multiple URLs in parallel while limiting yourself to 20 connections at a time:
    import urllib2
    from multiprocessing.dummy import Pool

    def generate_urls():  # generate some dummy urls
        for i in range(100):
            yield 'http://example.com?param=%d' % i

    def get_url(url):
        try:
            return url, urllib2.urlopen(url).read(), None
        except EnvironmentError as e:
            return url, None, e

    pool = Pool(20)  # limit number of concurrent connections
    for url, result, error in pool.imap_unordered(get_url, generate_urls()):
        if error is None:
            print result,
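Note that get_url returns any exception instead of raising it, so the loop above silently drops failures. A small sketch of also reporting them and shutting the pool down cleanly (the error-message wording is just an assumption):

    for url, result, error in pool.imap_unordered(get_url, generate_urls()):
        if error is None:
            print result,
        else:
            print 'failed to fetch %s: %s' % (url, error)  # illustrative error report

    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the worker threads to finish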
Paul Seeb has correctly diagnosed your issue.
You are calling trade.update_items and then passing the result to the threading.Thread constructor. Thus you get serial behavior: your threads don't do any work, and the creation of each one is delayed until the update_items call returns.
The correct form is threading.Thread(target=trade.update_items, args=(1, 100)) for the first line, and similarly for the later ones. This passes the update_items function as the thread entry point and the (1, 100) tuple as its positional arguments.
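A minimal sketch of that pattern, assuming trade.update_items takes a start and end index (the Trade class and the later ranges are stand-ins, not the asker's real code):

    import threading

    class Trade(object):  # stand-in for the asker's trade object
        def update_items(self, start, end):
            print 'updating items %d-%d' % (start, end)

    trade = Trade()

    # each Thread gets the function itself plus its arguments, not the result of calling it
    threads = [
        threading.Thread(target=trade.update_items, args=(1, 100)),
        threading.Thread(target=trade.update_items, args=(101, 200)),
        threading.Thread(target=trade.update_items, args=(201, 300)),
    ]

    for t in threads:
        t.start()   # all three calls now run concurrently
    for t in threads:
        t.join()    # wait for them to finish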