Threading in python doesn't happen parallel

前端 未结 2 1328
孤街浪徒
孤街浪徒 2020-12-04 01:28

I\'m doing data scraping calls with an urllib2, yet they each take around 1 seconds to complete. I was trying to test if I could multi-thread the URL-call loop into threadin

2条回答
  •  时光说笑
    2020-12-04 01:42

    To get multiple urls in parallel limiting to 20 connections at a time:

    import urllib2
    from multiprocessing.dummy import Pool
    
    def generate_urls(): # generate some dummy urls
        for i in range(100):
            yield 'http://example.com?param=%d' % i
    
    def get_url(url):
        try: return url, urllib2.urlopen(url).read(), None
        except EnvironmentError as e:
             return url, None, e
    
    pool = Pool(20) # limit number of concurrent connections
    for url, result, error in pool.imap_unordered(get_url, generate_urls()):
        if error is None:
           print result,
    

提交回复
热议问题