multiprocessing.pool.MaybeEncodingError: Error sending result: Reason: 'TypeError("cannot serialize '_io.BufferedReader' object",)'

孤者浪人 submitted on 2019-12-04 15:01:56

A couple of pieces of advice first:

  1. Always check how well a project is maintained. The wget package apparently is not.
  2. Check which libraries a package uses, in case something like this happens.

Now, to the issue.

wget apparently uses urllib.request to make its requests. After some testing, I concluded that it doesn't handle all HTTP status codes. More specifically, it breaks when the HTTP status is, for example, 304. This is why you should use a library with a higher-level interface. Even the official urllib.request documentation says so:

The Requests package is recommended for a higher-level HTTP client interface.
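One plausible reading of the error in the title is that a worker raised an exception that still held an open response stream, and the pool could not pickle it to send it back to the parent process. The sketch below only illustrates that mechanism. ErrorWithStream and try_pickle are hypothetical names, not classes or functions from wget or urllib, and the exact wording of the TypeError differs between Python versions.

```python
import io
import pickle


class ErrorWithStream(Exception):
    """Hypothetical stand-in for an error that keeps a binary stream attached."""

    def __init__(self, message, stream):
        super().__init__(message)
        self.stream = stream  # stored in __dict__, so pickle tries to serialize it


def try_pickle(obj):
    """Return None if obj can be pickled, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
        return None
    except TypeError as exc:
        return str(exc)


# An in-memory BufferedReader stands in for an open HTTP response body.
stream = io.BufferedReader(io.BytesIO(b"response body"))
error_text = try_pickle(ErrorWithStream("HTTP 304", stream))
print(error_text)
```

A plain exception without such an attribute pickles fine, which would explain why only some responses trigger the failure.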

So, without further ado, here is the working snippet.

Just change the path passed to open() to wherever you want the files saved.

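from multiprocessing import Pool

import shutil
import requests


def f(args):
    # args is an (index, url) pair produced by enumerate(urls).
    print(args)
    req = requests.get(args[1], stream=True)
    # The index is used as the file name; a separate name for the file
    # object avoids shadowing the function f.
    with open(str(args[0]), 'wb') as out_file:
        shutil.copyfileobj(req.raw, out_file)

if __name__ == "__main__":
    # The with block shuts the pool down once map has returned.
    with Pool(2) as a:
        a.map(f, enumerate(urls))  # urls is a list of urls.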

The shutil library is used for file manipulation. Here it streams the response data into a file object.
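If you want a failed download to be reported instead of breaking the pool, one option is to have the worker return only plain values. This is a minimal sketch under my own assumptions: run_safely and failing_download are hypothetical helpers, and the URL is a placeholder. No network access is involved; the failing function just simulates an error that carries an open stream.

```python
import io
import pickle


def run_safely(func, item):
    """Call func(item) and return a tuple of plain, picklable values."""
    try:
        return ("ok", item, func(item))
    except Exception as exc:
        # Report the failure as text instead of passing the exception object on.
        return ("error", item, f"{type(exc).__name__}: {exc}")


def failing_download(item):
    # Hypothetical worker body: raises an error that holds a binary stream.
    exc = RuntimeError("HTTP 304")
    exc.stream = io.BufferedReader(io.BytesIO(b""))
    raise exc


outcome = run_safely(failing_download, (0, "http://example.invalid/file"))
restored = pickle.loads(pickle.dumps(outcome))  # round-trips without error
print(restored)
```

With a pool, the wrapper could be bound to the download function with functools.partial before being handed to map, since a partial of module-level functions can itself be pickled. The download function should still return something picklable (or nothing) on success.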
