urllib

urllib.error.URLError: <urlopen error unknown url type: 'https>

社会主义新天地 submitted on 2019-11-27 09:31:45
(Python 3.4.2) I've got a weird error when I run 'urllib.request.urlopen(url)' inside of a script. If I run it directly in the Python interpreter, it works fine, but not when I run it inside of a script through a bash shell (Linux). I'm guessing it has something to do with the 'url' string, maybe because I'm creating the string through the 'join' method.

import urllib.request
url = "".join((baseurl, other_string, midurl, query))
response = urllib.request.urlopen(url)

The 'url' string prints perfectly, but when I try to create the 'response' object, I get this output: File "./script.py",
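The "unknown url type: 'https" message (note the stray quote inside the reported type) usually means the assembled URL literally begins with a quote character, for example because one of the joined pieces was read from a file or shell argument with its quotes still attached. A minimal sketch of a defensive fix, using placeholder values for the pieces (the real ones come from elsewhere in the script):

```python
import urllib.request

# Hypothetical pieces; the leading quote on baseurl reproduces the error.
baseurl = '"https://example.com'
midurl = "/search?q="
query = "python"

url = "".join((baseurl, midurl, query))
# Strip whitespace and quote characters so the scheme is 'https',
# not '"https', which urlopen reports as an unknown url type.
url = url.strip().strip('\'"')

response = urllib.request.urlopen(url)
print(response.status)
```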

Python 3 urllib produces TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str

霸气de小男生 submitted on 2019-11-27 09:01:16
I am trying to convert working Python 2.7 code into Python 3 code and I am receiving a type error from the urllib request module. I used the built-in 2to3 Python tool to convert the working urllib and urllib2 Python 2.7 code below:

import urllib2
import urllib

url = "https://www.customdomain.com"
d = dict(parameter1="value1", parameter2="value2")

req = urllib2.Request(url, data=urllib.urlencode(d))
f = urllib2.urlopen(req)
resp = f.read()

The output from the 2to3 tool was the Python 3 code below:

import urllib.request, urllib.error, urllib.parse

url = "https://www.customdomain.com"
d = dict
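In Python 3, urlopen() requires the POST body to be bytes, while urllib.parse.urlencode() returns a str, which is why the 2to3 output still raises the TypeError. A minimal sketch of the working Python 3 version, assuming UTF-8 is acceptable for the form data:

```python
import urllib.parse
import urllib.request

url = "https://www.customdomain.com"
d = dict(parameter1="value1", parameter2="value2")

# urlencode() returns a str; the POST body passed to urlopen() must be bytes.
data = urllib.parse.urlencode(d).encode("utf-8")

req = urllib.request.Request(url, data=data)
with urllib.request.urlopen(req) as f:
    resp = f.read()
```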

download image from url using python urllib but receiving HTTP Error 403: Forbidden

▼魔方 西西 submitted on 2019-11-27 08:57:21
I want to download an image file from a URL using the python module "urllib.request", which works for some websites (e.g. mangastream.com) but does not work for another (mangadoom.co), giving the error "HTTP Error 403: Forbidden". What could be the problem in the latter case and how do I fix it? I am using python3.4 on OSX.

import urllib.request

# does not work
img_url = 'http://mangadoom.co/wp-content/manga/5170/886/005.png'
img_filename = 'my_img.png'
urllib.request.urlretrieve(img_url, img_filename)

At the end of the error message it says: ... HTTPError: HTTP Error 403: Forbidden However, it works for
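A 403 from a site that serves the same image fine in a browser usually means the server rejects urllib's default "Python-urllib" User-Agent. A small sketch of a workaround, assuming the site only checks the User-Agent header:

```python
import urllib.request

img_url = 'http://mangadoom.co/wp-content/manga/5170/886/005.png'
img_filename = 'my_img.png'

# Send a browser-like User-Agent instead of urllib's default,
# which some servers block with 403 Forbidden.
req = urllib.request.Request(img_url, headers={'User-Agent': 'Mozilla/5.0'})
with urllib.request.urlopen(req) as response, open(img_filename, 'wb') as out:
    out.write(response.read())
```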

How to save “complete webpage” not just basic html using Python

大兔子大兔子 submitted on 2019-11-27 08:19:52
I am using the following code to save a webpage using Python:

import urllib
import sys
from bs4 import BeautifulSoup

url = 'http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html'
f = urllib.urlretrieve(url, 'test.html')

Problem: this code saves the html as basic html without javascript, images etc. I want to save the webpage as complete (like the option we have in a browser). Update: I am now using the following code to save all the js/images/css files of the webpage so that it can be saved as a complete webpage, but my output html still gets saved like basic html:

import pycurl
import StringIO

c = pycurl
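urlretrieve only fetches the single HTML document; saving a "complete" page means also downloading every asset the page references and rewriting those references to point at the local copies. A rough Python 3 sketch of that idea, assuming static assets are enough (content generated by JavaScript at load time is out of scope) and using an illustrative output directory name:

```python
import os
import urllib.parse
import urllib.request

from bs4 import BeautifulSoup

def save_complete_page(url, out_dir="saved_page"):
    os.makedirs(out_dir, exist_ok=True)
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")

    # Download assets referenced by <img>/<script> (src) and <link> (href) tags,
    # then point the tags at the local copies.
    for tag, attr in (("img", "src"), ("script", "src"), ("link", "href")):
        for node in soup.find_all(tag):
            ref = node.get(attr)
            if not ref:
                continue
            asset_url = urllib.parse.urljoin(url, ref)
            name = os.path.basename(urllib.parse.urlparse(asset_url).path) or "asset"
            try:
                urllib.request.urlretrieve(asset_url, os.path.join(out_dir, name))
                node[attr] = name
            except Exception:
                continue  # skip assets that cannot be fetched

    with open(os.path.join(out_dir, "index.html"), "w", encoding="utf-8") as f:
        f.write(str(soup))

save_complete_page("http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html")
```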

BeautifulSoup, where are you putting my HTML?

风格不统一 submitted on 2019-11-27 08:16:24
Question: I'm using BS4 with python2.7. Here's the start of my code (thanks root):

from bs4 import BeautifulSoup
import urllib2

f = urllib2.urlopen('http://yify-torrents.com/browse-movie')
html = f.read()
soup = BeautifulSoup(html)

When I print html, its contents are the same as the source of the page viewed in Chrome. When I print soup, however, it cuts out the entire body and leaves me with this (the contents of the head tag): <!DOCTYPE html> <html> <head> <title>Browse Movie - YIFY Torrents</title>
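When BeautifulSoup silently drops most of a document like this, the usual cause is that the parser it picked chokes on the page's malformed HTML; asking for a more lenient parser such as html5lib often recovers the full tree. A small sketch of that fix, assuming html5lib is installed (pip install html5lib):

```python
from bs4 import BeautifulSoup
import urllib2  # Python 2.7, as in the question

f = urllib2.urlopen('http://yify-torrents.com/browse-movie')
html = f.read()

# Name the parser explicitly; html5lib tolerates broken markup that
# can make other parsers truncate the body.
soup = BeautifulSoup(html, 'html5lib')
print(soup.body is not None)  # the body should now be present
```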

RPA step by step: viewing file download progress with urllib.request.urlretrieve()

孤者浪人 submitted on 2019-11-27 07:45:46
艺赛旗 RPA 9.0 new release, free download: http://www.i-search.com.cn/index.html?from=line1

Function parameters:

urllib.request.urlretrieve(url, filename=None, reporthook=None, data=None)

url: the download link.
filename: the local path to save the file to (if this parameter is not given, urllib creates a temporary file to hold the data).
reporthook: a callback function triggered when the connection to the server is established and after each data block has been transferred; we can use this callback to display the current download progress.
data: data to POST to the server.

The method returns a two-element tuple (filename, headers), where filename is the path the file was saved to locally and headers is the server's response headers.

Example:

import urllib.request
def download():
    # download link
    down_url = r" https://av.sc.com/hk/zh/content/docs/hk-c-nddr-ff304m-ag-20190809.pdf "
    # save path
    down_path = r"C:\Users\Administrator\Desktop\test\aa.pdf"
    # quote the link to avoid errors caused by Chinese characters or spaces in the URL
    down_url
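The example above is cut off just as it starts quoting the URL, but the point of the article is the reporthook callback for progress display. A self-contained sketch of that pattern, using an illustrative local save path in place of the Windows path above:

```python
import urllib.parse
import urllib.request

def show_progress(block_num, block_size, total_size):
    """reporthook: called once on connect and again after each block is transferred."""
    if total_size > 0:
        percent = min(block_num * block_size * 100.0 / total_size, 100.0)
        print("\rDownloading: %.1f%%" % percent, end="")

def download():
    down_url = "https://av.sc.com/hk/zh/content/docs/hk-c-nddr-ff304m-ag-20190809.pdf"
    down_path = "hk-nddr-form.pdf"  # illustrative save path
    # Quote the URL so spaces or non-ASCII characters do not raise errors,
    # while leaving the scheme and separators intact.
    down_url = urllib.parse.quote(down_url, safe=":/?&=")
    filename, headers = urllib.request.urlretrieve(down_url, down_path, show_progress)
    print("\nSaved to", filename)

download()
```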

Python 3.6 urllib TypeError: can't concat bytes to str

老子叫甜甜 submitted on 2019-11-27 07:39:08
Question: I'm trying to pull some JSON data from an API using urllib in Python 3.6. It requires header information to be passed for authorization. Here is my code:

import urllib.request, json

headers = {"authorization" : "Bearer {authorization_token}"}
with urllib.request.urlopen("{api_url}", data=headers) as url:
    data = json.loads(url.read().decode())
    print(data)

And the error message I get: Traceback (most recent call last): File "getter.py", line 5, in <module> with urllib.request.urlopen("{url}",
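The traceback comes from passing the headers dict as urlopen's data argument: data= is the POST body and must be bytes, while request headers belong on a Request object. A small sketch of the fix, keeping the question's placeholder URL and token:

```python
import json
import urllib.request

# Attach the authorization header to a Request object instead of
# passing the dict as data= (the POST body, which must be bytes).
req = urllib.request.Request(
    "{api_url}",
    headers={"authorization": "Bearer {authorization_token}"},
)
with urllib.request.urlopen(req) as url:
    data = json.loads(url.read().decode())
print(data)
```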

urllib “module object is not callable”

我的未来我决定 submitted on 2019-11-27 07:02:16
Question: This is my third python project, and I've received an error message: 'module object' is not callable. I know that this means I'm referencing a variable or function incorrectly, but trial and error hasn't helped me solve this.

import urllib

def get_url(url):
    '''get_url accepts a URL string and returns the server response code, response headers, and contents of the file'''
    req_headers = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like
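With urllib, "'module' object is not callable" almost always means the module (or the urllib.request submodule) is being called as if it were a function, e.g. urllib.request(url, headers=...) instead of urllib.request.Request(...). A hedged sketch of what the rest of the function could look like in Python 3; the docstring is from the question, while the shortened User-Agent and the body are filled in as assumptions about the intent:

```python
import urllib.request

def get_url(url):
    '''get_url accepts a URL string and returns the server response code,
    response headers, and contents of the file'''
    req_headers = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US)'
    }
    # Build a Request object; calling urllib.request(...) directly
    # raises "'module' object is not callable".
    req = urllib.request.Request(url, headers=req_headers)
    with urllib.request.urlopen(req) as resp:
        return resp.status, dict(resp.getheaders()), resp.read()
```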

Python3 error: initial_value must be str or None

孤街浪徒 submitted on 2019-11-27 06:44:13
While porting code from python2 to 3, I get this error when reading from a URL: TypeError: initial_value must be str or None, not bytes.

import urllib
import json
import gzip
from urllib.parse import urlencode
from urllib.request import Request

service_url = 'https://babelfy.io/v1/disambiguate'
text = 'BabelNet is both a multilingual encyclopedic dictionary and a semantic network'
lang = 'EN'
Key = 'KEY'
params = { 'text' : text, 'key' : Key, 'lang' :'EN' }
url = service_url + '?' + urllib.urlencode(params)
request = Request(url)
request.add_header('Accept-encoding', 'gzip')
response = urllib
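The "initial_value must be str or None" error typically shows up when the Python 2 pattern StringIO.StringIO(response.read()) is fed the gzip-compressed bytes; in Python 3, binary data has to go into io.BytesIO. A sketch of the Python 3 version of this request, assuming the truncated original went on to gunzip the body and parse it as JSON (the key is a placeholder, as in the question):

```python
import gzip
import io
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

service_url = 'https://babelfy.io/v1/disambiguate'
params = {
    'text': 'BabelNet is both a multilingual encyclopedic dictionary and a semantic network',
    'key': 'KEY',
    'lang': 'EN',
}

request = Request(service_url + '?' + urlencode(params))
request.add_header('Accept-encoding', 'gzip')
response = urlopen(request)

# The response body is bytes, so wrap it in BytesIO (not StringIO) before gunzipping.
buf = io.BytesIO(response.read())
data = json.loads(gzip.GzipFile(fileobj=buf).read().decode('utf-8'))
print(data)
```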

Download Returned Zip file from URL

余生长醉 submitted on 2019-11-27 06:22:52
If I have a URL that, when submitted in a web browser, pops up a dialog box to save a zip file, how would I go about catching and downloading this zip file in Python?

senderle: Use urllib2.urlopen. The return value is a file-like object that you can read(), pass to zipfile and so on.

yoavram: As far as I can tell, the proper way to do this is:

import requests, zipfile, StringIO
r = requests.get(zip_file_url, stream=True)
z = zipfile.ZipFile(StringIO.StringIO(r.content))
z.extractall()

Of course you'd want to check that the GET was successful with r.ok. For python 3+, sub the StringIO module
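The answer is cut off just where it tells Python 3 users to swap StringIO out; in Python 3 the downloaded zip content is bytes and goes into io.BytesIO. A sketch of the Python 3 variant, with a placeholder zip URL:

```python
import io
import zipfile

import requests

zip_file_url = "https://example.com/archive.zip"  # placeholder URL

r = requests.get(zip_file_url)
r.raise_for_status()  # stop on a failed GET (the answer suggests checking r.ok)

# The downloaded content is bytes, so use io.BytesIO rather than StringIO.
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall()
```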