urllib

urllib.error.URLError: <urlopen error unknown url type: 'https>

社会主义新天地 submitted on 2019-11-27 09:31:45
(Python 3.4.2) I've got a weird error when I run 'urllib.request.urlopen(url)' inside of a script. If I run it directly in the Python interpreter, it works fine, but not when I run it inside of a script through a bash shell (Linux). I'm guessing it has something to do with the 'url' string, maybe because I'm creating the string through the 'join' method.

import urllib.request
url = "".join((baseurl, other_string, midurl, query))
response = urllib.request.urlopen(url)

The 'url' string prints perfectly, but when I try to create the 'response' object, I get this output: File "./script.py",
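The "unknown url type: 'https" message (note the stray quote inside the reported type) usually means the assembled URL literally begins with a quote character, for example because one of the joined pieces was read from a file or shell argument with its quotes still attached. A minimal sketch of a defensive fix, using placeholder values for the pieces (the real ones come from elsewhere in the script):

```python
import urllib.request

# Hypothetical pieces; the leading quote on baseurl reproduces the error.
baseurl = '"https://example.com'
midurl = "/search?q="
query = "python"

url = "".join((baseurl, midurl, query))
# Strip whitespace and quote characters so the scheme is 'https',
# not '"https', which urlopen reports as an unknown url type.
url = url.strip().strip('\'"')

response = urllib.request.urlopen(url)
print(response.status)
```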

Python 3 urllib produces TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str

霸气de小男生 submitted on 2019-11-27 09:01:16
I am trying to convert working Python 2.7 code into Python 3 code and I am receiving a type error from the urllib request module. I used the built-in 2to3 Python tool to convert the working urllib and urllib2 Python 2.7 code below:

import urllib2
import urllib

url = "https://www.customdomain.com"
d = dict(parameter1="value1", parameter2="value2")

req = urllib2.Request(url, data=urllib.urlencode(d))
f = urllib2.urlopen(req)
resp = f.read()

The output from the 2to3 tool was the Python 3 code below:

import urllib.request, urllib.error, urllib.parse

url = "https://www.customdomain.com"
d = dict
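In Python 3, urlopen() requires the POST body to be bytes, while urllib.parse.urlencode() returns a str, which is why the 2to3 output still raises the TypeError. A minimal sketch of the working Python 3 version, assuming UTF-8 is acceptable for the form data:

```python
import urllib.parse
import urllib.request

url = "https://www.customdomain.com"
d = dict(parameter1="value1", parameter2="value2")

# urlencode() returns a str; the POST body passed to urlopen() must be bytes.
data = urllib.parse.urlencode(d).encode("utf-8")

req = urllib.request.Request(url, data=data)
with urllib.request.urlopen(req) as f:
    resp = f.read()
```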

download image from url using python urllib but receiving HTTP Error 403: Forbidden

▼魔方 西西 submitted on 2019-11-27 08:57:21
I want to download an image file from a URL using the python module "urllib.request", which works for some websites (e.g. mangastream.com) but does not work for another (mangadoom.co), giving the error "HTTP Error 403: Forbidden". What could be the problem in the latter case and how do I fix it? I am using python3.4 on OSX.

import urllib.request

# does not work
img_url = 'http://mangadoom.co/wp-content/manga/5170/886/005.png'
img_filename = 'my_img.png'
urllib.request.urlretrieve(img_url, img_filename)

At the end of the error message it says: ... HTTPError: HTTP Error 403: Forbidden However, it works for
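A 403 from a site that serves the same image fine in a browser usually means the server rejects urllib's default "Python-urllib" User-Agent. A small sketch of a workaround, assuming the site only checks the User-Agent header:

```python
import urllib.request

img_url = 'http://mangadoom.co/wp-content/manga/5170/886/005.png'
img_filename = 'my_img.png'

# Send a browser-like User-Agent instead of urllib's default,
# which some servers block with 403 Forbidden.
req = urllib.request.Request(img_url, headers={'User-Agent': 'Mozilla/5.0'})
with urllib.request.urlopen(req) as response, open(img_filename, 'wb') as out:
    out.write(response.read())
```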

How to save “complete webpage” not just basic html using Python

大兔子大兔子 submitted on 2019-11-27 08:19:52
I am using the following code to save a webpage using Python:

import urllib
import sys
from bs4 import BeautifulSoup

url = 'http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html'
f = urllib.urlretrieve(url, 'test.html')

Problem: this code saves the html as basic html without javascript, images etc. I want to save the webpage as complete (like the option we have in a browser). Update: I am now using the following code to save all the js/images/css files of the webpage so that it can be saved as a complete webpage, but my output html still gets saved like basic html:

import pycurl
import StringIO

c = pycurl
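urlretrieve only fetches the single HTML document; saving a "complete" page means also downloading every asset the page references and rewriting those references to point at the local copies. A rough Python 3 sketch of that idea, assuming static assets are enough (content generated by JavaScript at load time is out of scope) and using an illustrative output directory name:

```python
import os
import urllib.parse
import urllib.request

from bs4 import BeautifulSoup

def save_complete_page(url, out_dir="saved_page"):
    os.makedirs(out_dir, exist_ok=True)
    html = urllib.request.urlopen(url).read()
    soup = BeautifulSoup(html, "html.parser")

    # Download assets referenced by <img>/<script> (src) and <link> (href) tags,
    # then point the tags at the local copies.
    for tag, attr in (("img", "src"), ("script", "src"), ("link", "href")):
        for node in soup.find_all(tag):
            ref = node.get(attr)
            if not ref:
                continue
            asset_url = urllib.parse.urljoin(url, ref)
            name = os.path.basename(urllib.parse.urlparse(asset_url).path) or "asset"
            try:
                urllib.request.urlretrieve(asset_url, os.path.join(out_dir, name))
                node[attr] = name
            except Exception:
                continue  # skip assets that cannot be fetched

    with open(os.path.join(out_dir, "index.html"), "w", encoding="utf-8") as f:
        f.write(str(soup))

save_complete_page("http://www.vodafone.de/privat/tarife/red-smartphone-tarife.html")
```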

BeautifulSoup, where are you putting my HTML?

风格不统一 submitted on 2019-11-27 08:16:24
Question: I'm using BS4 with python2.7. Here's the start of my code (thanks root):

from bs4 import BeautifulSoup
import urllib2

f = urllib2.urlopen('http://yify-torrents.com/browse-movie')
html = f.read()
soup = BeautifulSoup(html)

When I print html, its contents are the same as the source of the page viewed in Chrome. When I print soup, however, it cuts out the entire body and leaves me with this (the contents of the head tag): <!DOCTYPE html> <html> <head> <title>Browse Movie - YIFY Torrents</title>
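When BeautifulSoup silently drops most of a document like this, the usual cause is that the parser it picked chokes on the page's malformed HTML; asking for a more lenient parser such as html5lib often recovers the full tree. A small sketch of that fix, assuming html5lib is installed (pip install html5lib):

```python
from bs4 import BeautifulSoup
import urllib2  # Python 2.7, as in the question

f = urllib2.urlopen('http://yify-torrents.com/browse-movie')
html = f.read()

# Name the parser explicitly; html5lib tolerates broken markup that
# can make other parsers truncate the body.
soup = BeautifulSoup(html, 'html5lib')
print(soup.body is not None)  # the body should now be present
```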

RPA step by step: viewing file download progress with urllib.request.urlretrieve()

孤者浪人 submitted on 2019-11-27 07:45:46
艺赛旗 RPA 9.0 new release, free download: http://www.i-search.com.cn/index.html?from=line1

Function parameters:

urllib.request.urlretrieve(url, filename=None, reporthook=None, data=None)

url: the download link.
filename: the local path to save the file to (if this parameter is not given, urllib creates a temporary file to hold the data).
reporthook: a callback function triggered when the connection to the server is established and after each data block has been transferred; we can use this callback to display the current download progress.
data: data to POST to the server.

The method returns a two-element tuple (filename, headers), where filename is the path the file was saved to locally and headers is the server's response headers.

Example:

import urllib.request
def download():
    # download link
    down_url = r" https://av.sc.com/hk/zh/content/docs/hk-c-nddr-ff304m-ag-20190809.pdf "
    # save path
    down_path = r"C:\Users\Administrator\Desktop\test\aa.pdf"
    # quote the link to avoid errors caused by Chinese characters or spaces in the URL
    down_url
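The example above is cut off just as it starts quoting the URL, but the point of the article is the reporthook callback for progress display. A self-contained sketch of that pattern, using an illustrative local save path in place of the Windows path above:

```python
import urllib.parse
import urllib.request

def show_progress(block_num, block_size, total_size):
    """reporthook: called once on connect and again after each block is transferred."""
    if total_size > 0:
        percent = min(block_num * block_size * 100.0 / total_size, 100.0)
        print("\rDownloading: %.1f%%" % percent, end="")

def download():
    down_url = "https://av.sc.com/hk/zh/content/docs/hk-c-nddr-ff304m-ag-20190809.pdf"
    down_path = "hk-nddr-form.pdf"  # illustrative save path
    # Quote the URL so spaces or non-ASCII characters do not raise errors,
    # while leaving the scheme and separators intact.
    down_url = urllib.parse.quote(down_url, safe=":/?&=")
    filename, headers = urllib.request.urlretrieve(down_url, down_path, show_progress)
    print("\nSaved to", filename)

download()
```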

Python 3.6 urllib TypeError: can't concat bytes to str

老子叫甜甜 submitted on 2019-11-27 07:39:08
Question: I'm trying to pull some JSON data from an API using urllib in Python 3.6. It requires header information to be passed for authorization. Here is my code:

import urllib.request, json

headers = {"authorization" : "Bearer {authorization_token}"}
with urllib.request.urlopen("{api_url}", data=headers) as url:
    data = json.loads(url.read().decode())
    print(data)

And the error message I get: Traceback (most recent call last): File "getter.py", line 5, in <module> with urllib.request.urlopen("{url}",
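The traceback comes from passing the headers dict as urlopen's data argument: data= is the POST body and must be bytes, while request headers belong on a Request object. A small sketch of the fix, keeping the question's placeholder URL and token:

```python
import json
import urllib.request

# Attach the authorization header to a Request object instead of
# passing the dict as data= (the POST body, which must be bytes).
req = urllib.request.Request(
    "{api_url}",
    headers={"authorization": "Bearer {authorization_token}"},
)
with urllib.request.urlopen(req) as url:
    data = json.loads(url.read().decode())
print(data)
```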

urllib “module object is not callable”

我的未来我决定 submitted on 2019-11-27 07:02:16
Question: This is my third python project, and I've received an error message: 'module object' is not callable. I know that this means I'm referencing a variable or function incorrectly, but trial and error hasn't helped me solve this.

import urllib

def get_url(url):
    '''get_url accepts a URL string and returns the server response code, response headers, and contents of the file'''
    req_headers = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/525.13 (KHTML, like
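With urllib, "'module' object is not callable" almost always means the module (or the urllib.request submodule) is being called as if it were a function, e.g. urllib.request(url, headers=...) instead of urllib.request.Request(...). A hedged sketch of what the rest of the function could look like in Python 3; the docstring is from the question, while the shortened User-Agent and the body are filled in as assumptions about the intent:

```python
import urllib.request

def get_url(url):
    '''get_url accepts a URL string and returns the server response code,
    response headers, and contents of the file'''
    req_headers = {
        'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US)'
    }
    # Build a Request object; calling urllib.request(...) directly
    # raises "'module' object is not callable".
    req = urllib.request.Request(url, headers=req_headers)
    with urllib.request.urlopen(req) as resp:
        return resp.status, dict(resp.getheaders()), resp.read()
```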

Python3 error: initial_value must be str or None

孤街浪徒 submitted on 2019-11-27 06:44:13
While porting code from python2 to 3, I get this error when reading from a URL: TypeError: initial_value must be str or None, not bytes.

import urllib
import json
import gzip
from urllib.parse import urlencode
from urllib.request import Request

service_url = 'https://babelfy.io/v1/disambiguate'
text = 'BabelNet is both a multilingual encyclopedic dictionary and a semantic network'
lang = 'EN'
Key = 'KEY'
params = { 'text' : text, 'key' : Key, 'lang' :'EN' }
url = service_url + '?' + urllib.urlencode(params)
request = Request(url)
request.add_header('Accept-encoding', 'gzip')
response = urllib
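The "initial_value must be str or None" error typically shows up when the Python 2 pattern StringIO.StringIO(response.read()) is fed the gzip-compressed bytes; in Python 3, binary data has to go into io.BytesIO. A sketch of the Python 3 version of this request, assuming the truncated original went on to gunzip the body and parse it as JSON (the key is a placeholder, as in the question):

```python
import gzip
import io
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

service_url = 'https://babelfy.io/v1/disambiguate'
params = {
    'text': 'BabelNet is both a multilingual encyclopedic dictionary and a semantic network',
    'key': 'KEY',
    'lang': 'EN',
}

request = Request(service_url + '?' + urlencode(params))
request.add_header('Accept-encoding', 'gzip')
response = urlopen(request)

# The response body is bytes, so wrap it in BytesIO (not StringIO) before gunzipping.
buf = io.BytesIO(response.read())
data = json.loads(gzip.GzipFile(fileobj=buf).read().decode('utf-8'))
print(data)
```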

Download Returned Zip file from URL

余生长醉 submitted on 2019-11-27 06:22:52
If I have a URL that, when submitted in a web browser, pops up a dialog box to save a zip file, how would I go about catching and downloading this zip file in Python?

senderle: Use urllib2.urlopen. The return value is a file-like object that you can read(), pass to zipfile and so on.

yoavram: As far as I can tell, the proper way to do this is:

import requests, zipfile, StringIO
r = requests.get(zip_file_url, stream=True)
z = zipfile.ZipFile(StringIO.StringIO(r.content))
z.extractall()

Of course you'd want to check that the GET was successful with r.ok. For python 3+, sub the StringIO module
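The answer is cut off just where it tells Python 3 users to swap StringIO out; in Python 3 the downloaded zip content is bytes and goes into io.BytesIO. A sketch of the Python 3 variant, with a placeholder zip URL:

```python
import io
import zipfile

import requests

zip_file_url = "https://example.com/archive.zip"  # placeholder URL

r = requests.get(zip_file_url)
r.raise_for_status()  # stop on a failed GET (the answer suggests checking r.ok)

# The downloaded content is bytes, so use io.BytesIO rather than StringIO.
z = zipfile.ZipFile(io.BytesIO(r.content))
z.extractall()
```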