urllib2

Python urllib2 resume download doesn't work when network reconnects

∥☆過路亽.° submitted on 2019-12-03 22:52:10
Question: I'm using urllib2 to make a resuming downloader, roughly based on this method. I can end the program and restart it, and it resumes downloading where it left off; the file ends up the same size as if it had been downloaded all at once. However, when I tested it by disabling and re-enabling the network, it did not download correctly. The file ends up larger than it should be, and it doesn't work correctly. Is there something I missed, or could this be a urllib2
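A possible cause of an oversized file, offered here only as an assumption because the excerpt is cut off before any answer, is appending a response that did not honour the Range header. The sketch below uses Python 3, where urllib.request replaces urllib2. The helper names resume_request and open_mode are hypothetical.

```python
import os
import urllib.request


def resume_request(url, path):
    """Build a request that asks only for the bytes missing from `path`."""
    offset = os.path.getsize(path) if os.path.exists(path) else 0
    req = urllib.request.Request(url)
    if offset:
        # Ask the server to start at the first byte we do not have yet.
        req.add_header("Range", "bytes=%d-" % offset)
    return req, offset


def open_mode(status, offset):
    """Append only when the server honoured the Range header (HTTP 206)."""
    if offset and status == 206:
        return "ab"
    # A 200 reply carries the whole file again, so start from scratch.
    return "wb"
```

After urllib.request.urlopen(req), the response's status attribute can be passed to open_mode to choose how to open the local file. Recomputing the offset from the file size before every retry keeps the request in step with what is actually on disk.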

Mechanize form submission causes 'Assertion Error' in response when .read() is attempted

故事扮演 submitted on 2019-12-03 22:42:37
I am writing a web-crawling program in Python and am unable to log in using mechanize. The form on the site looks like: <form method="post" action="PATLogon"> <h2 align="center"><img src="/myaladin/images/aladin_logo_rd.gif"></h2> <!-- ALADIN Request parameters --> <input type=hidden name=req value="db"> <input type=hidden name=key value="PROXYAUTH"> <input type=hidden name=url value="http://eebo.chadwyck.com/search"> <input type=hidden name=lib value="8"> <table> <tr><td><b>Last Name:</b></td> <td><input name=LN size=20 maxlength=26></td> <tr><td><b>University ID or Library Barcode:</b></td>

How do I gracefully interrupt urllib2 downloads?

ぐ巨炮叔叔 submitted on 2019-12-03 22:25:11
I am using urllib2's build_opener() to create an OpenerDirector. I am using the OpenerDirector to fetch a slow page, so it has a large timeout. So far, so good. However, in another thread, I have been told to abort the download; let's say the user has chosen to exit the program in the GUI. Is there a way to signal that a urllib2 download should quit? Oddthinking: There is no clean answer. There are several ugly ones. Initially, I was putting rejected ideas in the question. As it has become clear that there are no right answers, I decided to post the various sub-optimal alternatives as a
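One sub-optimal alternative can be sketched as follows: read the response in small chunks and check a flag between reads. This is an assumption, since the excerpt cuts off before listing its own alternatives. The name read_until_cancelled is hypothetical, and `response` can be any object with a read(n) method, such as the result of an opener's open() call.

```python
import threading


def read_until_cancelled(response, cancel, chunk_size=8192):
    """Read `response` in chunks; stop early once `cancel` is set.

    Returns (data, finished), where finished is True only if the
    whole body was read.
    """
    chunks = []
    finished = False
    while not cancel.is_set():
        chunk = response.read(chunk_size)
        if not chunk:
            finished = True  # empty read: end of the body
            break
        chunks.append(chunk)
    return b"".join(chunks), finished
```

The GUI thread would call cancel.set() on a shared threading.Event. A read that is already blocked still waits for data or for the socket timeout, so this only bounds the delay; it does not remove it.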

urllib2 is throwing an error for a URL, while it opens properly in a browser

三世轮回 submitted on 2019-12-03 21:52:57
I am trying to open a URL through Python like this:
    import urllib2
    f = urllib2.urlopen('http://www.futurebazaar.com/Search/laptop')
It throws the following error:
    File "C:\Python26\lib\urllib2.py", line 1134, in do_open
      r = h.getresponse()
    File "C:\Python26\lib\httplib.py", line 986, in getresponse
      response.begin()
    File "C:\Python26\lib\httplib.py", line 391, in begin
      version, status, reason = self._read_status()
    File "C:\Python26\lib\httplib.py", line 355, in _read_status
      raise BadStatusLine(line)
    httplib.BadStatusLine
But this URL opens fine in a browser. The website is broken. If the optional

python urllib2 utf-8 encoding

巧了我就是萌 submitted on 2019-12-03 21:31:30
Okay, I have # -*- coding: utf-8 -*- in my Python file. The snippet:
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0')]
    opener.addheaders = [('Accept-Charset', 'utf-8')]
    f = opener.open(url)
    doc = f.read().decode('utf-8')
The server response (via f.info()) is: Content-Type: text/html; charset=UTF-8, but I get the error: UnicodeDecodeError: 'utf8' codec can't decode byte[...]: invalid continuation byte. What's wrong here? Raymond Hettinger: Try decoding the data using 'latin-1' to see what it looks like. What you're seeing indicates a UTF-8 decode error (see
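The latin-1 suggestion can be sketched like this in Python 3. The helper name inspect_bad_utf8 and the 20-byte context window are assumptions made for illustration.

```python
def inspect_bad_utf8(raw, context=20):
    """Return (text, error).

    On success, text is the full UTF-8 decoding and error is None.
    On failure, text is the region around the bad byte decoded as
    latin-1, and error is the UnicodeDecodeError that was raised.
    """
    try:
        return raw.decode("utf-8"), None
    except UnicodeDecodeError as err:
        # latin-1 maps every byte to a code point, so this cannot fail;
        # the text may look wrong, but the offending bytes become visible.
        window = raw[max(err.start - context, 0):err.end + context]
        return window.decode("latin-1"), err
```

The error object's start and end attributes give the byte offsets of the sequence the UTF-8 decoder rejected, which helps in judging whether the server mislabelled the page or sent something other than plain text.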

Urllib2- fetch and show any language page, encoding problem

╄→尐↘猪︶ㄣ submitted on 2019-12-03 20:56:55
I'm using Python on Google App Engine to simply fetch HTML pages and show them. My aim is to be able to fetch any page in any language. Now I have a problem with encoding: a simple result = urllib2.urlopen(url).read() leaves artifacts in place of special letters, and urllib2.urlopen(url).read().decode('utf8') throws the error: 'utf8' codec can't decode bytes in position 3544-3546: invalid data. So how do I solve it? Is there any library that would check what encoding a page uses and convert it so that it is readable? Jask: rajax suggested, at "How to download any(!) webpage with correct charset in python?", to use chardet
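Before reaching for a detector such as chardet (a third-party library, not used here), it may be worth trying the charset the server declares. Below is a minimal Python 3 sketch using only the standard library. The helper name decode_body and the UTF-8 fallback are assumptions.

```python
from email.message import Message


def decode_body(raw, content_type, fallback="utf-8"):
    """Decode `raw` using the charset named in a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    charset = msg.get_content_charset() or fallback
    try:
        return raw.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown or wrong charset label: substitute U+FFFD rather than fail.
        return raw.decode(fallback, errors="replace")
```

For example, decode_body(body, "text/html; charset=ISO-8859-1") decodes the bytes as Latin-1. Pages that declare their encoding only in a meta tag, or declare it wrongly, would still need a detector or manual inspection.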

HandShake Failure in python(_ssl.c:590)

久未见 submitted on 2019-12-03 20:41:31
Question: When I execute the lines below, req = urllib2.Request(requestwithtoken) self.response = urllib2.urlopen(req,self.request).read() I get the following exception: SSLError: [SSL: SSLV3_ALERT_HANDSHAKE_FAILURE] sslv3 alert handshake failure (_ssl.c:590) The thing is, I am able to get the token by pinging the service using curl. During the process of retrieving the token, all the certificates were verified. However, using the generated token, I am not able to connect to the service. I

Python Splinter (SeleniumHQ) how to take a screenshot of many webpages? [Connection refused]

久未见 submitted on 2019-12-03 20:22:33
I want to take a screenshot of many webpages, I wrote this: from splinter.browser import Browser import urllib2 from urllib2 import URLError urls = ['http://ubuntu.com/', 'http://xubuntu.org/'] try : browser = Browser('firefox') for i in range(0, len(urls)) : browser.visit(urls[i]) if browser.status_code.is_success() : browser.driver.save_screenshot('your_screenshot' + str(i) + '.png') browser.quit() except SystemError : print('install firefox!') except urllib2.URLError, e: print(e) print('theres no such website') except Exception, e : print(e) browser.quit() and I got this error: <urlopen

Sublime ctags: jumping to function definitions

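流过昼夜 submitted on 2019-12-03 19:17:33
(On Windows) I looked at many ways of installing ctags. After installing ctags in Sublime Text 2 or 3, pressing Ctrl+T twice on a function did not jump to it and reported the error: can't find any relevent tags file
------------------------ Solution ------------------------
The cause is that no .tags index file had been generated. There are two ways to generate it:
1. In Sublime: use the menu File->Open Folder to open the target folder to be analyzed, click any source file in the left sidebar to open it, and press the Ctrl+T, Ctrl+R key combination (that is, the top menu Find->Ctags->rebuild tags). The .tags index file, as well as .tags_sorted_by_file, is then generated in the target folder.
2. Win+R: type cmd and press Enter. Use cd to enter the target folder to be analyzed, then type: ctags -R -f .tags to generate the .tags file manually. Unlike method 1, this does not generate .tags_sorted_by_file, but I have not found any difference in use so far; additions and corrections are welcome.
Appendix: ------------------------- The complete steps for installing ctags on Windows (works on win8) follow ------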

Fixing garbled Chinese text in Sublime Text 2

时光总嘲笑我的痴心妄想 submitted on 2019-12-03 19:15:02
Sublime Text 2 does not support the GB2312 and GBK encodings. The fix is described below. It presupposes that the Sublime Text 2 installation path contains no Chinese characters and that %username% in the system path contains no Chinese characters. 1. Install Sublime Package Control: in Sublime Text 2, press Ctrl+~ to open the console and enter the following code; Sublime Text 2 will then install Package Control automatically. import urllib2,os; pf='Package Control.sublime-package'; ipp=sublime.installed_packages_path(); os.makedirs(ipp) if not os.path.exists(ipp) else None; urllib2.install_opener(urllib2.build_opener(urllib2.ProxyHandler())); open(os.path.join(ipp,pf),'wb').write(urllib2.urlopen('http://sublime.wbond.net/'+pf.replace(' ','%20')).read()); print('Please restart Sublime Text to finish