urllib2

HTTP: Proxy Authentication Error for nltk.download()

半城伤御伤魂 submitted on 2019-12-14 03:43:46
Question: I am using nltk.download() to download the packages I need, but I am getting the following error.

    root@nishant-Inspiron-1545:/home/nishant/Dropbox/DDP/data# python
    Python 2.7.3 (default, Apr 10 2013, 05:09:49)
    [GCC 4.7.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import nltk
    >>> import nltk.downloader
    >>> nltk.download()
    NLTK Downloader
    ---------------------------------------------------------------------------
    d) Download l) List c) Config h) Help q
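The error text is cut off in the excerpt above, so the following is only a sketch of how proxy credentials can be supplied to the standard library's opener machinery, not a confirmed fix for the NLTK downloader. It is written for Python 3, where urllib2's functionality lives in urllib.request. The proxy address, user name, and password are placeholders, and the helper name build_proxy_opener is made up for illustration.

```python
import urllib.request


def build_proxy_opener(proxy_url, user, password):
    """Return an opener that routes requests through an authenticating proxy.

    proxy_url, user and password are hypothetical values supplied by the caller.
    """
    # Send both http and https traffic through the same proxy.
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    # Register the credentials to answer a 407 Proxy Authentication Required.
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, proxy_url, user, password)
    auth_handler = urllib.request.ProxyBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(proxy_handler, auth_handler)


# Building the opener does not touch the network.
opener = build_proxy_opener("http://proxy.example.com:3128", "user", "secret")

# To make every later urlopen() call in the process use this opener:
# urllib.request.install_opener(opener)
```

Installing the opener globally is a blunt instrument, since it changes behavior for every caller of urlopen() in the process; passing the opener around explicitly is the more contained option when that is possible.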

urllib2: submitting a form and then redirecting

落爺英雄遲暮 submitted on 2019-12-14 02:45:48
Question: My goal is to come up with a portable urllib2 solution that would POST a form and then redirect the user to what comes out. The POSTing part is simple:

    request = urllib2.Request('https://some.site/page', data=urllib.urlencode({'key':'value'}))
    response = urllib2.urlopen(request)

Providing data sets the request type to POST. Now, I suspect that all the data I should care about comes from response.info() and response.geturl(). I should do a self.redirect(response.geturl()) inside a get(self) method
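For readers on Python 3, a rough equivalent of the POSTing part might look like the sketch below. It assumes urllib.request as the successor to urllib2; the helper name build_form_post is hypothetical, and the URL is the placeholder from the question. The request is only constructed here, not sent.

```python
import urllib.parse
import urllib.request


def build_form_post(url, fields):
    """Build a form-encoded POST request without sending it."""
    # In Python 3 the body must be bytes, so the encoded form is converted.
    body = urllib.parse.urlencode(fields).encode("ascii")
    # Supplying data is what switches the method from GET to POST.
    return urllib.request.Request(url, data=body)


request = build_form_post("https://some.site/page", {"key": "value"})
print(request.get_method())  # POST

# Sending it would look like this (network access required):
# response = urllib.request.urlopen(request)
# final_url = response.geturl()  # URL after any redirects were followed
```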

corrupt zip download urllib2

只愿长相守 submitted on 2019-12-13 18:49:41
Question: I am trying to download zip files from measuredhs.com using the following code:

    url = 'https://dhsprogram.com/customcf/legacy/data/download_dataset.cfm?Filename=BFBR62DT.ZIP&Tp=1&Ctry_Code=BF'
    request = urllib2.urlopen(url)
    output = open("install.zip", "w")
    output.write(request.read())
    output.close()

However, the downloaded file does not open. I get a message saying the compressed zip folder is invalid. To access the download link, one needs to log in, which I have done. If I click on the
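Two things commonly produce an "invalid zip" in this kind of situation: writing binary data through a text-mode file handle, and receiving an HTML login page instead of the archive. Which one applies here cannot be told from the excerpt, so the sketch below simply shows how to write the bytes in binary mode and check the result. The helper names and the sample data are made up for illustration.

```python
import io
import os
import tempfile
import zipfile


def save_bytes(data, path):
    """Write raw bytes to disk; "wb" avoids newline translation on Windows."""
    with open(path, "wb") as output:
        output.write(data)


def make_sample_zip():
    """Build a tiny archive in memory to stand in for the downloaded bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("hello.txt", "hello")
    return buffer.getvalue()


with tempfile.TemporaryDirectory() as folder:
    good_path = os.path.join(folder, "install.zip")
    save_bytes(make_sample_zip(), good_path)
    print(zipfile.is_zipfile(good_path))  # True

    # An HTML login page saved under a .zip name fails the same check.
    bad_path = os.path.join(folder, "login.zip")
    save_bytes(b"<html>Please log in</html>", bad_path)
    print(zipfile.is_zipfile(bad_path))  # False
```

Running zipfile.is_zipfile() on the saved file is a cheap way to tell a damaged or mislabeled download apart from a viewer problem.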

Python Mechanize to check if a server is available

試著忘記壹切 submitted on 2019-12-13 16:15:55
Question: I'm trying to write a script which will read a file containing some URLs and then open a browser instance using the mechanize module. I'm wondering how to handle the case where a URL does not exist or the server is unreachable. For example:

    import mechanize
    br = mechanize.Browser()
    b = br.open('http://192.168.1.30/index.php')

What I want to know is how I will get information from mechanize if 192.168.1.30 is unreachable or if HTTP returns a 404 error.

Answer 1:

    from mechanize import Browser
    browser =
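The answer is cut off above, so what follows is not a reconstruction of it. It is a sketch of the same idea using only the Python 3 standard library rather than mechanize: an HTTP-level failure such as 404 and a connection-level failure are reported through two different exception classes. Whether mechanize surfaces the same classes should be checked against its own documentation. The function name check_url and its open_url parameter are hypothetical.

```python
import urllib.error
import urllib.request


def check_url(url, open_url=urllib.request.urlopen, timeout=5):
    """Classify a URL as reachable, HTTP error, or unreachable.

    open_url is injectable so the logic can be exercised without a network.
    """
    try:
        response = open_url(url, timeout=timeout)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 404.
        return "http error %d" % exc.code
    except urllib.error.URLError as exc:
        # No usable answer: DNS failure, refused connection, and so on.
        return "unreachable: %s" % exc.reason
    response.close()
    return "ok"


# Typical use (needs network access):
# print(check_url("http://192.168.1.30/index.php"))
```

HTTPError must be caught before URLError because it is a subclass of it; with the order reversed, the 404 case would be reported as unreachable.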

Why doesn't this request work?

Deadly submitted on 2019-12-13 16:15:19
Question: I want to make a simple, stupid Twitter app using the Twitter API. If I request this page from my browser, it works: http://search.twitter.com/search.atom?q=hello&rpp=10&page=1 but if I request it from Python using urllib or urllib2, most of the time it doesn't work:

    response = urllib2.urlopen("http://search.twitter.com/search.atom?q=hello&rpp=10&page=1")

and I get this error:

    Traceback (most recent call last):
      File "twitter.py", line 24, in <module>
        response = urllib2.urlopen("http:/

Opening a website frame or image in python

。_饼干妹妹 submitted on 2019-12-13 15:42:30
Question: I am fairly fluent with Python and have used urllib2 and Cookies a lot for website automation. I just stumbled upon the "webbrowser" module, which can open a URL in your default browser. I'm wondering if it's possible to select just one object from that URL and open that. Specifically, I want to open a "captcha" so that the user can input it and continue doing something else. This is the line containing the captcha in the HTML, I think: script type="text/javascript" src="http://api.recaptcha

urllib2: fetch and show a page in any language, encoding problem

自作多情 submitted on 2019-12-13 11:52:08
Question: I'm using Python on Google App Engine to fetch HTML pages and show them. My aim is to be able to fetch any page in any language. Now I have a problem with encoding. A simple

    result = urllib2.urlopen(url).read()

leaves artifacts in place of special letters, and

    urllib2.urlopen(url).read().decode('utf8')

throws an error:

    'utf8' codec can't decode bytes in position 3544-3546: invalid data

So how do I solve it? Is there any library that would check what encoding a page uses and convert it so it would be readable?
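One approach that needs no extra library is to honor the charset the server declares in its Content-Type header and fall back to UTF-8 with replacement characters when nothing usable is declared. The sketch below assumes Python 3 and works on a raw body plus a header string so that it can be tried offline; the function names are hypothetical. It does not inspect meta tags inside the HTML and does no statistical detection, so pages with a missing or wrong header will still come out garbled.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Pull the charset parameter out of a Content-Type header value."""
    message = Message()
    message["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset, or None if absent.
    return message.get_content_charset() or default


def decode_body(raw, content_type):
    """Decode response bytes using the declared charset, never raising."""
    charset = charset_from_content_type(content_type)
    try:
        return raw.decode(charset, errors="replace")
    except LookupError:
        # The server named a codec Python does not know; fall back to UTF-8.
        return raw.decode("utf-8", errors="replace")


print(decode_body("café".encode("latin-1"), "text/html; charset=ISO-8859-1"))  # café
```

With a live response in Python 3, the header value would come from response.headers.get("Content-Type", "") and the body from response.read().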

Python urllib2 or requests post method [duplicate]

左心房为你撑大大i submitted on 2019-12-13 11:27:02
Question: This question already has answers here: Submitting to a web form using python (3 answers). Closed 3 years ago. I understand in general how to make a POST request using urllib2 (encoding the data, etc.), but the problem is that all the tutorials online use completely useless made-up example URLs to show how to do it (someserver.com, coolsite.org, etc.), so I can't see the specific HTML that corresponds to the example code they use. Even python.org's own tutorial is totally useless in this

python urllib2 can open localhost but not 127.0.0.1

元气小坏坏 submitted on 2019-12-13 10:48:22
Question: I am using the Python urllib2 library and have run into a strange and nasty problem on Windows 7. My code:

    import urllib2 as url_request
    opener = url_request.build_opener(url_request.ProxyHandler({'http': 'http://login:password@server:8080'}))
    request = url_request.Request("http://localhost");
    response = opener.open(request)
    print response.read()

It works perfectly well, but when I change localhost to 127.0.0.1 this error happens: HTTPError: HTTP Error 502: Proxy Error ( Forefront TMG denied the
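The error message is truncated, so the cause is not established here. If the intent is to reach the local machine directly while still using the proxy for everything else, one option is to choose the opener per URL and skip the proxy for loopback addresses. The sketch below assumes Python 3 (urllib.request instead of urllib2); the helper names are hypothetical, and the proxy URL is the placeholder from the question.

```python
import ipaddress
import urllib.request
from urllib.parse import urlsplit


def is_loopback(url):
    """Return True when the URL's host is localhost or a loopback IP."""
    host = urlsplit(url).hostname or ""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal (an ordinary host name), so not loopback.
        return False


def opener_for(url, proxy_url):
    """Build a direct opener for loopback URLs and a proxied one otherwise."""
    # An empty mapping disables proxying, including environment settings.
    proxies = {} if is_loopback(url) else {"http": proxy_url}
    return urllib.request.build_opener(urllib.request.ProxyHandler(proxies))


proxy = "http://login:password@server:8080"
print(is_loopback("http://127.0.0.1"))    # True
print(is_loopback("http://example.com"))  # False
# response = opener_for("http://127.0.0.1", proxy).open("http://127.0.0.1")
```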

Python: Find a Sentence between some website-tags using regex

守給你的承諾、 submitted on 2019-12-13 09:38:10
Question: I want to find a sentence between the ...class="question-hyperlink"> tags. With this code:

    import urllib2
    import re
    response = urllib2.urlopen('https://stackoverflow.com/questions/tagged/python')
    html = response.read(20000)
    a = re.search('question-hyperlink', html)
    print html[a.end()+3:a.end()+100]

I get:

    DF5 for Python: high level vs low level interfaces. h5py</a></h3> <div class="excerpt">

How can I stop at the next < ? And how do I find the next sentence? I want to do it with regex. EDIT
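Staying with regex as the question asks, a capture group with a negated character class stops at the next `<`, and findall returns every match rather than only the first. The sketch below runs on a made-up HTML sample modeled on the output shown in the question, so the markup and the second title are assumptions. Note also that the output in the question starts with "DF5", which suggests the +3 offset skips one character too many after the closing `">`.

```python
import re

# Capture everything after the class attribute up to, but not including, "<".
TITLE_PATTERN = re.compile(r'class="question-hyperlink">([^<]*)<')


def extract_titles(html):
    """Return the text of every question-hyperlink anchor in the page."""
    return [title.strip() for title in TITLE_PATTERN.findall(html)]


# Hypothetical sample; in Python 3 a real page would need
# response.read().decode(...) first, because read() returns bytes.
sample_html = (
    '<h3><a href="/questions/1" class="question-hyperlink">'
    'HDF5 for Python: high level vs low level interfaces. h5py</a></h3>'
    '<div class="excerpt">...</div>'
    '<h3><a href="/questions/2" class="question-hyperlink">Second title</a></h3>'
)

for title in extract_titles(sample_html):
    print(title)
```

Regex works for a fixed, simple pattern like this one, but it breaks as soon as the attribute order or quoting changes; an HTML parser is the sturdier choice for anything beyond a quick scrape.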