urllib2

Python - Example of urllib2 asynchronous / threaded request using HTTPS

走远了吗. Submitted on 2019-12-20 09:55:53
Question: I'm having a heck of a time getting asynchronous / threaded HTTPS requests to work using Python's urllib2. Does anyone out there have a basic example that implements urllib2.Request, urllib2.build_opener and a subclass of urllib2.HTTPSHandler? Thanks!

Answer 1: The code below makes 7 HTTP requests asynchronously at the same time. It does not use threads; instead it uses asynchronous networking with the twisted library.

from twisted.web import client
from twisted.internet import reactor, defer

urls …
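Since the question asks specifically for a threaded variant built from urllib2.Request, urllib2.build_opener and an HTTPSHandler subclass, here is a minimal sketch along those lines (the trivial subclass, the placeholder URLs and the worker layout are assumptions, not the answer's code):

import threading
import urllib2

class LoggingHTTPSHandler(urllib2.HTTPSHandler):
    # Trivial subclass: a hook point for custom SSL or logging behaviour.
    def https_open(self, req):
        return urllib2.HTTPSHandler.https_open(self, req)

def fetch(url, results, index):
    # Each worker thread opens one HTTPS URL and stores the body (or the error).
    opener = urllib2.build_opener(LoggingHTTPSHandler())
    try:
        results[index] = opener.open(urllib2.Request(url)).read()
    except urllib2.URLError as e:
        results[index] = e

urls = ['https://example.com/'] * 7  # placeholder URLs
results = [None] * len(urls)
threads = [threading.Thread(target=fetch, args=(u, results, i))
           for i, u in enumerate(urls)]
for t in threads:
    t.start()
for t in threads:
    t.join()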

Downloading videos in flv format from YouTube

倖福魔咒の. Submitted on 2019-12-20 08:47:24
Question: I can't really understand how YouTube serves videos, but I have been reading through what I can. It seems like the old get_video method is now obsolete and can't be used any more, so I am asking if there is another pythonic and simple method for collecting YouTube videos.

Answer 1: You might have some luck with youtube-dl: http://rg3.github.com/youtube-dl/documentation.html I'm not sure if there's a good API, but it's written in Python, so theoretically you could do something a little …
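Later youtube-dl releases do expose a Python API. A minimal sketch, assuming a modern youtube-dl and that the video still offers an flv stream (the URL and output template are placeholders):

import youtube_dl  # pip install youtube-dl

# 'best[ext=flv]' uses youtube-dl's format-selector syntax to request the
# best stream delivered in an flv container.
options = {'format': 'best[ext=flv]', 'outtmpl': '%(title)s.%(ext)s'}
with youtube_dl.YoutubeDL(options) as ydl:
    ydl.download(['https://www.youtube.com/watch?v=EXAMPLE'])  # placeholder URL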

How to use urllib2 to access ftp/http server using proxy with authentication

风格不统一. Submitted on 2019-12-20 06:19:12
Question: Update: see the comments for my solution. My Python code uses urllib2 to access an FTP server through a proxy with a user and password. I use both a urllib2.ProxyHandler and a urllib2.ProxyBasicAuthHandler to implement this, following the urllib2 examples:

import urllib2
proxy_host = 'host.proxy.org:3128'  # only host name, no scheme (http/ftp)
proxy_handler = urllib2.ProxyHandler({'ftp': proxy_host})
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
proxy_auth_handler.add_password …
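The excerpt cuts off at add_password. For context, a minimal sketch of how these two handlers are usually wired together ('realm', the credentials and the FTP URL below are placeholders):

import urllib2

proxy_host = 'host.proxy.org:3128'
proxy_handler = urllib2.ProxyHandler({'ftp': proxy_host})
proxy_auth_handler = urllib2.ProxyBasicAuthHandler()
# add_password(realm, uri, user, passwd); the realm must match the one
# the proxy announces in its 407 response.
proxy_auth_handler.add_password('realm', proxy_host, 'user', 'password')

opener = urllib2.build_opener(proxy_handler, proxy_auth_handler)
response = opener.open('ftp://ftp.example.org/pub/file.txt')  # placeholder URL
print response.read()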

Multi-threaded web scraper using urlretrieve on a cookie-enabled site

你。 Submitted on 2019-12-20 03:17:05
Question: I am trying to write my first Python script, and with lots of Googling, I think that I am just about done. However, I will need some help getting myself across the finish line. I need to write a script that logs onto a cookie-enabled site, scrapes a bunch of links, and then spawns a few processes to download the files. I have the program running single-threaded, so I know that the code works. But when I tried to create a pool of download workers, I ran into a wall.

# manager.py
import Fetch …
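The excerpt stops at the manager module's first import. One classic pitfall with this design is that urllib.urlretrieve uses its own opener and never sees cookies set through urllib2, so each worker downloads as an anonymous user. A minimal sketch of one way around it, sharing a cookie-aware opener across worker threads (the login URL, form fields and link list are placeholders):

import cookielib
import threading
import urllib
import urllib2

cookie_jar = cookielib.CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie_jar))

# Log in once; the session cookie now lives in cookie_jar.
login_data = urllib.urlencode({'user': 'name', 'pass': 'secret'})  # placeholder fields
opener.open('http://example.com/login', login_data).close()

def download(url, filename):
    # Stream through the shared opener so the session cookie is sent.
    response = opener.open(url)
    with open(filename, 'wb') as out:
        while True:
            chunk = response.read(64 * 1024)
            if not chunk:
                break
            out.write(chunk)
    response.close()

links = [('http://example.com/file1.zip', 'file1.zip')]  # placeholder links
threads = [threading.Thread(target=download, args=link) for link in links]
for t in threads:
    t.start()
for t in threads:
    t.join()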

Scrape Google resultstats with Python [closed]

拟墨画扇. Submitted on 2019-12-20 01:36:42
Question: Closed. This question is off-topic and is not currently accepting answers. Closed 4 years ago. I would like to get the estimated results number from Google for a keyword. I'm using Python 3.3 and trying to accomplish this task with BeautifulSoup and urllib.request. This is my simple code so far:

def numResults():
    try:
        page_google = '''http://www.google.de/#output=search&sclient=psy-ab&q=pokerbonus&oq=pokerbonus …
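The snippet is cut off inside the URL. Note that everything after '#' is a client-side fragment and never reaches Google's servers, so the query has to go through /search?q=... instead. A minimal sketch, assuming the result page still carries a resultStats element, and bearing in mind that scraping Google breaks often and violates its terms of service:

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

def num_results(keyword):
    url = 'http://www.google.de/search?q=' + keyword
    # Google serves an error page to the default Python User-Agent.
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    soup = BeautifulSoup(urlopen(req).read(), 'html.parser')
    stats = soup.find(id='resultStats')  # id used by Google's markup at the time
    return stats.get_text() if stats else None

print(num_results('pokerbonus'))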

Python urllib2 force IPv4

别说谁变了你拦得住时间么. Submitted on 2019-12-19 19:45:00
Question: I am running a Python script that uses urllib2 to grab data from a weather API and display it on screen. The problem is that when I query the server, I get a "no address associated with hostname" error. I can view the output of the API with a web browser, and I can download the file with wget, but I have to force IPv4 to get it to work. Is it possible to force IPv4 in urllib2 when using urllib2.urlopen?

Answer 1: Not directly, no. So, what can you do? One possibility is to explicitly …
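The answer is truncated. One widely used workaround (not necessarily the one this answer goes on to describe) is to wrap socket.getaddrinfo so every DNS lookup is pinned to IPv4 before urllib2 resolves the host; the API URL below is a placeholder:

import socket
import urllib2

_orig_getaddrinfo = socket.getaddrinfo

def _ipv4_getaddrinfo(host, port, family=0, *args, **kwargs):
    # Ignore the requested address family and resolve IPv4 only.
    return _orig_getaddrinfo(host, port, socket.AF_INET, *args, **kwargs)

socket.getaddrinfo = _ipv4_getaddrinfo

response = urllib2.urlopen('http://api.example.com/weather')  # placeholder URL
print response.read()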

How do I post non-ASCII characters using httplib when content-type is “application/xml”

心已入冬. Submitted on 2019-12-19 17:47:57
Question: I've implemented a Pivotal Tracker API module in Python 2.7. The Pivotal Tracker API expects POST data to be an XML document and "application/xml" to be the content type. My code uses urllib/httplib to post the document as shown:

request = urllib2.Request(self.url,
                          xml_request.toxml('utf-8') if xml_request else None,
                          self.headers)
obj = parse_xml(self.opener.open(request))

This yields an exception when the XML text contains non-ASCII characters:

File "/usr/lib/python2.7/httplib.py", line 951, …
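The traceback is cut off at httplib.py line 951. A frequent cause of this failure in Python 2.7 is that httplib joins the headers and the body into one message: if any header (or the URL) is a unicode string, the UTF-8-encoded body is coerced through ASCII and the send raises UnicodeDecodeError. A minimal sketch of the usual defence, keeping everything as byte strings (the endpoint and XML are placeholders):

# -*- coding: utf-8 -*-
import urllib2

url = 'https://tracker.example.com/stories'  # placeholder endpoint
body = u'<story><name>Caf\xe9</name></story>'.encode('utf-8')  # bytes, not unicode

# Keep every header key and value a plain byte string: a single unicode
# header makes httplib promote the whole message to unicode, and the
# UTF-8 body then fails to decode as ASCII.
headers = {'Content-Type': 'application/xml; charset=utf-8'}

request = urllib2.Request(url, body, headers)
response = urllib2.urlopen(request)
print response.read()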

label empty or too long - Python urllib2

冷暖自知. Submitted on 2019-12-19 12:49:25
Question: I am in a strange situation: I am curling URLs like this:

import httplib2

def check_urlstatus(url):
    h = httplib2.Http()
    try:
        resp = h.request("http://" + url, 'HEAD')
        if int(resp[0]['status']) < 400:
            return 'ok'
        else:
            return 'bad'
    except httplib2.ServerNotFoundError:
        return 'bad'

If I try to test this with:

if check_urlstatus('.f.de') == "bad":  # <--- error happening here
    #..
    #..

it is saying: UnicodeError: label empty or too long. What is the problem I am causing here? EDIT: here is the traceback with …
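The traceback is cut off, but the error itself comes from IDNA hostname encoding: '.f.de' begins with a dot, so its first DNS label is empty, and the encoder raises "label empty or too long" before any request is sent. A minimal sketch of one defensive fix, treating such hostnames as 'bad' (this broadens the question's own function; it is an assumption, not the accepted answer):

import httplib2

def check_urlstatus(url):
    h = httplib2.Http()
    try:
        resp = h.request("http://" + url, 'HEAD')
        return 'ok' if int(resp[0]['status']) < 400 else 'bad'
    except (httplib2.ServerNotFoundError, UnicodeError):
        # UnicodeError covers hostnames that IDNA encoding rejects,
        # e.g. '.f.de' with its empty first label.
        return 'bad'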