urllib2

Python: Log in to a website using urllib

Question: I want to log in to this website: https://www.fitbit.com/login. This is the code I use:

    import urllib2
    import urllib
    import cookielib

    login_url = 'https://www.fitbit.com/login'
    acc_pwd = {'login': 'Log In', 'email': 'username', 'password': 'pwd'}

    cj = cookielib.CookieJar()  # add cookies
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    opener.addheaders = [('User-agent',
                          'Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 5.1)')]

    data = urllib.urlencode(acc_pwd)
    try:
        response = opener.open(login_url, data)
    except urllib2.HTTPError as e:
        print('Login failed with HTTP status %d' % e.code)
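For comparison, here is a minimal sketch of the same login using the requests library with a session, which stores cookies automatically. The form field names are taken from the question; the credentials are placeholders.

    import requests

    session = requests.Session()
    payload = {'login': 'Log In', 'email': 'username', 'password': 'pwd'}
    resp = session.post('https://www.fitbit.com/login', data=payload)
    print(resp.status_code)
    # Further session.get()/session.post() calls reuse the stored cookies.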

Python urllib2 urlopen response

Question: Python urllib2 urlopen gives this response:

    <addinfourl at 1081306700 whose fp = <socket._fileobject object at 0x4073192c>>

expected:

    {"token":"mYWmzpunvasAT795niiR"}

Answer 1: You need to bind the resultant file-like object to a variable; otherwise the interpreter just dumps it via repr:

    >>> import urllib2
    >>> urllib2.urlopen('http://www.google.com')
    <addinfourl at 18362520 whose fp = <socket._fileobject object at 0x106b250>>
    >>>
    >>> f = urllib2.urlopen('http://www.google.com')
    >>> f
    <addinfourl at ...>
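Once the response is bound to a name, the remaining step is to read the body and parse it. A minimal sketch, with a placeholder URL standing in for whichever endpoint returns the token JSON:

    import json
    import urllib2

    f = urllib2.urlopen('http://example.com/api/token')  # placeholder URL
    body = f.read()               # e.g. '{"token":"mYWmzpunvasAT795niiR"}'
    data = json.loads(body)
    print(data['token'])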

urllib3 MaxRetryError

I have just started using urllib3, and I am running into a problem straightaway. According to the manual, I started off with the simple example:

    Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53)
    [GCC 4.5.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import urllib3
    >>>
    >>> http = urllib3.PoolManager()
    >>> r = http.request('GET', 'http://google.com/')

I get thrown the following error:

    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/usr/local/lib/python2.7/dist-packages/urllib3/request.py", line 65, in request
        **urlopen_kw
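The traceback is cut off above, but a MaxRetryError on this exact example is commonly attributed to the redirect from http://google.com/ to http://www.google.com/, which early urllib3 releases handled poorly from a PoolManager. A minimal sketch of two common workarounds, assuming that redirect is the trigger:

    import urllib3

    http = urllib3.PoolManager()

    # Workaround 1: request the canonical host directly, so no redirect occurs.
    r = http.request('GET', 'http://www.google.com/')
    print(r.status)

    # Workaround 2: don't follow the redirect; inspect it instead.
    r = http.request('GET', 'http://google.com/', redirect=False)
    print(r.status)                    # e.g. 301
    print(r.headers.get('location'))   # where the redirect points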

urllib2.HTTPError: HTTP Error 401 while querying the new Bing API (Azure Marketplace)

So, I've made corrections based on most of the related answers here on Stack Overflow, but I'm still unable to resolve this problem.

    import base64
    import urllib
    import urllib2

    queryBingFor = "Google Fibre"
    quoted_query = urllib.quote(queryBingFor)

    account_key = "dslfkslkdfhsehwekhrwkj2187iwekjfkwej3"
    rootURL = "https://api.datamarket.azure.com/Bing/Search/v1/"
    searchURL = rootURL + "Image?format=json&Query=" + quoted_query

    cred = base64.encodestring(account_key)
    reqBing = urllib2.Request(url=searchURL)
    author = "Basic %s" % cred
    reqBing.add_header('Authorization', author)
    readURL = urllib2.urlopen(reqBing)

I know I'm missing out on something here.
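The entry breaks off before any answer, but the fix usually cited for this 401 is twofold (treat this as an assumption, not this thread's confirmed answer): the Azure Marketplace expects the account key as both the username and the password of the Basic credentials, and base64.encodestring appends a trailing newline that corrupts the Authorization header, so base64.b64encode is the safer call. A minimal sketch:

    import base64
    import urllib
    import urllib2

    account_key = "dslfkslkdfhsehwekhrwkj2187iwekjfkwej3"  # placeholder key
    quoted_query = urllib.quote("Google Fibre")
    searchURL = ("https://api.datamarket.azure.com/Bing/Search/v1/"
                 "Image?format=json&Query=" + quoted_query)

    # The key doubles as username and password; b64encode adds no newline.
    cred = base64.b64encode(account_key + ':' + account_key)

    reqBing = urllib2.Request(url=searchURL)
    reqBing.add_header('Authorization', 'Basic %s' % cred)
    print(urllib2.urlopen(reqBing).read())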

Python: POST request with image files

I have a server and I am trying to build a POST request to get the data back. I think one way to achieve this is to add the parameters in the header and make the request, but I am getting a few errors that I don't understand well enough to move forward.

HTML form:

    <html>
    <head>
      <meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
    </head>
    <body>
      <form method="POST" action="http://some.server.com:61235/imgdigest"
            enctype="multipart/form-data">
        quality:<input type="text" name="quality" value="2"><br>
        category:<input type="text" name="category" value="1"><br>
        debug:<input type="text" name="debug"><br>
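Since the form is multipart/form-data, the Python side has to send a multipart body, which plain urllib2 does not do out of the box. Here is a minimal sketch using the requests library instead. The text field values are taken from the form above; the debug value, the upload field name "file", and the image filename are assumptions, since the form is cut off before its file input:

    import requests

    url = 'http://some.server.com:61235/imgdigest'
    data = {'quality': '2', 'category': '1', 'debug': '1'}
    # The field name 'file' and the image path are placeholders.
    files = {'file': ('photo.jpg', open('photo.jpg', 'rb'), 'image/jpeg')}

    resp = requests.post(url, data=data, files=files)
    print(resp.status_code)
    print(resp.text)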

Python urllib2 URLError HTTP status code

Question: I want to grab the HTTP status code once it raises a URLError exception. I tried this, but it didn't help:

    except URLError, e:
        logger.warning('It seems like the server is down. Code:' + str(e.code))

Answer 1: You shouldn't check for a status code after catching URLError, since that exception can be raised in situations where there's no HTTP status code available, for example when you're getting connection-refused errors. Use HTTPError to check for HTTP-specific errors, and then use URLError to handle everything else.
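A minimal sketch of the ordering the answer describes; HTTPError is a subclass of URLError, so it must be caught first, and the URL is a placeholder:

    import urllib2
    from urllib2 import HTTPError, URLError

    try:
        response = urllib2.urlopen('http://example.com/might-fail')
    except HTTPError as e:
        print('Server returned HTTP status %d' % e.code)
    except URLError as e:
        print('Failed to reach the server: %s' % e.reason)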

HTTPS proxy support in the Python requests library

I am using the Python Requests library for HTTP-related work. I set up a proxy server on my computer, using the free ntlmaps to answer the NTLM challenges from the corporate ISA server. However, the response always seems to be empty, as shown below:

    >>> import requests
    >>> r = requests.get('https://www.google.com')
    >>> r.text
    u'<HTML></HTML>\r\n'

There is no such problem with plain HTTP requests, and when I use the urllib2 library I get the correct response. I compared the messages sent by 'Requests' and 'urllib2' and found that 'Requests' uses 'GET' when talking to the proxy for HTTPS URLs, whereas urllib2 issues a CONNECT to set up a tunnel.
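Requests does accept a proxies mapping; whether it tunnels HTTPS correctly depends on the version in use. A minimal sketch of pointing both schemes at a local ntlmaps instance, where the host and port are placeholders:

    import requests

    # Placeholder address for the local ntlmaps proxy.
    proxies = {
        'http': 'http://127.0.0.1:5865',
        'https': 'http://127.0.0.1:5865',
    }

    r = requests.get('https://www.google.com', proxies=proxies)
    print(r.status_code)
    print(r.text[:200])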

Which is best in Python: urllib2, PycURL or mechanize?

Question: OK, so I need to download some web pages using Python and did a quick investigation of my options.

Included with Python:

- urllib: it seems to me that I should use urllib2 instead; urllib has no cookie support, and handles HTTP/FTP/local files only (no SSL)
- urllib2: complete HTTP/FTP client; supports most needed things like cookies, but does not support all HTTP verbs (only GET and POST; no TRACE, etc.)

Full featured:

- mechanize: can use/save Firefox/IE cookies, take actions like following the second link on a page, and is actively maintained (a short sketch of this follows below)
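Of the three, mechanize is the one with the browser-like behavior the list highlights. A minimal sketch of that behavior with a placeholder URL; follow_link(nr=1) follows the second link on the page, matching the example in the list:

    import mechanize

    br = mechanize.Browser()
    br.set_handle_robots(False)      # skip robots.txt for this example
    br.open('http://example.com/')   # placeholder URL
    response = br.follow_link(nr=1)  # follow the second link on the page
    print(response.read()[:200])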

Get JSON data via URL and use it in Python (simplejson)

Question: I imagine this must have a simple answer, but I am struggling: I want to take a URL (which outputs JSON) and get the data into a usable dictionary in Python. I am stuck on the last step.

    >>> import urllib2
    >>> import simplejson
    >>> req = urllib2.Request("http://vimeo.com/api/v2/video/38356.json", None, {'user-agent': 'syncstream/vimeo'})
    >>> opener = urllib2.build_opener()
    >>> f = opener.open(req)
    >>> f.read()  # this works
    '[{"id":"38356","title":"Forgetfulness - Billy Collins Animated Poetry",
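The missing last step is to parse the body rather than just reading it. A minimal sketch, reusing the f from the session above; note the top-level JSON value here is a list, so the dictionary is its first element:

    import simplejson

    data = simplejson.load(f)   # or: simplejson.loads(f.read())
    video = data[0]             # the API returns a list of dicts
    print(video['title'])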

Using BeautifulSoup to select div blocks within HTML

I am trying to parse several div blocks using Beautiful Soup, with some HTML from a website. However, I cannot work out which function should be used to select these div blocks. I have tried the following:

    import urllib2
    from bs4 import BeautifulSoup

    def getData():
        url = "http://www.racingpost.com/horses2/results/home.sd?r_date=2013-09-22"
        html = urllib2.urlopen(url, timeout=10).read().decode('UTF-8')
        soup = BeautifulSoup(html)
        print(soup.title)
        print(soup.find_all('<div class="crBlock ">'))

    getData()

I want to be able to select everything between <div class="crBlock "> and its correct end </div>.
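find_all() takes a tag name plus attributes, not a literal HTML string, which is why the call above returns an empty list. A minimal sketch of the usual fix, keeping the trailing space that the site's class attribute apparently contains:

    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html)  # html fetched as in the question
    for div in soup.find_all('div', class_='crBlock '):
        # Each div is a Tag whose children are everything up to its closing tag.
        print(div.get_text())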