urllib2

Set Host-header when using Python and urllib2

怎甘沉沦 submitted on 2019-12-07 00:41:29

Question: I'm using my own resolver and would like urllib2 to connect straight to the IP (no resolving inside urllib2), while I set the HTTP Host header myself. But urllib2 just ignores my Host header:

    txheaders = { 'User-Agent': UA, "Host: ": nohttp_url }
    robots = urllib2.Request("http://" + ip + "/robots.txt", txdata, txheaders)

Answer 1: You have included ": " in the "Host" string. It should be:

    txheaders = { "User-Agent": UA, "Host": nohttp_url }
    robots = urllib2.Request("http://" + ip + "/robots.txt", txdata, txheaders)
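In Python 3, urllib2 became urllib.request, and the same fix applies there: the dict key must be exactly "Host", with no trailing colon or space. A minimal sketch of the corrected request (the IP, hostname, and User-Agent below are placeholder values, not from the original question):

```python
import urllib.request

# Hypothetical values standing in for the custom resolver's output
# and the original hostname.
ip = "93.184.216.34"
real_host = "example.com"

# Connect to the raw IP but present the original hostname in the Host
# header. Note the key is "Host" -- not "Host: " as in the buggy snippet.
req = urllib.request.Request(
    "http://" + ip + "/robots.txt",
    headers={"User-Agent": "my-bot/1.0", "Host": real_host},
)

print(req.get_header("Host"))  # header urllib will actually send
print(req.host)                # address it will connect to
```

No connection is opened here; constructing the Request is enough to verify which Host header urllib would transmit.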

Python's urllib2.urlopen() hanging with local connection to a Java Restlet server

大城市里の小女人 submitted on 2019-12-07 00:13:31

Question: I'm trying to connect from Python to a locally running Restlet server, but the connection hangs indefinitely (or times out if I set a timeout).

    import urllib2
    handle = urllib2.urlopen("http://localhost:8182/contact/123")  # hangs

If I use curl from a shell to open the above URL, the results return quickly. If I use urllib2 to open a different local service (e.g. a Django web server on port 8000), urllib2 works fine. I've tried disabling the firewall (I'm doing this on OS X). I've tried changing
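Since the question notes that a timeout at least turns the hang into an error, a small guard around urlopen keeps the hang from blocking the caller forever. This is a sketch in Python 3 terms (urllib.request), with a deliberately unused local port standing in for the hanging server:

```python
import urllib.request
import urllib.error

def fetch_with_timeout(url, timeout=2.0):
    """Return the response body, or None if the server hangs or refuses."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read()
    except (urllib.error.URLError, OSError):
        # Covers refused connections, DNS failures, and socket timeouts.
        return None

# Port 9 is assumed to have no listener locally, so this fails fast
# instead of hanging.
result = fetch_with_timeout("http://localhost:9/contact/123")
```

This does not fix the underlying hang, but it makes the failure observable and bounded while debugging it.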

Sharing experience with Python web scraping

烈酒焚心 submitted on 2019-12-06 23:43:28

/** author: insun  title: sharing experience with Python web scraping  blog: http://yxmhero1989.blog.163.com/blog/static/112157956201311821444664/ **/

0x1. urllib.quote('string to encode')

If you want to put Chinese text into a URL request, you can encode it with urllib.quote('string to encode'):

    query = urllib.quote(singername)
    url = 'http://music.baidu.com/search?key=' + query
    response = urllib.urlopen(url)
    text = response.read()

0x2. GET or POST: urlencode

If a GET request needs some parameters, we have to encode them before passing them in.

    import urllib

    def url_get():
        params = urllib.urlencode({'spam': 1, 'eggs': 2, 'bacon': 0})
        f = urllib.urlopen("http://www.musi-cal.com/cgi-bin/query?%s" % params)
        print f.read()

    def url
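In Python 3 both helpers moved to urllib.parse. A sketch of the same two encodings, with a sample singer name standing in for the blog's variable:

```python
from urllib.parse import quote, urlencode

# 0x1: percent-encode non-ASCII text before placing it in a URL.
singername = "李白"  # example value
query = quote(singername)
url = "http://music.baidu.com/search?key=" + query

# 0x2: encode a dict of GET/POST parameters into a query string.
params = urlencode({"spam": 1, "eggs": 2, "bacon": 0})

print(query)   # %E6%9D%8E%E7%99%BD
print(params)  # spam=1&eggs=2&bacon=0
```

quote() percent-encodes the UTF-8 bytes of the text; urlencode() additionally joins key=value pairs with "&" so the result can be appended to a URL or sent as a POST body.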

How to make api call that requires login in web2py?

随声附和 submitted on 2019-12-06 22:48:37

I want to access APIs from the application. Those APIs have the decorator @auth.requires_login(). I am calling the API from a controller using demo_app/controllers/plugin_task/task:

    url = request.env.http_origin + URL('api', 'bind_task')
    page = urllib2.Request(url)
    page.add_header('cookie', request.env.http_cookie)
    response = urllib2.urlopen(page)

Demo API (api.py):

    @auth.requires_login()
    @request.restful()
    def bind_task():
        response.view = 'generic.json'
        return dict(GET=_bind_task)

    def _bind_task(**get_params):
        return json.dumps({'status': '200'})

The above code gives me the error: HTTPError: HTTP Error 401:
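The cookie-forwarding step can be sketched on its own in Python 3 terms (urllib.request): the caller's session cookie is attached to the outgoing request so the decorated API sees a logged-in session. The URL and cookie value below are placeholders, not web2py's actual values:

```python
import urllib.request

# Hypothetical session cookie, as web2py would expose it to the
# controller via request.env.http_cookie.
http_cookie = "session_id_demo=127.0.0.1-abc123"

req = urllib.request.Request("http://localhost:8000/demo_app/api/bind_task")
# Forward the caller's cookie so @auth.requires_login() sees the session.
req.add_header("Cookie", http_cookie)

# urllib.request.urlopen(req) would then carry the session along.
```

If the 401 persists even with the cookie attached, the usual suspects are a cookie scoped to a different path/host or a session that web2py refuses to share across requests.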

Python: Login on a website

好久不见. submitted on 2019-12-06 19:37:25

I'm trying to log in to a website and run automated clean-up jobs. The site where I need to log in is: http://site.com/Account/LogOn. I tried various snippets I found on Stack Overflow, like "Login to website using python", but I'm stuck on this line:

    session = requests.session(config={'verbose': sys.stderr})

where my JetBrains IDE doesn't like 'verbose', telling me that I need to do something but not explaining exactly what. I also tried "Browser simulation - Python", but no luck with that either. Can anyone help me? All answers will be appreciated. Thanks in advance. PS: I started learning Python 2 weeks ago
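The config={'verbose': ...} argument comes from a long-removed version of the requests API, which is why the IDE flags it; in current requests the call is simply requests.Session(). With the standard library alone, a login that keeps its session cookie can be sketched with a cookie-aware opener. The URL and form-field names below are assumptions about the site, not known values:

```python
import http.cookiejar
import urllib.parse
import urllib.request

# A cookie jar remembers the session cookie the login response sets.
jar = http.cookiejar.CookieJar()
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

login_data = urllib.parse.urlencode({
    "UserName": "me",      # assumed form field names for /Account/LogOn
    "Password": "secret",
}).encode("utf-8")

# Posting the form would store the cookie; later requests through the
# same opener then send it automatically:
# opener.open("http://site.com/Account/LogOn", login_data)
# opener.open("http://site.com/Admin/CleanUp")
```

The key idea is reusing one opener (or one requests.Session) for every request, so the authentication cookie from the login survives into the clean-up calls.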

unable to send data using urllib and urllib2 (python)

送分小仙女□ submitted on 2019-12-06 15:42:54

Hello everybody (first post here). I am trying to send data to a webpage. The page asks for two fields (a file and an e-mail address); if everything is OK, it returns a page saying "everything is ok" and sends a file to the provided e-mail address. I execute the code below and get nothing in my e-mail account.

    import urllib, urllib2

    params = urllib.urlencode({'uploaded': open('file'), 'email': 'user@domain.com'})
    req = urllib2.urlopen('http://webpage.com', params)
    print req.read()

The print command gives me the code of the home page (I assume instead it should give the code of the
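One concrete bug in the snippet: urlencode() is handed an open file *object*, so it encodes the object's repr rather than the file's contents (and a real file-upload form usually expects multipart/form-data, which urllib/urllib2 does not build for you). A sketch of the difference, using Python 3's urllib.parse and a throwaway demo file:

```python
import os
from urllib.parse import urlencode

# Create a small demo file to stand in for the real upload.
with open("demo_upload.txt", "w") as fh:
    fh.write("hello")

# Wrong: the file object is stringified, so its repr -- not "hello" --
# ends up in the encoded form data.
fobj = open("demo_upload.txt")
bad = urlencode({"uploaded": fobj, "email": "user@domain.com"})
fobj.close()

# Better for a plain text field: send the file's contents.
with open("demo_upload.txt") as fh:
    good = urlencode({"uploaded": fh.read(), "email": "user@domain.com"})

os.remove("demo_upload.txt")
```

For a genuine file-upload field, the request body must be multipart/form-data; with the standard library that means building the multipart body by hand, which is why third-party helpers are usually recommended for this.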

trying to split the file download buffer to into separate threads

人走茶凉 submitted on 2019-12-06 14:36:32

I am trying to download a file's buffer in 5 threads, but the result comes out garbled.

    from numpy import arange
    import requests
    from threading import Thread
    import urllib2

    url = 'http://pymotw.com/2/urllib/index.html'
    sizeInBytes = r = requests.head(url, headers={'Accept-Encoding': 'identity'}).headers['content-length']
    splitBy = 5
    splits = arange(splitBy + 1) * (float(sizeInBytes) / splitBy)
    dataLst = []

    def bufferSplit(url, idx, splits):
        req = urllib2.Request(url, headers={'Range': 'bytes=%d-%d' % (splits[idx], splits[idx+1])})
        print {'bytes=%d-%d' % (splits[idx], splits[idx+1])}
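One likely source of the garbling: the numpy split points are floats, and consecutive chunks share a boundary byte, because HTTP Range headers are inclusive on both ends (bytes=a-b fetches byte b too). A sketch of computing clean, non-overlapping integer ranges, pure arithmetic with no download (the file size is an example value):

```python
def byte_ranges(size, parts):
    """Split `size` bytes into `parts` contiguous (start, end) pairs,
    inclusive on both ends, as HTTP Range headers expect."""
    base, extra = divmod(size, parts)
    ranges, start = [], 0
    for i in range(parts):
        # Spread the remainder over the first `extra` chunks.
        length = base + (1 if i < extra else 0)
        ranges.append((start, start + length - 1))
        start += length
    return ranges

ranges = byte_ranges(1034, 5)
# Each chunk would then be fetched with
#   headers={'Range': 'bytes=%d-%d' % (start, end)}
# and the pieces reassembled in index order.
```

Because each range starts one byte after the previous one ends, the reassembled chunks cover the file exactly once, with no repeated or missing bytes.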

cURL: https through a proxy

这一生的挚爱 submitted on 2019-12-06 13:40:38

Question: I need to make a cURL request to an https URL, but I have to go through a proxy as well. Is there some problem with doing this? I have had so much trouble doing this with curl and PHP that I tried doing it with urllib2 in Python, only to find that urllib2 cannot POST to https when going through a proxy. I haven't been able to find any documentation to this effect for cURL, but I was wondering if anyone knew whether this is an issue?

Answer 1: I find testing with command-line curl a big help
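For the urllib2 side of the comparison: the limitation mentioned affected old Python 2 versions, and in Python 3's urllib.request an https request through a proxy is configured with a ProxyHandler. A sketch with a placeholder proxy address (no connection is opened):

```python
import urllib.request

# Hypothetical proxy address; https traffic is tunneled via CONNECT
# through the same http proxy endpoint.
proxy = urllib.request.ProxyHandler({
    "http": "http://proxy.local:3128",
    "https": "http://proxy.local:3128",
})
opener = urllib.request.build_opener(proxy)

# A POST through the proxy would then be:
# opener.open("https://example.com/endpoint", data=b"field=value")
```

The curl equivalent of the same setup is the -x/--proxy flag, which is also the quickest way to test whether the proxy itself accepts CONNECT for https.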

How to use python urllib2 to create a service POST with the Bitbucket API?

老子叫甜甜 submitted on 2019-12-06 13:31:37

Question: While this code works fine to add a deployment ssh-key to my repos...

    print 'Connecting to Bitbucket...'
    bitbucket_access = base64.b64encode(userbb + ":" + passwordbb)
    bitbucket_headers = {"Content-Type": "application/json",
                         "Authorization": "Basic " + bitbucket_access}
    bitbucket_request_url = "https://bitbucket.org/api/1.0/repositories/<username>/%s/deploy-keys" % project_name
    bitbucket_request_req = urllib2.Request(bitbucket_request_url)
    for key, value in bitbucket_headers.items():
        bitbucket
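The same request can be sketched end to end in Python 3 terms: attaching a data payload is what makes urllib issue a POST instead of a GET. The credentials, repo name, and key payload below are all placeholders:

```python
import base64
import json
import urllib.request

userbb, passwordbb = "user", "secret"   # placeholder credentials
project_name = "demo-repo"              # placeholder repository name

access = base64.b64encode((userbb + ":" + passwordbb).encode()).decode()
headers = {
    "Content-Type": "application/json",
    "Authorization": "Basic " + access,
}
url = ("https://bitbucket.org/api/1.0/repositories/%s/%s/deploy-keys"
       % (userbb, project_name))

# Hypothetical deploy-key payload.
payload = json.dumps({"label": "deploy", "key": "ssh-rsa AAAA..."}).encode()
req = urllib.request.Request(url, data=payload, headers=headers)

print(req.get_method())  # "POST", because data is attached
```

Note that in Python 3 b64encode operates on bytes, hence the encode()/decode() pair around the credential string.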

How to reliably process web-data in Python

戏子无情 submitted on 2019-12-06 13:16:51

Question: I'm using the following code to get data from a website:

    time_out = 4

    def tryconnect(turl, timer=time_out, retries=10):
        urlopener = None
        sitefound = 1
        tried = 0
        while (sitefound != 0) and tried < retries:
            try:
                urlopener = urllib2.urlopen(turl, None, timer)
                sitefound = 0
            except urllib2.URLError:
                tried += 1
        if urlopener:
            return urlopener
        else:
            return None

    [...]

    urlopener = tryconnect('www.example.com')
    if not urlopener:
        return None
    try:
        for line in urlopener:
            do stuff
    except httplib
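A Python 3 sketch of the same retry idea, written so it can be exercised with any callable; the fake fetcher below stands in for urlopen, and in Python 3 urllib.error.URLError is a subclass of OSError, so catching OSError covers it:

```python
def try_connect(open_fn, retries=10):
    """Call `open_fn` until it succeeds or `retries` attempts fail."""
    for _attempt in range(retries):
        try:
            return open_fn()
        except OSError:
            continue
    return None

# Fake fetcher standing in for urlopen: fails twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise OSError("connection failed")
    return "response"

result = try_connect(flaky)
```

Separating the retry loop from the actual fetch also makes the original code's state flags (sitefound, tried) unnecessary, since the for/try structure expresses the same control flow directly.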