urllib2

Why aren't persistent connections supported by urllib2?

Question: After scanning the urllib2 source, it seems that connections are automatically closed even if you do specify keep-alive. Why is this? As it is now I just use httplib for my persistent connections... but I wonder why this is disabled (or maybe just ambiguous) in urllib2.

Answer 1: It's a well-known limitation of urllib2 (and urllib as well). IMHO the best attempt so far to fix it and make it right is Garry Bodsworth's coda_network for Python 2.6 or 2.7 -- replacement, patched versions of urllib2 (and some
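As the asker notes, httplib can be used directly when connection reuse matters. A minimal sketch of that approach (the host and paths are placeholders, Python 2):

    import httplib

    # One HTTPConnection object keeps the underlying TCP socket open,
    # so successive requests reuse the same connection.
    conn = httplib.HTTPConnection('example.com')

    conn.request('GET', '/first')
    resp = conn.getresponse()
    body = resp.read()               # must read fully before reusing the socket

    conn.request('GET', '/second')   # reuses the same TCP connection
    print conn.getresponse().status

    conn.close()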

Python pandas Yahoo stock data error

Question: I am trying to pull intraday AAPL stock data from Yahoo, but I'm running into a problem with my program.

    import pandas as pd
    import datetime
    import urllib2
    import matplotlib.pyplot as plt

    get = 'http://chartapi.finance.yahoo.com/instrument/1.0/aapl/chartdata;type=quote;range=1d/csv'
    getdata = urllib2.urlopen(get).read()
    df = pd.read_csv(getdata, skiprows=17, header=None)
    print df.head()

And the error is this:

    Traceback (most recent call last):
      File "getyahoodata.py", line 10, in <module>
        df = pd
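The traceback is cut off above, but one likely cause (an assumption, not confirmed by the truncated question) is that pd.read_csv is handed the raw CSV text rather than a path or file-like object. A hedged sketch of that fix, wrapping the downloaded bytes in StringIO (note this Yahoo endpoint may no longer be live):

    import urllib2
    import pandas as pd
    from StringIO import StringIO

    url = ('http://chartapi.finance.yahoo.com/instrument/1.0/aapl/'
           'chartdata;type=quote;range=1d/csv')
    raw = urllib2.urlopen(url).read()

    # read_csv expects a filepath or buffer, not the CSV contents as a string
    df = pd.read_csv(StringIO(raw), skiprows=17, header=None)
    print df.head()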

Python knowledge refresher: the urllib module and HTML document parsing

Everything the urllib module can do can also be done with urllib2; when you need more flexible access to URL resources, use the urllib2 module. Basic urllib2 usage:

    fp = urllib2.urlopen("http://www.baidu.com")
    print fp.read()           # read the resource from the file-like object
    print fp.geturl()
    print fp.info().items()

You can also use the Request class to build a request object and then open it with urlopen, achieving the same result as above. When the optional data argument is None, a GET request is used to fetch the URL resource; when data is not None, a POST request is used to send the data to the URL resource.

    request = urllib2.Request("http://www.baidu.com", data='data')
    fp = urllib2.urlopen(request)
    print fp.read()

Handlers in urllib2: handlers include ProxyHandler, HTTPBasicAuthHandler and many other processing classes. These handlers can be constructed and installed with the build_opener method.
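A minimal sketch of installing a handler with build_opener, as described above (the proxy address is a placeholder):

    import urllib2

    proxy = urllib2.ProxyHandler({'http': 'http://127.0.0.1:8080'})
    opener = urllib2.build_opener(proxy)
    urllib2.install_opener(opener)    # make it the default opener for urlopen
    print urllib2.urlopen('http://www.baidu.com').read()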

HTTPError: Not Found in urllib2 and BeautifulSoup?

Question:

    from lxml import html
    import requests

    # Initial attempt to scrape HTML from link using BeautifulSoup
    obama_4427 = requests.get('http://millercenter.org/president/obama/speech-4427')
    obama_4427_tree = html.fromstring(obama_4427.text)
    # The speech text itself is stored in the HTML with an XPath
    # of '//*[@id="transcript"]/p' and is a <div>
    obama_4427_text = obama_4427_tree.xpath('//div[@id="transcript"]/p')
    print(obama_4427_text)

    import urllib2, sys
    from bs4 import BeautifulSoup, NavigableString
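The question is cut off above, but a common cause of an HTTPError: Not Found with urllib2 (an assumption here, not confirmed by the truncated post) is the server rejecting urllib2's default User-Agent. A hedged sketch of fetching the same page with an explicit User-Agent before parsing with BeautifulSoup:

    import urllib2
    from bs4 import BeautifulSoup

    url = 'http://millercenter.org/president/obama/speech-4427'
    req = urllib2.Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    soup = BeautifulSoup(urllib2.urlopen(req).read(), 'html.parser')
    print soup.find('div', id='transcript')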

Facebook publish HTTP Error 400: Bad Request

Question: Hey, I am trying to publish a score to Facebook through Python's urllib2 library.

    import urllib2, urllib

    url = "https://graph.facebook.com/USER_ID/scores"
    data = {}
    data['score'] = SCORE
    data['access_token'] = 'APP_ACCESS_TOKEN'
    data_encode = urllib.urlencode(data)
    request = urllib2.Request(url, data_encode)
    response = urllib2.urlopen(request)
    responseAsString = response.read()

I am getting this error:

    response = urllib2.urlopen(request)
      File "/System/Library/Frameworks/Python.framework/Versions/2
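The traceback is truncated above, but urllib2 raises HTTPError on a 400 response, while the Graph API puts the actual reason in the response body. A hedged debugging sketch that reads Facebook's error message from the exception (USER_ID, the token, and the score value are placeholders):

    import urllib, urllib2

    body = urllib.urlencode({'score': 100, 'access_token': 'APP_ACCESS_TOKEN'})
    try:
        response = urllib2.urlopen(
            urllib2.Request("https://graph.facebook.com/USER_ID/scores", body))
    except urllib2.HTTPError as e:
        print e.code      # 400
        print e.read()    # the Graph API's JSON error message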

Python urllib2.urlopen with @ symbol in URL

Question: I'm playing around in Python and there's a URL that I'm trying to use which goes like this:

    https://[username@domain.com]:[password]@domain.com/blah

This is my code:

    response = urllib2.urlopen("https://[username@domain.com]:[password]@domain.com/blah")
    html = response.read()
    print ("data=" + html)

This isn't going through; it doesn't like the @ symbols, and probably the : too. I tried searching, and I read something about unquote, but that's not doing anything. This is the error I get:

    raise
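The error is cut off above, but two standard approaches (assumptions, since the truncated post doesn't confirm which applies here) are percent-encoding the '@' in the username as '%40', or keeping credentials out of the URL entirely with urllib2's basic-auth handler. A hedged sketch of the latter (URL and credentials are placeholders):

    import urllib2

    mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    mgr.add_password(None, 'https://domain.com/blah',
                     'username@domain.com', 'password')
    opener = urllib2.build_opener(urllib2.HTTPBasicAuthHandler(mgr))
    print opener.open('https://domain.com/blah').read()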

Download a binary file using Python requests module

Question: I need to download a file from an external source, using Basic authentication to log in to the URL.

    import requests

    response = requests.get('<external url>', auth=('<username>', '<password>'))
    data = response.json()
    html = data['list'][0]['attachments'][0]['url']
    print (html)

    data = requests.get('<API URL to download the attachment>',
                        auth=('<username>', '<password>'), stream=True)
    print (data.content)

I am getting the output below:

    <url to download the binary data>
    \x00\x00\x13\x00\x00\x00\x00
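Since the second response is binary, printing data.content just dumps raw bytes. A hedged sketch of writing the streamed payload to disk instead (the filename and URLs are placeholders):

    import requests

    r = requests.get('<API URL to download the attachment>',
                     auth=('<username>', '<password>'), stream=True)
    r.raise_for_status()
    with open('attachment.bin', 'wb') as fh:
        for chunk in r.iter_content(chunk_size=8192):
            fh.write(chunk)   # stream to disk without loading it all in memory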

How can I implement my PHP curl request in Python?

Question: The PHP code below fetches HTML from server A to server B. I did this to circumvent the same-origin policy of browsers. (jQuery's JSONP can also be used to achieve this, but I prefer this method.)

    <?php
    /* This code goes inside the body tag of server-B.com.
       Server-A.com then returns a set of form tags to be echoed
       in the body tag of Server-B */
    $ch = curl_init();
    $url = "http://server-A.com/form.php";
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, FALSE);
    curl_exec($ch);
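A hedged Python 2 equivalent of the PHP snippet (the server-A URL is kept as the placeholder from the question): fetch the form markup from server A and print it so it can be emitted into server B's page.

    import urllib2

    url = "http://server-A.com/form.php"
    html = urllib2.urlopen(url).read()   # like curl_exec with CURLOPT_HEADER off
    print html                           # echo the fetched form markup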

How to batch asynchronous web requests performed using a comprehension in Python?

Question: I'm not sure if this is possible; I've spent some time looking at what seem like similar questions, but it's still unclear. For a list of website URLs, I need to get the HTML as a starting point. I have a class that contains a list of these URLs, and the class returns a custom iterator that helps me iterate through these to get the HTML (simplified below):

    class Url:
        def __init__(self, url):
            self.url = url

        def fetchhtml(self):
            import urllib2
            response = urllib2.urlopen(self.url)
            return response.read()

    class
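The second class is truncated above, but a hedged sketch of one common way to batch such fetches (an assumption, not the asker's code) is to map the fetch function over the URLs with a thread pool, which keeps the comprehension-like shape while running requests concurrently:

    import urllib2
    from multiprocessing.dummy import Pool   # thread-backed Pool

    def fetchhtml(url):
        return urllib2.urlopen(url).read()

    urls = ['http://example.com/a', 'http://example.com/b']   # placeholders
    pool = Pool(4)
    pages = pool.map(fetchhtml, urls)   # like [fetchhtml(u) for u in urls], in parallel
    pool.close()
    pool.join()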

Using multiple proxies to open a link in urllib2

Question: What I am trying to do is read a line (an IP address), open the website with that address as the proxy, and then repeat with all the addresses in the file. Instead, I get an error. I am new to Python, so maybe it's a simple mistake. Thanks in advance!

CODE:

    >>> f = open("proxy.txt","r");          # file containing list of ip addresses
    >>> address = (f.readline()).strip();   # to remove \n at end of line
    >>>
    >>> while line:
            proxy = urllib2.ProxyHandler({'http': address})
            opener = urllib2.build_opener(proxy)
            urllib2
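The loop above tests an undefined name, line, which is a likely source of the error (an assumption, since the traceback isn't shown). A hedged rewrite that iterates over the file directly and installs a fresh proxy opener per address (the target URL is a placeholder):

    import urllib2

    with open("proxy.txt") as f:          # one proxy ip per line
        for line in f:
            address = line.strip()        # drop the trailing \n
            proxy = urllib2.ProxyHandler({'http': address})
            opener = urllib2.build_opener(proxy)
            urllib2.install_opener(opener)
            print urllib2.urlopen('http://example.com').read()[:100]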