I am trying to get the data from this website: http://www.boursorama.com/includes/cours/last_transactions.phtml?symbole=1xEURUS
It seems like urlopen doesn't get the
Personally, I write:
# Python 2.7
import urllib
url = 'http://www.boursorama.com/includes/cours/last_transactions.phtml?symbole=1xEURUS'
sock = urllib.urlopen(url)
content = sock.read()
sock.close()
print content
And if you speak French: welcome to stackoverflow.com!
In fact, I now prefer the following code, because it is faster:
# Python 2.7
import httplib
conn = httplib.HTTPConnection(host='www.boursorama.com',timeout=30)
req = '/includes/cours/last_transactions.phtml?symbole=1xEURUS'
try:
    conn.request('GET',req)
except:
    print 'connection failed'
content = conn.getresponse().read()
print content
Changing httplib to http.client and turning the print statements into print() calls should be enough to adapt this code to Python 3.
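As a rough illustration of that adaptation, here is a minimal Python 3 sketch of the same request using http.client. The helper names split_url and fetch are hypothetical, and decoding the body as UTF-8 with replacement characters is an assumption, since the page's actual encoding is not stated here.

```python
import http.client
from urllib.parse import urlsplit

URL = ('http://www.boursorama.com/includes/cours/'
       'last_transactions.phtml?symbole=1xEURUS')


def split_url(url):
    """Split an http URL into (host, request target) for http.client."""
    parts = urlsplit(url)
    target = parts.path or '/'
    if parts.query:
        target += '?' + parts.query
    return parts.netloc, target


def fetch(url, timeout=30):
    """Send a GET request and return the response body as text."""
    host, target = split_url(url)
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request('GET', target)
        raw = conn.getresponse().read()  # bytes in Python 3
    finally:
        conn.close()
    # Assumption: UTF-8; undecodable bytes are replaced rather than raising.
    return raw.decode('utf-8', errors='replace')
```

Calling print(fetch(URL)) would then play the role of the last two lines of the Python 2 version. Nothing here is run at import time, so the sketch can be loaded without network access.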
I confirm that both snippets return the page source, which contains the data you are interested in:
11:57:44
1.4486
0
11:57:43
1.4486
0
Adding the following snippet to the above code will allow you to extract the data I suppose you want:
for i,line in enumerate(content.splitlines(True)):
    print str(i)+' '+repr(line)
print '\n\n'
import re
regx = re.compile('\t\t\t\t\t\t(\d\d:\d\d:\d\d) \r\n'
                  '\t\t\t\t\t\t([\d.]+) \r\n'
                  '\t\t\t\t\t\t(\d+) \r\n')
print regx.findall(content)
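If you want to try the pattern without hitting the site, here is a small Python 3 sketch that runs an equivalent regex against a made-up sample. The sample string is an assumption that mimics the layout the pattern expects (six tabs, the value, a space, then \r\n); the real page may well differ. The name rows and the conversion to float and int are mine, not part of the snippet above.

```python
import re

# Hypothetical sample imitating the layout the regex expects.
sample = (
    '\t\t\t\t\t\t11:57:44 \r\n'
    '\t\t\t\t\t\t1.4486 \r\n'
    '\t\t\t\t\t\t0 \r\n'
    '\t\t\t\t\t\t11:57:43 \r\n'
    '\t\t\t\t\t\t1.4486 \r\n'
    '\t\t\t\t\t\t0 \r\n'
)

# Same pattern as above, written with raw strings and \t{6} for six tabs.
regx = re.compile(r'\t{6}(\d\d:\d\d:\d\d) \r\n'
                  r'\t{6}([\d.]+) \r\n'
                  r'\t{6}(\d+) \r\n')

# findall returns one tuple of three strings per match; convert the numbers.
rows = [(time, float(price), int(third))
        for time, price, third in regx.findall(sample)]

print(rows)  # [('11:57:44', 1.4486, 0), ('11:57:43', 1.4486, 0)]
```

Raw strings avoid relying on Python passing unknown escapes such as \d through unchanged, which newer Python 3 versions warn about.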
Result of running the snippet on the page (only the end):
.......................................
.......................................
.......................................
.......................................
98 'window.config.graphics = {};\n'
99 'window.config.accordions = {};\n'
100 '\n'
101 "window.addEvent('domready', function(){\n"
102 '});\n'
103 '\n'
104 '\n'
114 '\n'
128 '\n'
129 '