when I do curl to a API call link http://example.com/passkey=wedsmdjsjmdd
curl 'http://example.com/passkey=wedsmdjsjmdd'
I get the employee output data on a csv file format, like:
"Steve","421","0","421","2","","","","","","","","","421","0","421","2"
how can parse through this using python.
I tried:
import csv
cr = csv.reader(open('http://example.com/passkey=wedsmdjsjmdd',"rb"))
for row in cr:
print row
but it didn't work and I got an error
http://example.com/passkey=wedsmdjsjmdd No such file or directory:
Thanks!
You need to replace open with urllib.urlopen or urllib2.urlopen.
e.g.
import csv
import urllib2
url = 'http://winterolympicsmedals.com/medals.csv'
response = urllib2.urlopen(url)
cr = csv.reader(response)
for row in cr:
print row
This would output the following
Year,City,Sport,Discipline,NOC,Event,Event gender,Medal
1924,Chamonix,Skating,Figure skating,AUT,individual,M,Silver
1924,Chamonix,Skating,Figure skating,AUT,individual,W,Gold
...
Using pandas it is very simple to read a csv file directly from a url
import pandas as pd
data = pd.read_csv('https://example.com/passkey=wedsmdjsjmdd')
This will read your data in tabular format, which will be very easy to process
You could do it with the requests module as well:
url = 'http://winterolympicsmedals.com/medals.csv'
r = requests.get(url)
text = r.iter_lines()
reader = csv.reader(text, delimiter=',')
To increase performance when downloading a large file, the below may work a bit more efficiently:
import requests
from contextlib import closing
import csv
url = "http://download-and-process-csv-efficiently/python.csv"
with closing(requests.get(url, stream=True)) as r:
reader = csv.reader(r.iter_lines(), delimiter=',', quotechar='"')
for row in reader:
# Handle each row here...
print row
By setting stream=True in the GET request, when we pass r.iter_lines() to csv.reader(), we are passing a generator to csv.reader(). By doing so, we enable csv.reader() to lazily iterate over each line in the response with for row in reader.
This avoids loading the entire file into memory before we start processing it, drastically reducing memory overhead for large files.
来源:https://stackoverflow.com/questions/16283799/how-to-read-a-csv-file-from-a-url-with-python
