Question
The data under consideration comes from an API, which means that it's highly inconsistent: sometimes it returns unexpected content, sometimes it returns nothing, etc.
What I'm interested in is the data associated with ISO 3166-2 for each record.
The data (when it doesn't encounter an error) generally looks something like this:
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "RO", "adminCode1": "10", "countryName": "Romania", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "10"}, {"type": "ISO3166-2", "code": "B"}], "adminName1": "Bucure\u015fti"}
{"countryCode": "DE", "adminCode1": "07", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "07"}, {"type": "ISO3166-2", "code": "NW"}], "adminName1": "North Rhine-Westphalia"}
{"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}
{"countryCode": "DE", "adminCode1": "02", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "02"}, {"type": "ISO3166-2", "code": "BY"}], "adminName1": "Bavaria"}
Let's take one record for example:
{"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}
From this I'd like to extract the ISO 3166-2 representation, i.e. DE-BW.
I've been trying different ways of extracting this information with Python. One attempt looked like this:
coord = response.get('codes', {}).get('type', {}).get('ISO3166-2', None)
Another attempt looked like this:
print(json.dumps(response["codes"]["ISO3166-2"]))
However, neither of those methods worked.
How can I take a record such as:
{"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}
and extract only DE-BW using Python, while also handling records that don't look exactly like that, for instance extracting GB-ENG from:
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
and, of course, avoid crashing when it gets something that doesn't look like either of those, i.e. with exception handling?
FULL FILE
import json
import requests
from collections import defaultdict
from pprint import pprint

# open up the output of 'data-processing.py'
with open('job-numbers-by-location.txt') as data_file:
    for line in data_file:
        identifier, name, coords, number_of_jobs = line.split("|")
        coords = coords[1:-1]
        lat, lng = coords.split(",")
        # print("lat: " + lat, "lng: " + lng)
        response = requests.get("http://api.geonames.org/countrySubdivisionJSON?lat="+lat+"&lng="+lng+"&username=s.matthew.english").json()
        codes = response.get('codes', [])
        for code in codes:
            if code.get('type') == 'ISO3166-2':
                print('{}-{}'.format(response.get('countryCode', 'UNKNOWN'), code.get('code', 'UNKNOWN')))
Answer 1:
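'ISO3166-2' is a dictionary value, not a key. 'codes' is a list of dictionaries, and each one stores the scheme name under its 'type' key, so chained .get() calls or response["codes"]["ISO3166-2"] cannot reach it. Iterate over the list and pick the entry whose 'type' matches: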
codes = response.get('codes', [])
for code in codes:
    if code.get('type') == 'ISO3166-2':
        print('{}-{}'.format(response.get('countryCode', 'UNKNOWN'), code.get('code', 'UNKNOWN')))
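If you also want to guard against responses that are not shaped like the samples at all, the same loop can be wrapped in a small helper. This is only a sketch: the function name extract_iso_3166_2 is made up for illustration, and the third sample record is a hypothetical stand-in for an error payload, not something taken from the API documentation. It returns None instead of raising when the record is missing pieces.

```python
def extract_iso_3166_2(response):
    """Return a string such as 'DE-BW', or None if the record lacks the data."""
    # The API may hand back something that is not a dict at all.
    if not isinstance(response, dict):
        return None
    country = response.get('countryCode')
    codes = response.get('codes')
    # Both the country code and a list of code entries are required.
    if not country or not isinstance(codes, list):
        return None
    for entry in codes:
        # Each entry is expected to look like {"type": ..., "code": ...}.
        if isinstance(entry, dict) and entry.get('type') == 'ISO3166-2':
            code = entry.get('code')
            if code:
                return '{}-{}'.format(country, code)
    return None


records = [
    {"countryCode": "DE", "codes": [{"type": "FIPS10-4", "code": "01"},
                                    {"type": "ISO3166-2", "code": "BW"}]},
    {"countryCode": "GB", "codes": [{"type": "ISO3166-2", "code": "ENG"}]},
    {"status": {"message": "hypothetical error payload"}},  # assumed shape
    None,
]
results = [extract_iso_3166_2(r) for r in records]
print(results)  # ['DE-BW', 'GB-ENG', None, None]
```

Returning None rather than printing 'UNKNOWN' lets the caller decide whether to skip the record, log it, or retry the request.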
Source: https://stackoverflow.com/questions/40914419/extract-data-from-json-from-an-api-with-python