Extract data from JSON from an API with Python

Posted by ╄→尐↘猪︶ㄣ on 2019-12-25 17:50:36

Question


The data under consideration comes from an API, which means it's highly inconsistent: sometimes it returns unexpected content, sometimes it returns nothing, and so on.

What I'm interested in is the data associated with ISO 3166-2 for each record.

The data (when it doesn't encounter an error) generally looks something like this:

{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}
{"countryCode": "RO", "adminCode1": "10", "countryName": "Romania", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "10"}, {"type": "ISO3166-2", "code": "B"}], "adminName1": "Bucure\u015fti"}
{"countryCode": "DE", "adminCode1": "07", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "07"}, {"type": "ISO3166-2", "code": "NW"}], "adminName1": "North Rhine-Westphalia"}
{"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}
{"countryCode": "DE", "adminCode1": "02", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "02"}, {"type": "ISO3166-2", "code": "BY"}], "adminName1": "Bavaria"}

Let's take one record for example:

{"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}

From this I'd like to extract the ISO 3166-2 representation, i.e. DE-BW.

I've been trying different ways of extracting this information with Python. One attempt looked like this:

coord = response.get('codes', {}).get('type', {}).get('ISO3166-2', None)

Another attempt looked like this:

print(json.dumps(response["codes"]["ISO3166-2"]))

However, neither of those methods worked.

How can I take a record such as:

{"countryCode": "DE", "adminCode1": "01", "countryName": "Germany", "distance": 0, "codes": [{"type": "FIPS10-4", "code": "01"}, {"type": "ISO3166-2", "code": "BW"}], "adminName1": "Baden-W\u00fcrttemberg"}

and extract only DE-BW using Python, while also handling records that don't look exactly like that, for instance extracting GB-ENG from:

{"countryCode": "GB", "adminCode1": "ENG", "countryName": "United Kingdom", "distance": 0, "codes": [{"type": "ISO3166-2", "code": "ENG"}], "adminName1": "England"}

and of course not crashing if it gets something that doesn't look like either of those, i.e. exception handling.


FULL FILE

import json
import requests
from collections import defaultdict
from pprint import pprint

# open up the output of 'data-processing.py'
with open('job-numbers-by-location.txt') as data_file:

    for line in data_file:
        identifier, name, coords, number_of_jobs = line.split("|")
        coords = coords[1:-1]
        lat, lng = coords.split(",")
        # print("lat: " + lat, "lng: " + lng)
        response = requests.get("http://api.geonames.org/countrySubdivisionJSON?lat="+lat+"&lng="+lng+"&username=s.matthew.english").json()


        codes = response.get('codes', [])
        for code in codes:
            if code.get('type') == 'ISO3166-2':
                print('{}-{}'.format(response.get('countryCode', 'UNKNOWN'), code.get('code', 'UNKNOWN')))

Answer 1:


'ISO3166-2' is a dictionary value, not a key. 'codes' is a list of dictionaries, so iterate over it and check each entry's 'type':

codes = response.get('codes', [])
for code in codes:
    if code.get('type') == 'ISO3166-2':
        print('{}-{}'.format(response.get('countryCode', 'UNKNOWN'), code.get('code', 'UNKNOWN')))
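
The loop above can also be wrapped in a small helper that tolerates malformed records. This is only a sketch: the function name extract_iso_3166_2 is made up for illustration, the error-shaped sample record is hypothetical, and returning None instead of 'UNKNOWN' is a design choice that the answer itself does not make.

```python
def extract_iso_3166_2(record):
    """Return a string such as 'DE-BW', or None if the record lacks the data."""
    # The API may hand back something that is not a dictionary at all.
    if not isinstance(record, dict):
        return None
    country = record.get('countryCode')
    codes = record.get('codes')
    # 'codes' is expected to be a list of {"type": ..., "code": ...} dictionaries.
    if not isinstance(country, str) or not isinstance(codes, list):
        return None
    for entry in codes:
        # Skip list items that are not dictionaries or have a different type.
        if isinstance(entry, dict) and entry.get('type') == 'ISO3166-2':
            code = entry.get('code')
            if isinstance(code, str) and code:
                return '{}-{}'.format(country, code)
    return None


samples = [
    {"countryCode": "DE", "codes": [{"type": "FIPS10-4", "code": "01"},
                                    {"type": "ISO3166-2", "code": "BW"}]},
    {"countryCode": "GB", "codes": [{"type": "ISO3166-2", "code": "ENG"}]},
    {"status": {"message": "no result"}},  # hypothetical error-shaped response
    None,                                  # nothing usable came back
]
results = [extract_iso_3166_2(sample) for sample in samples]
print(results)  # ['DE-BW', 'GB-ENG', None, None]
```

Because the helper returns None rather than raising, the calling loop can simply skip records it cannot interpret. Network and JSON-decoding errors from requests would still need their own try/except around the request itself.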


Source: https://stackoverflow.com/questions/40914419/extract-data-from-json-from-an-api-with-python
