Is there a library for parsing US addresses?

伪装坚强ぢ 2021-01-30 11:54

I have a list of US addresses that I need to break into street, city, state, zip code, etc.

Example address: "16100 Sand Canyon Avenue, Suite 380 Irvine, CA 92618"

Do

7 answers
  •  误落风尘
    2021-01-30 12:20

    I know this is an old post, but someone might find it useful. The usaddress Python library parses unstructured US address strings into labeled components: https://usaddress.readthedocs.io/en/latest/

    >>> import usaddress
    >>> usaddress.parse('Robie House, 5757 South Woodlawn Avenue, Chicago, IL 60637')
    [('Robie', 'BuildingName'),
    ('House,', 'BuildingName'),
    ('5757', 'AddressNumber'),
    ('South', 'StreetNamePreDirectional'),
    ('Woodlawn', 'StreetName'),
    ('Avenue,', 'StreetNamePostType'),
    ('Chicago,', 'PlaceName'),
    ('IL', 'StateName'),
    ('60637', 'ZipCode')]
    

    Or:

    >>> import usaddress
    >>> usaddress.tag('Robie House, 5757 South Woodlawn Avenue, Chicago, IL 60637')
    (OrderedDict([
       ('BuildingName', 'Robie House'),
       ('AddressNumber', '5757'),
       ('StreetNamePreDirectional', 'South'),
       ('StreetName', 'Woodlawn'),
       ('StreetNamePostType', 'Avenue'),
       ('PlaceName', 'Chicago'),
       ('StateName', 'IL'),
       ('ZipCode', '60637')]),
    'Street Address')
    
    >>> usaddress.tag('State & Lake, Chicago')
    (OrderedDict([
       ('StreetName', 'State'),
       ('IntersectionSeparator', '&'),
       ('SecondStreetName', 'Lake'),
       ('PlaceName', 'Chicago')]),
    'Intersection')
    
    >>> usaddress.tag('P.O. Box 123, Chicago, IL')
    (OrderedDict([
       ('USPSBoxType', 'P.O. Box'),
       ('USPSBoxID', '123'),
       ('PlaceName', 'Chicago'),
       ('StateName', 'IL')]),
    'PO Box')
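
To get from the labeled output above to the city, state, and zip code the question asks for, something like the following might work. It is only a sketch: `merge_tokens` and `city_state_zip` are hypothetical helper names, not part of usaddress, and the input is the `parse` output copied from the transcript above rather than a live call, so the snippet runs without the library installed. The label names (`PlaceName`, `StateName`, `ZipCode`) are the ones shown in the answer.

```python
from collections import OrderedDict


def merge_tokens(pairs):
    """Join all tokens that share a label, in order of appearance.

    `pairs` is a list of (token, label) tuples shaped like the
    usaddress.parse output shown above. Trailing commas are dropped.
    """
    merged = OrderedDict()
    for token, label in pairs:
        token = token.rstrip(",")
        if label in merged:
            merged[label] = merged[label] + " " + token
        else:
            merged[label] = token
    return merged


def city_state_zip(tagged):
    """Pick the three fields from a label -> value mapping.

    Missing labels come back as None (for example, a PO Box address
    with no zip code).
    """
    return {
        "city": tagged.get("PlaceName"),
        "state": tagged.get("StateName"),
        "zip": tagged.get("ZipCode"),
    }


# Sample data copied from the parse() transcript above.
parsed = [
    ("Robie", "BuildingName"),
    ("House,", "BuildingName"),
    ("5757", "AddressNumber"),
    ("South", "StreetNamePreDirectional"),
    ("Woodlawn", "StreetName"),
    ("Avenue,", "StreetNamePostType"),
    ("Chicago,", "PlaceName"),
    ("IL", "StateName"),
    ("60637", "ZipCode"),
]

merged = merge_tokens(parsed)
print(city_state_zip(merged))
# {'city': 'Chicago', 'state': 'IL', 'zip': '60637'}
```

With the library installed, the mapping returned as the first element of `usaddress.tag(...)` could presumably be passed straight to `city_state_zip`, since `tag` already merges tokens and strips the commas, as the transcripts above show.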
    
