Geopy too slow - timeout all the time

拈花ヽ惹草 submitted on 2021-01-28 08:07:33

Question


I am using geopy to get latitude/longitude pairs for city names. For single queries, this works fine. What I am trying to do now is iterate through a big list of city names (46,000) and get geocodes for each city. Afterwards, I run them through a check loop which sorts each city (if it is in the US) into the correct state. My problem is that I get "GeocoderTimedOut('Service timed out')" all the time, everything is pretty slow, and I'm not sure if that is my fault or just geopy's nature. Here is the responsible code snippet:

for tweetcount in range(number_of_tweets):

    # Get the city name from the tweet
    city = data_dict[0]['tweetList'][tweetcount]['user']['location']

    # Sort out useless tweets (check for None before calling len())
    if city is not None and len(city) > 3:

        # THE RESPONSIBLE LINE, here the error occurs
        location = geolocator.geocode(city)

        # Here the sorting into the state takes place
        if location is not None:
            for statecount in range(len(data)):
                if point_in_poly(location.longitude, location.latitude, data[statecount]['geometry']):
                    state_tweets[statecount] += 1
                    break

Somehow, this one line throws a timeout at every second or third call. City has the form "Manchester", "New York, New York" or something similar. I already had try/except blocks around everything, but that doesn't really change anything about the problem, so I removed them for now. Any ideas would be great!


Answer 1:


You will be at the mercy of whatever geolocator service you are using. geopy is just a wrapper around different web services, so a call may fail whenever the server behind it is busy. I would create a wrapper around the geolocator.geocode call, something like this:

import time

from geopy.exc import GeocoderTimedOut

def geocode(city, recursion=0):
    try:
        return geolocator.geocode(city)
    except GeocoderTimedOut:
        if recursion >= 10:     # max retries reached, re-raise the timeout
            raise

        time.sleep(1)  # wait a bit
        # try again
        return geocode(city, recursion=recursion + 1)

This will retry up to 10 times, waiting 1 second before each retry. Adjust these numbers to your liking.
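
The same retry idea can also be written as a loop, with a delay that grows after each failure so a busy server gets more time to recover. This is only a sketch: call_with_retries and its parameters are made-up names, not part of geopy, and the exponential backoff is a swapped-in variation on the fixed 1-second wait above.

```python
import time


def call_with_retries(func, *args, retry_on=Exception, max_retries=10,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on `retry_on` with exponential backoff.

    Hypothetical helper: `retry_on` is the exception type (or tuple of types)
    that should trigger a retry; anything else propagates immediately.
    """
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == max_retries:
                raise  # out of retries: propagate the last error
            # Wait base_delay, then 2x, 4x, ... capped at max_delay.
            sleep(min(base_delay * 2 ** attempt, max_delay))
```

With geopy this would presumably be called as call_with_retries(geolocator.geocode, city, retry_on=GeocoderTimedOut). The sleep argument exists only so the waiting can be replaced in tests.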

If you repeatedly ask for the same city, you should consider wrapping the call in some kind of memoization, for example with a memoizing decorator. Since you have not posted runnable code, I have not been able to test this.
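
One way to get that memoization is the standard library's functools.lru_cache. The sketch below assumes the lookup takes a single city string; make_cached_geocoder is a hypothetical name, and geocode_func stands in for whatever function does the real lookup (for example the geocode wrapper above).

```python
from functools import lru_cache


def make_cached_geocoder(geocode_func, maxsize=None):
    """Return a version of geocode_func that remembers results per city string.

    Hypothetical helper. maxsize=None keeps every result, which seems
    reasonable for a list of a few tens of thousands of city names.
    """
    @lru_cache(maxsize=maxsize)
    def cached(city):
        return geocode_func(city)

    return cached
```

Repeated names such as "New York, New York" then hit the service only once. lru_cache stores return values only, so a lookup that raises (a timeout, say) is not cached and will be tried again on the next call, while a None result for an unknown city is cached.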




Answer 2:


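Change this line:

location = geolocator.geocode(city)

to

location = geolocator.geocode(city, timeout=None)

In recent geopy versions, timeout=None disables the timeout for that call, so the request waits for as long as the service takes to respond.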


Source: https://stackoverflow.com/questions/31506272/geopy-too-slow-timeout-all-the-time
