R: How to geocode a simple address using Data Science Toolkit

Happy的楠姐 2020-11-28 10:03

I am fed up with Google's geocoding, and decided to try an alternative. The Data Science Toolkit (http://www.datasciencetoolkit.org) allows you to geocode an unlimited number o

2 Answers
  • 2020-11-28 10:33

    Like this?

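    library(httr)
    library(rjson)
    
    # dff is assumed to be the question's data frame with an address column.
    # Build a JSON array of quoted address strings for the request body.
    data <- paste0("[",paste(paste0("\"",dff$address,"\""),collapse=","),"]")
    url  <- "http://www.datasciencetoolkit.org/street2coordinates"
    response <- POST(url,body=data)
    json     <- fromJSON(content(response,as="text"))
    # lapply always returns a list; rbind drops the NULL entries (addresses with no match)
    geocode  <- do.call(rbind,lapply(json,
                                     function(x) c(long=x$longitude,lat=x$latitude)))
    geocode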
    #                                                long      lat
    # San Francisco, California, United States -117.88536 35.18713
    # Mobile, Alabama, United States            -88.10318 30.70114
    # La Jolla, California, United States      -117.87645 33.85751
    # Duarte, California, United States        -118.29866 33.78659
    # Little Rock, Arkansas, United States      -91.20736 33.60892
    # Tucson, Arizona, United States           -110.97087 32.21798
    # Redwood City, California, United States  -117.88536 35.18713
    # New Haven, Connecticut, United States     -72.92751 41.36571
    # Berkeley, California, United States      -122.29673 37.86058
    # Hartford, Connecticut, United States      -72.76356 41.78516
    # Sacramento, California, United States    -121.55541 38.38046
    # Encinitas, California, United States     -116.84605 33.01693
    # Birmingham, Alabama, United States        -86.80190 33.45641
    # Stanford, California, United States      -122.16750 37.42509
    # Orange, California, United States        -117.85311 33.78780
    # Los Angeles, California, United States   -117.88536 35.18713
    

    This takes advantage of the POST interface to the street2coordinates API (documented here), which returns all the results in a single request instead of requiring one GET request per address.
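
    Because rbind silently drops addresses that come back as null, it can help to line the results up against the original input. Here is a minimal sketch of that idea. The helper name coords_in_order is made up for illustration, and it assumes the parsed response is a named list keyed by address whose elements are either NULL or a list with longitude and latitude fields, which is what the code above relies on. The sample list is built by hand so the snippet runs without calling the service.

```r
# Hypothetical helper (not part of httr, rjson, or the API): turn a parsed
# street2coordinates response into one row per requested address, in the
# order requested. Assumes `parsed` is a named list keyed by address whose
# elements are NULL (no match) or a list with `longitude` and `latitude`.
coords_in_order <- function(addresses, parsed) {
  pick <- function(addr, field) {
    hit <- parsed[[addr]]
    if (is.null(hit) || is.null(hit[[field]])) NA_real_ else as.numeric(hit[[field]])
  }
  data.frame(
    address = addresses,
    long    = vapply(addresses, pick, numeric(1), field = "longitude", USE.NAMES = FALSE),
    lat     = vapply(addresses, pick, numeric(1), field = "latitude",  USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Hand-built stand-in for fromJSON(content(response, as = "text"))
addresses <- c("Mobile, Alabama, United States",
               "Phoenix, Arizona, United States")
parsed <- list(
  "Mobile, Alabama, United States"  = list(longitude = -88.10318, latitude = 30.70114),
  "Phoenix, Arizona, United States" = NULL
)

res <- coords_in_order(addresses, parsed)
# Addresses the service could not match show up as NA rows
unmatched <- res$address[is.na(res$long)]
```

    With a real response you would pass as.character(dff$address) and the parsed JSON instead of the hand-built values; unmatched then lists the addresses that need another look.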

    EDIT (Response to OP's comment)

    The absence of Phoenix seems to be a bug in the street2coordinates API. If you go to the API demo page and try "Phoenix, Arizona, United States", you get a null response. However, as your example shows, using their "Google-style Geocoder" does give a result for Phoenix. So here's a solution using repeated GET requests. Note that this runs much slower, since it makes one request per address.

    library(httr)
    library(rjson)

    geo.dsk <- function(addr){ # geocode a single address with the Data Science Toolkit
      url      <- "http://www.datasciencetoolkit.org/maps/api/geocode/json"
      response <- GET(url,query=list(sensor="FALSE",address=addr))
      json <- fromJSON(content(response,as="text"))
      loc  <- json$results[[1]]$geometry$location  # first match only
      # c() coerces everything to character, so long/lat come back as strings
      c(address=addr,long=loc$lng, lat= loc$lat)
    }
    # one GET request per address, then stack the results row-wise
    result <- do.call(rbind,lapply(as.character(dff$address),geo.dsk))
    result <- data.frame(result)
    result
    #                                     address         long        lat
    # 1        Birmingham, Alabama, United States   -86.801904  33.456412
    # 2            Mobile, Alabama, United States   -88.103184  30.701142
    # 3           Phoenix, Arizona, United States -112.0733333 33.4483333
    # 4            Tucson, Arizona, United States  -110.970869  32.217975
    # 5      Little Rock, Arkansas, United States   -91.207356  33.608922
    # 6       Berkeley, California, United States   -122.29673  37.860576
    # 7         Duarte, California, United States  -118.298662  33.786594
    # 8      Encinitas, California, United States  -116.846046  33.016928
    # 9       La Jolla, California, United States  -117.876447  33.857515
    # 10   Los Angeles, California, United States  -117.885359  35.187133
    # 11        Orange, California, United States  -117.853112  33.787795
    # 12  Redwood City, California, United States  -117.885359  35.187133
    # 13    Sacramento, California, United States  -121.555406  38.380456
    # 14 San Francisco, California, United States  -117.885359  35.187133
    # 15      Stanford, California, United States    -122.1675   37.42509
    # 16     Hartford, Connecticut, United States   -72.763564   41.78516
    # 17    New Haven, Connecticut, United States   -72.927507  41.365709
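
    If the per-address route is too slow, one option is to geocode everything in a batch and retry only the misses one by one. The following is just a sketch of that pattern, not something the API provides. The names geocode_with_fallback, fake_batch, and fake_single are made up, and the two stand-in functions return fixed values so the snippet runs offline.

```r
# Hypothetical sketch: batch geocode first, then retry only the misses.
# `batch_fn` takes a character vector and returns a data frame with columns
# address, long, lat (NA where unmatched); `single_fn` takes one address and
# returns a named vector with `long` and `lat`. Both are stand-ins supplied
# by the caller, not functions from any package.
geocode_with_fallback <- function(addresses, batch_fn, single_fn) {
  res <- batch_fn(addresses)
  missed <- which(is.na(res$long) | is.na(res$lat))
  for (i in missed) {
    hit <- single_fn(res$address[i])
    res$long[i] <- as.numeric(hit[["long"]])
    res$lat[i]  <- as.numeric(hit[["lat"]])
  }
  res
}

# Offline stand-ins that mimic the behaviour described above:
# the batch lookup misses Phoenix, the single lookup finds it.
fake_batch <- function(addresses) {
  miss <- grepl("Phoenix", addresses)
  data.frame(address = addresses,
             long = ifelse(miss, NA_real_, -110.97087),
             lat  = ifelse(miss, NA_real_, 32.21798),
             stringsAsFactors = FALSE)
}
fake_single <- function(addr) c(long = -112.0733333, lat = 33.4483333)

out <- geocode_with_fallback(
  c("Tucson, Arizona, United States", "Phoenix, Arizona, United States"),
  fake_batch, fake_single
)
```

    In real use, single_fn could be geo.dsk itself, since it returns a named vector with long and lat, and batch_fn would be a wrapper around the POST request that leaves NA for unmatched addresses.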
    
  • 2020-11-28 10:47

    The ggmap package includes support for geocoding using either Google or the Data Science Toolkit, the latter through its "Google-style geocoder". This is quite slow for multiple addresses, as noted in the earlier answer.

    library(ggmap)
    result <- geocode(as.character(dff[[1]]), source = "dsk")
    print(cbind(dff, result))
    #                                     address        lon      lat
    # 1        Birmingham, Alabama, United States  -86.80190 33.45641
    # 2            Mobile, Alabama, United States  -88.10318 30.70114
    # 3           Phoenix, Arizona, United States -112.07404 33.44838
    # 4            Tucson, Arizona, United States -110.97087 32.21798
    # 5      Little Rock, Arkansas, United States  -91.20736 33.60892
    # 6       Berkeley, California, United States -122.29673 37.86058
    # 7         Duarte, California, United States -118.29866 33.78659
    # 8      Encinitas, California, United States -116.84605 33.01693
    # 9       La Jolla, California, United States -117.87645 33.85751
    # 10   Los Angeles, California, United States -117.88536 35.18713
    # 11        Orange, California, United States -117.85311 33.78780
    # 12  Redwood City, California, United States -117.88536 35.18713
    # 13    Sacramento, California, United States -121.55541 38.38046
    # 14 San Francisco, California, United States -117.88536 35.18713
    # 15      Stanford, California, United States -122.16750 37.42509
    # 16     Hartford, Connecticut, United States  -72.76356 41.78516
    # 17    New Haven, Connecticut, United States  -72.92751 41.36571
    