Reverse geocoding with a big array: what is the fastest way? - javascript and performance

Submitted by 我的未来我决定 on 2019-12-01 00:41:06

🚀 Rock your calculations

Step #1 Calculate distance of all locations.

Step #2 Sort the result by the distance value

Step #3 Find locations; repeat with a wider radius until at least one record is found.

Check the DEMO: calculating a list with 7320 records took ~17.22119140625 ms.

const cities = [
  [`Abano Terme (PD)`, 45.3594, 11.7894],
  [`Abbadia Cerreto (LO)`, 45.3122, 9.5928],
  [`Abbadia Lariana (LC)`, 45.8992, 9.3336],
  [`Abbadia San Salvatore (SI)`, 42.8800, 11.6775],
  [`Abbasanta (OR)`, 40.1250, 8.8200]
]

function distance(lat, long) {
  const R = 6372.795477598 // mean Earth radius in km
  const PI = Math.PI / 180 // degrees-to-radians factor
  return cities
    .map(city => {
      const laA = city[1] * PI
      const laB = lat * PI
      const loA = city[2] * PI
      const loB = long * PI
      // great-circle distance via the spherical law of cosines;
      // `|| 0` guards against NaN when acos receives a value just above 1
      const dist = R * Math.acos(
        Math.sin(laA) * Math.sin(laB) +
        Math.cos(laA) * Math.cos(laB) * Math.cos(loA - loB)
      ) || 0
      return { dist, city }
    })
    .sort((a, b) => a.dist - b.dist)
}


function nearest(dist, lat, long) {
  const locations = distance(lat, long) // already sorted by distance
  function find(delta) {
    const result = []
    for (let location of locations) {
      if (location.dist > delta) break // sorted list: nothing closer will follow
      result.push(location.city)
    }
    // widen the search radius by 50 km until at least one city is found
    return result.length > 0
      ? result
      : find(delta + 50)
  }
  return find(dist)
}

const result = nearest(50, 41.89595563, 12.48325842)
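
To reproduce a measurement like the ~17 ms figure quoted above, a minimal sketch using the standard console.time / console.timeEnd API (the label string is arbitrary, and the numbers will of course vary per machine and dataset):

console.time('nearest')
nearest(50, 41.89595563, 12.48325842)
console.timeEnd('nearest') // e.g. "nearest: 17.2ms"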

Since the city data does not change dynamically and you need to calculate the distance / nearest neighbour frequently, using a geospatial index (KD-tree, R-tree, etc.) would make sense.

Here's an example implementation using geokdbush, which is based on a static spatial index (a KD-tree) and takes Earth curvature and date-line wrapping into account.

const kdbush = require('kdbush');
const geokdbush = require('geokdbush');

// I've stored the data points as objects to make the values unambiguous
const cities = [
  { name: "Abano Terme (PD)", latitude: 45.3594, longitude: 11.7894 },
  { name: "Abbadia Cerreto (LO)", latitude: 45.3122, longitude: 9.5928 },
  { name: "Abbadia Lariana (LC)", latitude: 45.8992, longitude: 9.3336 },
  { name: "Abbadia San Salvatore (SI)", latitude: 42.8800, longitude: 11.6775 },
  { name: "Abbasanta (OR)", latitude: 40.1250, longitude: 8.8200 }
];

// Create the index over city data ONCE
const index = kdbush(cities, ({ longitude }) => longitude, ({ latitude }) => latitude);

// Get the nearest neighbour in a radius of 50km for a point with latitude 43.7051 and longitude 11.4363
const nearest = geokdbush.around(index, 11.4363, 43.7051, 1, 50);

Once again, bear in mind that kdbush is a static index and cannot be changed (you cannot add or remove cities from it). If you need to change the city data after initialisation, depending on how often you do it, using an index might prove too costly.
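If the list only changes occasionally, one option (a sketch, not from the original answer; rebuildIndex is a made-up helper name) is simply to rebuild the whole index after each batch of changes, reusing the same constructor call as above:

function rebuildIndex(updatedCities) {
  // kdbush builds a fresh static index from scratch; the old one is discarded
  return kdbush(updatedCities, ({ longitude }) => longitude, ({ latitude }) => latitude);
}

// e.g. after adding a city
cities.push({ name: "Roma (RM)", latitude: 41.8931, longitude: 12.4828 });
const newIndex = rebuildIndex(cities);

Whether that is acceptable depends entirely on how often the data changes and how large it is.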

You may want to sort by the second array element (the computed distance):

citta_vicine.sort(function(a, b) {
  return a[1] - b[1];
});

To get the nearest city...

If you're only interested in the nearest city, you don't need to sort the whole list. That's your first performance gain in one line of code!

// Unneeded sort:
const closest = cityDistancePairs.sort((a, b) => a[1] - b[1])[0];

// Only iterates once instead:
const closestAlt = cityDistancePairs.reduce(
  (closest, current) => current[1] < closest[1] ? current : closest
);

To further optimize you'll need to benchmark which parts of the code take the longest to run. Some ideas:

  • Do a quick check on the lat/lon difference before calculating the precise value. If coordinates are further than a certain delta apart, you already know they are out of range (see the sketch after this list).
  • Cache the calculated distances by implementing a memoization pattern, so that on a second pass with a different limit (50 -> 100) you don't recalculate the distances.
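
For the first idea, a rough bounding-box check before the exact trigonometry might look like the sketch below. It reuses the cities array from the first answer; the helper name nearestWithPrefilter and the KM_PER_DEG_LAT constant are made up for illustration, and the memoization idea could be layered on top by keeping the returned { dist, city } pairs around between calls.

const KM_PER_DEG_LAT = 111.32 // roughly one degree of latitude in km

function nearestWithPrefilter(maxKm, lat, long) {
  // Size of the bounding box in degrees. Longitude degrees shrink with
  // cos(latitude), so dividing by it widens the box enough to stay safe.
  const dLat = maxKm / KM_PER_DEG_LAT
  const dLon = maxKm / (KM_PER_DEG_LAT * Math.cos(lat * Math.PI / 180))

  // Cheap rectangular pre-filter: discard cities that are obviously too far away
  const candidates = cities.filter(([, cLat, cLon]) =>
    Math.abs(cLat - lat) <= dLat && Math.abs(cLon - long) <= dLon
  )

  // Precise great-circle distance only for the remaining candidates
  const PI = Math.PI / 180
  const R = 6372.795477598
  return candidates
    .map(city => {
      const laA = city[1] * PI, laB = lat * PI
      const loA = city[2] * PI, loB = long * PI
      const dist = R * Math.acos(
        Math.sin(laA) * Math.sin(laB) +
        Math.cos(laA) * Math.cos(laB) * Math.cos(loA - loB)
      ) || 0
      return { dist, city }
    })
    .filter(({ dist }) => dist <= maxKm)
    .sort((a, b) => a.dist - b.dist)
}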

However, I can't imagine that a loop of 8000 distance calculations is the real performance drain... I'm guessing that parsing 300 kB of JavaScript is the real bottleneck. How are you initializing the city array?

Make sure you strip the data set down to only the properties and values you actually use. If you know you're only going to use the name and lat/lon, you can preprocess the data to include only those. That'll probably make it much smaller than 300 kB and easier to work with.
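
For example, a one-off preprocessing step (sketch only; rawCities and its field names are assumed, not taken from the question) could reduce the dataset to exactly the triple used by the first answer:

// keep only name, latitude and longitude, in the compact array form used above
const slimCities = rawCities.map(c => [c.name, c.lat, c.lon]);

// write it out once (Node.js) and ship the smaller file to the client
require('fs').writeFileSync('cities.min.json', JSON.stringify(slimCities));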
