How efficient is it to order by distance (entire table) in GeoDjango

Submitted by 北战南征 on 2019-12-01 06:05:07

Question


Assume that I have the following data model

from django.contrib.gis.db import models

class Person(models.Model):
    id       = models.BigAutoField(primary_key=True)
    name     = models.CharField(max_length=50)
    location = models.PointField(srid=4326)

Assume also that I have an app that queries this Django backend, and the only purpose of this app is to return a (paginated) list of registered users from closest to farthest.

Currently I have this query in mind:

# here we are obtaining all users in ordered form
current_location = me.location
people = Person.objects.distance(current_location).order_by('distance')

# here we are obtaining the first X through pagination
start_index = a
end_index = b

people = people[start_index:end_index]

Although this works, it is not as fast as I would like.

I have some concerns about the speed of this query. If the table were large (1 million+ rows), wouldn't the database (PostgreSQL w/ PostGIS) have to measure the distance between current_location and every location in the database before performing an order_by on those 1 million rows?

Can somebody suggest how to properly return nearby users ordered by distance in an efficient manner?


Answer 1:


If you want to sort every entry in that table by distance, then it will be slow, as expected, and as far as I know there is nothing that can be done about it.

You can make the calculation more efficient by following these steps and making some assumptions:

  1. Enable spatial indexing on your tables. To do that in GeoDjango, follow the documentation's instructions and adapt them to your model:

    Note

    In PostGIS, ST_Distance_Sphere does not limit the geometry types geographic distance queries are performed with. However, these queries may take a long time, as great-circle distances must be calculated on the fly for every row in the query. This is because the spatial index on traditional geometry fields cannot be used.

    For much better performance on WGS84 distance queries, consider using geography columns in your database instead because they are able to use their spatial index in distance queries. You can tell GeoDjango to use a geography column by setting geography=True in your field definition.

  2. Now you can narrow down your query with some logical constraints:

    Ex: My user will not look for people more than 50 km from their current position.

  3. Narrow down the search using the dwithin spatial lookup, which uses the spatial index mentioned above and is therefore pretty fast.

  4. Finally, order the remaining rows by distance.
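To make the reasoning behind steps 2 to 4 concrete, here is a plain-Python sketch of "filter by radius first, then sort only what is left". It is an illustration only: the helper names (`haversine_km`, `nearest_within`) and the sample coordinates are made up for this example, and unlike PostGIS with a spatial index, this version still scans every row to apply the filter.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two lon/lat points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_within(origin, people, max_km):
    """Return (name, distance_km) pairs within max_km of origin, closest first."""
    lon0, lat0 = origin
    # Steps 2-3: drop everyone outside the search radius
    candidates = []
    for name, (lon, lat) in people:
        d = haversine_km(lon0, lat0, lon, lat)
        if d <= max_km:
            candidates.append((name, d))
    # Step 4: sort only the rows that survived the filter
    return sorted(candidates, key=lambda pair: pair[1])


# Hypothetical sample data: (name, (longitude, latitude))
people = [("far", (1.0, 1.0)), ("mid", (0.0, 0.3)), ("near", (0.1, 0.0))]
result = nearest_within((0.0, 0.0), people, max_km=50)
print([name for name, _ in result])
```

The sort now runs over the handful of rows inside the radius rather than the whole table, which is the same saving the database gets from filtering before ordering.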

The final query can look like this:

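from django.contrib.gis.db.models.functions import Distance
from django.contrib.gis.measure import D

current_location = me.location
# Filter by radius first, then sort only the rows that remain
people = Person.objects.filter(
    location__dwithin=(current_location, D(km=50))
).annotate(
    distance=Distance('location', current_location)
).order_by('distance')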
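The `D(km=50)` argument to `dwithin` relies on step 1. As far as I know, on a plain geometry column with SRID 4326 GeoDjango only accepts a numeric value in degrees for that lookup, so the field needs `geography=True`. A minimal sketch of the adjusted model, reusing the field names from the question (this is an assumption about how you would adapt your own model, and changing the column type also requires a migration):

```python
from django.contrib.gis.db import models


class Person(models.Model):
    id = models.BigAutoField(primary_key=True)
    name = models.CharField(max_length=50)
    # geography=True stores the point in a PostGIS geography column,
    # so distance lookups can use metric units and the spatial index
    location = models.PointField(geography=True, srid=4326)
```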

P.S.: Rather than writing custom pagination, it is more efficient to use the pagination methods provided for Django views:

  • Docs

Or you can use Django REST Framework and its pagination:

  • Docs and a DRF pagination Q&A example


Source: https://stackoverflow.com/questions/45383792/how-efficient-is-it-to-order-by-distance-entire-table-in-geodjango
