Django Models - SELECT DISTINCT(foo) FROM table is too slow

Anonymous (unverified), posted on 2019-12-03 10:24:21

Question:

I have a MySQL table with 13M rows. I can query the database directly with

SELECT DISTINCT(refdate) FROM myTable 

The query takes 0.15 seconds and is great.

The equivalent table defined as a Django model and queried as

myTable.objects.values('refdate').distinct()

takes a very long time. Is it because there are too many items in the list before distinct() is applied? How do I do this in a manner that doesn't bring everything down?

Answer 1:

Thank you @solarissmoke for the pointer to connection.queries.

I was expecting to see

SELECT DISTINCT refdate FROM myTable 

Instead, I got

SELECT DISTINCT refdate, itemIndex, itemType FROM myTable ORDER BY itemIndex, refdate, itemType

I then looked at myTable defined in models.py.

unique_together = (('nodeIndex', 'refdate', 'nodeType'), )
ordering = ['nodeIndex', 'refdate', 'nodeType']

From the Django documentation, under "Interaction with default ordering or order_by()":

normally you won’t want extra columns playing a part in the result, so clear out the ordering, or at least make sure it’s restricted only to those fields you also select in a values() call.
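
To see why the extra columns matter, here is a minimal sketch using Python's built-in sqlite3 module rather than the MySQL table from the question. The table contents are made up for illustration; only the column names come from the query above. Once the ordering columns are pulled into the SELECT list, DISTINCT has to de-duplicate whole (refdate, itemIndex, itemType) tuples, so it returns far more rows than DISTINCT on refdate alone.

```python
import sqlite3

# In-memory database with illustrative data (an assumption, not the real table).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (itemIndex INTEGER, refdate TEXT, itemType INTEGER)"
)
rows = [
    (index, date, item_type)
    for index in range(100)
    for date in ("2019-01-01", "2019-01-02", "2019-01-03")
    for item_type in (0, 1)
]
conn.executemany("INSERT INTO myTable VALUES (?, ?, ?)", rows)

# The query that was expected: one row per distinct date.
narrow = conn.execute("SELECT DISTINCT refdate FROM myTable").fetchall()

# The query that was actually generated: the ORDER BY columns are part of
# the SELECT list, so every distinct tuple is returned.
wide = conn.execute(
    "SELECT DISTINCT refdate, itemIndex, itemType FROM myTable "
    "ORDER BY itemIndex, refdate, itemType"
).fetchall()

print(len(narrow))  # 3
print(len(wide))    # 600
```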

So I called order_by() with no arguments to clear the default ordering defined on the model, and voila!

myTable.objects.values('refdate').order_by().distinct() 


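Answer 2:

You can try this (note that passing field names to distinct() is supported only on PostgreSQL, so it will not work on the MySQL database in the question):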

myTable.objects.all().distinct('refdate') 

