Django queryset aggregation count counting wrong thing

Posted by 大兔子大兔子 on 2019-12-12 18:31:25

Question


This is a continuation question from:
Django queryset get distinct column values with respect to other column

My Problem:

Using the Count aggregate in Django counts the wrong thing in the queryset or, as far as I can see, something that is not even in my queryset.

What I did:

I used:

queryset.order_by('col1', 'col2').distinct('col1', 'col2').values('col2')

to get the values of col2 of a model where all the rows have a distinct pair of values in (col1, col2). There is an example in the link above. I printed my queryset and it looks good. I have:

[{'col2': value1}, ... , {'col2': value1},{'col2': value2}, ..., {'col2': value2},...]

I now want to count how many times each value appears in the queryset I got. I do this using the Count aggregate. I have:

from django.db.models import Count, F

a = {'count': Count(F('col2'), distinct=False)}
queryset.annotate(**a)

I also tried this with distinct=True, but with no luck.

I would expect to get [{col2: value1, count: num1}, {col2: value2, count: num2}, ...].
Instead I get [{col2: value1, count: num_11}, {col2: value1, count: num_12}, ..., {col2: value1, count: num_1n}, {col2: value2, count: num_21}, ..., {col2: value2, count: num_2n}, ...], where, as far as I can tell, num_11, ..., num_1n are the numbers of rows in which value1 appeared in col2 together with each specific value of col1, before I applied order_by('col1', 'col2').distinct('col1', 'col2').values('col2') when preparing the query.

I can't figure out what causes this. I tried inspecting the queryset.query attribute, but I can't see where I am going wrong.

Any help would be greatly appreciated.


Answer 1:


The .order_by should specify only 'col2', like:

queryset.values('col2').annotate(
    count=Count('col1', distinct=True)
).order_by('col2')

This will yield a QuerySet that looks like:

<QuerySet [
    {'col2': 1, 'count': 4},
    {'col2': 2, 'count': 2}
]>

So that means that there are four distinct values for col1 given col2 has value 1, and two distinct values for col1 given col2 has value 2.

This will construct a query like:

SELECT col2, COUNT(DISTINCT col1) AS count
FROM some_table
GROUP BY col2
ORDER BY col2

The .distinct(..) is not necessary here: because of the GROUP BY, each col2 value appears in only one row of the result, and because we use COUNT(DISTINCT ..), each distinct value of col1 is counted only once per col2 value.
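The grouped query above can be tried outside Django as well. The following is only a minimal sketch using Python's built-in sqlite3 module, not the ORM itself: the table name some_table is taken from the SQL above, while the sample rows and the helper name count_distinct_col1_per_col2 are made up for illustration. The pair ('a', 1) is inserted twice on purpose, to show that COUNT(DISTINCT col1) counts it only once.

```python
import sqlite3

# Hypothetical sample rows as (col1, col2) pairs.
# ('a', 1) and ('b', 2) are duplicated on purpose.
ROWS = [
    ("a", 1), ("a", 1), ("b", 1), ("c", 1), ("d", 1),
    ("a", 2), ("b", 2), ("b", 2),
]


def count_distinct_col1_per_col2(rows):
    """Count the distinct col1 values for each col2 value.

    Runs the same GROUP BY / COUNT(DISTINCT ..) statement as the SQL
    shown in the answer, against an in-memory SQLite table.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE some_table (col1 TEXT, col2 INTEGER)")
        conn.executemany(
            "INSERT INTO some_table (col1, col2) VALUES (?, ?)", rows
        )
        cursor = conn.execute(
            "SELECT col2, COUNT(DISTINCT col1) AS count "
            "FROM some_table "
            "GROUP BY col2 "
            "ORDER BY col2"
        )
        # Shape the rows like the dictionaries produced by .values('col2').
        return [{"col2": col2, "count": count} for col2, count in cursor]
    finally:
        conn.close()


result = count_distinct_col1_per_col2(ROWS)
print(result)
```

With these sample rows there is one result row per col2 value, no matter how many (col1, col2) pairs are duplicated in the table, which is the shape the question asked for.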



Source: https://stackoverflow.com/questions/52907276/django-queryset-aggregation-count-counting-wrong-thing
