Django Aggregate Query Include Zero Count

Submitted by 末鹿安然 on 2021-02-11 16:11:27

Question


In my Django application, I'm trying to get a Count of the Papers submitted by each Student, including students who have submitted NO papers (represented as count=0).

models.py

from django.db import models

class Student(models.Model):
    idstudent = models.AutoField(primary_key=True)
    student_name = models.CharField(max_length=250, null=False, blank=False, verbose_name='Student Name')

class Paper(models.Model):
    idpaper = models.AutoField(primary_key=True)
    student = models.ForeignKey(Student, on_delete=models.PROTECT, null=False, blank=False)

Query Attempt 1: Returns only Students who have submitted Papers

from django.db.models import Count, F

papers = Paper.objects.order_by('submission_date')
result = papers.values('student', student_name=F('student__student_name')).annotate(count=Count('student')).distinct().order_by('-count')
print(result)

<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 4}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 2}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}]>

Query Attempt 2: Returns Students who have submitted 0 Papers, but the Count for every other Student is 1

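from django.db.models import Count, Q

# student_name is already a model field, so it is selected directly;
# re-annotating it with F() under the same name raises a ValueError.
result = (
    Student.objects.values('pk', 'student_name')
    .annotate(
        count=Count(
            'pk',
            filter=Q(pk__in=Paper.objects.values('student')),
        )
    )
    .order_by('-count')
)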
print(result)

<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 1}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 1}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}, {'idstudent': 4, 'student_name': '\nDoug Funny\n', 'count': 0}, {'idstudent': 5, 'student_name': '\nSkeeter Valentine\n', 'count': 0}]>

Along the same lines as Attempt 2, I also tried the following using Sum(Case(...)), which yielded the same result. I had noticed that the raw SQL for Attempt 2 actually uses CASE WHEN, but it seems to only count when Student.pk is present in the Paper.objects.values "list" (while not accounting for how many times it is present).

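from django.db.models import Case, IntegerField, Sum, When

# student_name is selected directly; annotating it with F() under the
# same name as the model field raises a ValueError.
result = Student.objects.values('pk', 'student_name').annotate(
    count=Sum(
        Case(
            When(pk__in=Paper.objects.values('student'), then=1),
            default=0, output_field=IntegerField()
        )
    )
)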

<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 1}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 1}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}, {'idstudent': 4, 'student_name': '\nDoug Funny\n', 'count': 0}, {'idstudent': 5, 'student_name': '\nSkeeter Valentine\n', 'count': 0}]>

How might I adjust my query to include students who have submitted 0 papers while also maintaining the correct counts for students who have?


Answer 1:


From the question: "[...] seems to only count when Student.pk is present in the Paper.objects.values 'list' (while not accounting for how many times it is present)."

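Perhaps I'm misunderstanding the question, but your Attempt 2 restricts the count to students whose pk appears in the Paper.objects.values "list". Each Student row is tested once, so it contributes either 1 or 0 no matter how many papers that student has. Isn't this the expected behavior?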
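
The behavior of Attempt 2 can be reproduced in plain SQL. The sketch below uses Python's built-in sqlite3 module with hypothetical table and column names (student, paper, student_id) and sample data chosen to match the counts in the question. The SQL only approximates what the ORM emits; it is not copied from Django's output.

```python
import sqlite3

# In-memory database with assumed table names and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        idstudent INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL
    );
    CREATE TABLE paper (
        idpaper INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student (idstudent)
    );
    INSERT INTO student (idstudent, student_name) VALUES
        (1, 'Michael Jordan'), (2, 'Steve White'), (3, 'Hillary Clinton'),
        (4, 'Doug Funny'), (5, 'Skeeter Valentine');
    INSERT INTO paper (student_id) VALUES (1), (1), (1), (1), (2), (2), (3);
""")

# No join is involved: every group holds exactly one student row, and the
# IN test is a yes/no check, so the count can never exceed 1.
attempt2_rows = conn.execute("""
    SELECT s.idstudent,
           COUNT(CASE WHEN s.idstudent IN (SELECT p.student_id FROM paper p)
                      THEN s.idstudent END) AS count
    FROM student s
    GROUP BY s.idstudent
    ORDER BY s.idstudent
""").fetchall()

print(attempt2_rows)
```

Because the paper table is consulted only inside the IN test, the number of matching papers never reaches the aggregate.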

Have you tried the simple version:

Student.objects.annotate(num_papers=Count('paper'))
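
To see why counting across the reverse relation keeps students with no papers, here is a rough SQL equivalent run through Python's built-in sqlite3 module. Table and column names (student, paper, student_id) and the sample data are assumptions for illustration, and the query is an approximation of what the ORM generates for this annotation, not its literal output.

```python
import sqlite3

# In-memory database with assumed table names and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        idstudent INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL
    );
    CREATE TABLE paper (
        idpaper INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student (idstudent)
    );
    INSERT INTO student (idstudent, student_name) VALUES
        (1, 'Michael Jordan'), (2, 'Steve White'), (3, 'Hillary Clinton'),
        (4, 'Doug Funny'), (5, 'Skeeter Valentine');
    INSERT INTO paper (student_id) VALUES (1), (1), (1), (1), (2), (2), (3);
""")

# The LEFT OUTER JOIN keeps students without papers as a single row whose
# paper columns are NULL; COUNT(p.idpaper) ignores NULLs, giving 0 for them.
left_join_rows = conn.execute("""
    SELECT s.idstudent, s.student_name, COUNT(p.idpaper) AS num_papers
    FROM student s
    LEFT OUTER JOIN paper p ON p.student_id = s.idstudent
    GROUP BY s.idstudent, s.student_name
    ORDER BY num_papers DESC, s.idstudent
""").fetchall()

for row in left_join_rows:
    print(row)
```

On the Django side, chaining .order_by('-num_papers') onto the annotated queryset should give the same descending order as in the question's attempts.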

If you want to apply an additional filter to the count, my suggestion is to use a subquery. Here is an example:

from django.db.models import Count, OuterRef, Subquery

Student.objects.annotate(
    num_papers=Subquery(
        Paper.objects.filter(student=OuterRef('pk'))
            # The first .values() call defines our GROUP BY clause.
            # It's important to filter on every field listed here;
            # otherwise the subquery returns more than one row per group!
            # In this example we group only by student,
            # and we have already filtered by student.
            # Any extra filtering should also be done here (before the grouping).
            .values('student')
            # Count how many rows there are per group.
            .annotate(cnt=Count('pk'))
            # Return only the count.
            .values('cnt')
    )
)
# Note: for a student with no papers the subquery matches no rows,
# so num_papers is None rather than 0.
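
One caveat with the subquery form is what happens for students with no papers. The sketch below mimics the correlated subquery in plain SQL using Python's built-in sqlite3 module; table and column names (student, paper, student_id) and the sample data are assumptions for illustration, and the SQL approximates rather than reproduces the ORM's output.

```python
import sqlite3

# In-memory database with assumed table names and sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (
        idstudent INTEGER PRIMARY KEY,
        student_name TEXT NOT NULL
    );
    CREATE TABLE paper (
        idpaper INTEGER PRIMARY KEY,
        student_id INTEGER NOT NULL REFERENCES student (idstudent)
    );
    INSERT INTO student (idstudent, student_name) VALUES
        (1, 'Michael Jordan'), (2, 'Steve White'), (3, 'Hillary Clinton'),
        (4, 'Doug Funny'), (5, 'Skeeter Valentine');
    INSERT INTO paper (student_id) VALUES (1), (1), (1), (1), (2), (2), (3);
""")

# Grouped, correlated subquery: with no matching papers there is no group,
# so the scalar subquery evaluates to NULL (None in Python).
subquery_sql = """
    (SELECT COUNT(p.idpaper)
     FROM paper p
     WHERE p.student_id = s.idstudent
     GROUP BY p.student_id)
"""

raw_rows = conn.execute(
    f"SELECT s.idstudent, {subquery_sql} AS num_papers "
    "FROM student s ORDER BY s.idstudent"
).fetchall()

# Wrapping the subquery in COALESCE turns the missing counts into 0.
coalesced_rows = conn.execute(
    f"SELECT s.idstudent, COALESCE({subquery_sql}, 0) AS num_papers "
    "FROM student s ORDER BY s.idstudent"
).fetchall()

print(raw_rows)
print(coalesced_rows)
```

In Django, the corresponding step would presumably be to wrap the Subquery in Coalesce (from django.db.models.functions) with 0 as the fallback value.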


Source: https://stackoverflow.com/questions/62317457/django-aggregate-query-include-zero-count
