Annotating a Django queryset with a left outer join?

予麋鹿 2020-12-14 01:06

Say I have a model:

class Foo(models.Model):
    ...

and another model that basically gives per-user information about Foo:

10 Answers
  • 2020-12-14 01:40

    maparent's comment put me on the right track:

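    from django.db.models.sql.datastructures import Join
    
    # Mark every join in the query as nullable so it can be promoted.
    for alias in qs.query.alias_map.values():
        if isinstance(alias, Join):
            alias.nullable = True
    
    # Promote the joins to LEFT OUTER JOINs (relies on internal query APIs).
    qs.query.promote_joins(qs.query.tables)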
    
  • 2020-12-14 01:44

    This answer might not be exactly what you are looking for, but since this is the first result on Google when searching for "django annotate outer join", I will post it here.

    Note: tested on Django 1.7.

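    Suppose you have the following models:

    class User(models.Model):
        name = models.CharField(max_length=100)
    
    class EarnedPoints(models.Model):
        points = models.PositiveIntegerField()
        # related_name matches the "earned_points__points" lookup used below.
        user = models.ForeignKey(User, related_name="earned_points", on_delete=models.CASCADE)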
    

    To get each user's total points, you might do something like this:

     User.objects.annotate(points=Sum("earned_points__points"))
    

    This works, but in my testing it did not return users who have no points. What we need here is an outer join, without any direct hacks or raw SQL.

    You can achieve that by doing this:

     users_with_points = User.objects.annotate(points=Sum("earned_points__points"))
     result = users_with_points | User.objects.exclude(pk__in=users_with_points)
    

    This will be translated into a LEFT OUTER JOIN and all users will be returned. Users who have no points will have None in their points attribute.
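
    To make the None behaviour concrete, here is a minimal sketch at the SQL level using Python's built-in sqlite3 module rather than Django. The table names app_user and app_earnedpoints and the sample rows are assumptions made up for illustration; they are not what Django would necessarily generate for the models above.

```python
import sqlite3

# In-memory database with hypothetical tables mirroring User and EarnedPoints.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE app_earnedpoints (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES app_user(id),
        points INTEGER
    );
    INSERT INTO app_user (id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO app_earnedpoints (user_id, points) VALUES (1, 10), (1, 5);
""")

# LEFT OUTER JOIN keeps users without points; their SUM is NULL (None in Python).
rows = conn.execute("""
    SELECT u.name, SUM(p.points) AS points
    FROM app_user AS u
    LEFT OUTER JOIN app_earnedpoints AS p ON p.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

# COALESCE turns the NULL into 0 when a numeric total is preferred.
totals = conn.execute("""
    SELECT u.name, COALESCE(SUM(p.points), 0) AS points
    FROM app_user AS u
    LEFT OUTER JOIN app_earnedpoints AS p ON p.user_id = u.id
    GROUP BY u.id
    ORDER BY u.id
""").fetchall()

print(rows)
print(totals)
```

    Here bob has no rows in the points table, so he still appears in both results: with None in the first query and with 0 in the second.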

    Hope that helps

  • 2020-12-14 01:45

    You could do this using simonw's django-queryset-transform to avoid hard-coding a raw SQL query - the code would look something like this:

    def userfoo_retriever(qs):
        # Key by the related Foo's id so the lookup below matches on Foo.pk.
        userfoos = dict((uf.foo_id, uf) for uf in UserFoo.objects.filter(foo__in=qs))
        for foo in qs:
            foo.userfoo = userfoos.get(foo.pk, None)
    
    for foo in Foo.objects.filter(…).transform(userfoo_retriever):
        print(foo.userfoo)
    

    This approach has worked well for this need and for efficiently retrieving M2M values. Your query count won't be quite as low, but on certain databases (cough MySQL cough) two simpler queries can often be faster than one with complex JOINs. Many of the cases where I've most needed it also had additional complexity that would have been even harder to hack into an ORM expression.
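
    The two-query pattern itself does not depend on Django. A minimal sketch with plain Python objects standing in for model instances follows; the Foo and UserFoo dataclasses, the attach_userfoos helper, and the sample data are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UserFoo:
    pk: int
    foo_id: int
    comment: str


@dataclass
class Foo:
    pk: int
    userfoo: Optional[UserFoo] = None


def attach_userfoos(foos, userfoos):
    """Attach the matching UserFoo (or None) to each Foo, in memory."""
    # Index the second query's results by the Foo they point at.
    by_foo_id = {uf.foo_id: uf for uf in userfoos}
    for foo in foos:
        # Foos without a match get None, like the NULL side of an outer join.
        foo.userfoo = by_foo_id.get(foo.pk)
    return foos


# Stand-ins for the results of the two separate queries.
foos = [Foo(pk=1), Foo(pk=2), Foo(pk=3)]
userfoos = [
    UserFoo(pk=10, foo_id=1, comment="mine"),
    UserFoo(pk=11, foo_id=3, comment="also mine"),
]

attach_userfoos(foos, userfoos)
for foo in foos:
    print(foo.pk, foo.userfoo)
```

    The important detail is that the dictionary is keyed by foo_id, not by the UserFoo's own primary key, so the lookup by Foo.pk finds the right row.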

  • 2020-12-14 01:46

    A solution with raw might look like this:

    foos = Foo.objects.raw("SELECT foo.* FROM foo LEFT OUTER JOIN userfoo ON (foo.id = userfoo.foo_id AND userfoo.user_id = %s)", [request.user.id])
    

    You'll need to modify the SELECT to include extra fields from userfoo which will be annotated to the resulting Foo instances in the queryset.
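
    As a rough illustration of what that SQL returns once an extra column is selected, here is a sketch run against an in-memory SQLite database with Python's sqlite3 module instead of Django. The simplified table layout, the comment column, and the sample rows are assumptions; real Django table names normally carry an app prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # allow access to columns by name
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY);
    CREATE TABLE userfoo (
        id INTEGER PRIMARY KEY,
        foo_id INTEGER REFERENCES foo(id),
        user_id INTEGER,
        comment TEXT
    );
    INSERT INTO foo (id) VALUES (1), (2), (3);
    INSERT INTO userfoo (foo_id, user_id, comment) VALUES
        (1, 1, 'mine'),
        (2, 2, 'theirs');
""")

current_user_id = 1  # stands in for request.user.id

# The user condition lives in the ON clause, so every foo row survives.
rows = conn.execute(
    """
    SELECT foo.id AS foo_id, userfoo.comment AS user_comment
    FROM foo
    LEFT OUTER JOIN userfoo
        ON (foo.id = userfoo.foo_id AND userfoo.user_id = ?)
    ORDER BY foo.id
    """,
    (current_user_id,),
).fetchall()

result = [(row["foo_id"], row["user_comment"]) for row in rows]
print(result)
```

    Foo 2 only has a row belonging to another user, so it comes back with None for user_comment instead of being dropped.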

  • 2020-12-14 01:47

    The only way I see to do this without using raw etc. is something like this:

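    from django.db.models import Case, Q, When

    # The always-true Q pair forces a LEFT OUTER JOIN onto UserFoo.
    Foo.objects.filter(
        Q(userfoo_set__isnull=True) | Q(userfoo_set__isnull=False)
    ).annotate(bar=Case(
        # Only take bar from the current user's row; other rows yield NULL.
        When(userfoo_set__user=request.user, then='userfoo_set__bar')
    ))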
    

    The double Q trick ensures that you get your left outer join.

    Unfortunately, you can't put the request.user condition in filter(): it would discard Foo rows whose only joined UserFoo rows belong to other users, and those are rows you wanted to keep. That is why the condition ideally belongs in the join's ON clause rather than in the WHERE clause.

    Because you can't filter out the rows that have an unwanted user value, you have to select rows from UserFoo with a CASE.

    Note also that one Foo may join to many UserFoo records, so you may want to consider some way to retrieve distinct Foos from the output.
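
    The difference between filtering in WHERE and selecting with CASE can be seen directly in SQL. The sketch below uses Python's sqlite3 module with hypothetical foo and userfoo tables and made-up rows; it approximates the idea behind the ORM expression above and is not the SQL Django actually emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE foo (id INTEGER PRIMARY KEY);
    CREATE TABLE userfoo (
        id INTEGER PRIMARY KEY,
        foo_id INTEGER REFERENCES foo(id),
        user_id INTEGER,
        bar TEXT
    );
    INSERT INTO foo (id) VALUES (1), (2), (3);
    INSERT INTO userfoo (foo_id, user_id, bar) VALUES
        (1, 1, 'mine'),
        (1, 2, 'theirs'),
        (2, 2, 'theirs too');
""")

user_id = 1  # stands in for the current user's id
join = "FROM foo LEFT OUTER JOIN userfoo ON userfoo.foo_id = foo.id"

# Filtering in WHERE throws away foos 2 and 3 entirely.
where_rows = conn.execute(
    f"SELECT foo.id, userfoo.bar {join} WHERE userfoo.user_id = ? ORDER BY foo.id",
    (user_id,),
).fetchall()

# CASE keeps every foo but yields one row per joined userfoo (foo 1 twice).
case_rows = conn.execute(
    f"SELECT foo.id, CASE WHEN userfoo.user_id = ? THEN userfoo.bar END "
    f"{join} ORDER BY foo.id, userfoo.id",
    (user_id,),
).fetchall()

# Grouping collapses the duplicates; MAX ignores the NULLs from other users.
grouped_rows = conn.execute(
    f"SELECT foo.id, MAX(CASE WHEN userfoo.user_id = ? THEN userfoo.bar END) "
    f"{join} GROUP BY foo.id ORDER BY foo.id",
    (user_id,),
).fetchall()

print(where_rows)
print(case_rows)
print(grouped_rows)
```

    Grouping by foo.id is just one way to get distinct Foos back; which aggregate is appropriate depends on what the selected column holds.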

  • 2020-12-14 01:49

    Notice: This method does not work in Django 1.6+. As explained in tcarobruce's comment below, the promote argument was removed as part of ticket #19849: ORM Cleanup.


    Django doesn't provide an entirely built-in way to do this, but it's not necessary to construct an entirely raw query. (This method doesn't work for selecting * from UserFoo, so I'm using .comment as an example field to include from UserFoo.)

    The QuerySet.extra() method allows us to add terms to the SELECT and WHERE clauses of our query. We use this to include the fields from UserFoo table in our results, and limit our UserFoo matches to the current user.

    results = Foo.objects.extra(
        select={"user_comment": "UserFoo.comment"},
        where=["(UserFoo.user_id IS NULL OR UserFoo.user_id = %s)"],
        params=[request.user.id]
    )
    

    This query still needs the UserFoo table. It would be possible to use .extra(tables=...) to get an implicit INNER JOIN, but for an OUTER JOIN we need to modify the internal query object ourselves.

    connection = (
        UserFoo._meta.db_table, User._meta.db_table,  # JOIN these tables
        "user_id",              "id",                 # on these fields
    )
    
    results.query.join(  # modify the query
        connection,      # with this table connection
        promote=True,    # as LEFT OUTER JOIN
    )
    

    We can now evaluate the results. Each instance will have a .user_comment property containing the value from UserFoo, or None if it doesn't exist.

    print results[0].user_comment
    

    (Credit to this blog post by Colin Copeland for showing me how to do OUTER JOINs.)
