I have two models like this:
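from django.db import models

class User(models.Model):
    email = models.EmailField()

class Report(models.Model):
    # nullable, so a Report can exist without a joining User;
    # on_delete is required from Django 2.0 (CASCADE was the old default)
    user = models.ForeignKey(User, null=True, on_delete=models.CASCADE)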
Report.objects.filter(user__isnull=False).distinct()
This uses an INNER JOIN (and then redundantly checks that User.id is not null).
Report.objects.filter(user__isnull=True)
This makes a LEFT OUTER JOIN, then checks that User.id is null.
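As a rough illustration of the two join types described above, here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The table names app_user and app_report and the sample rows are assumptions made up for the example, and the SQL is hand-written to mirror the description, not copied from Django's output.

```python
import sqlite3

# In-memory database with hypothetical tables standing in for User and Report
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE app_report (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO app_user (id, email) VALUES (1, 'a@example.com');
    INSERT INTO app_report (id, user_id) VALUES (10, 1), (11, NULL), (12, 1);
""")

# Rows WITH a joining row: the INNER JOIN already drops unmatched reports,
# so the IS NOT NULL check is redundant
with_user = [row[0] for row in conn.execute("""
    SELECT r.id FROM app_report r
    INNER JOIN app_user u ON r.user_id = u.id
    WHERE u.id IS NOT NULL
    ORDER BY r.id
""")]

# Rows WITHOUT a joining row: the LEFT OUTER JOIN keeps unmatched reports
# with NULLs in the user columns, and IS NULL picks exactly those
without_user = [row[0] for row in conn.execute("""
    SELECT r.id FROM app_report r
    LEFT OUTER JOIN app_user u ON r.user_id = u.id
    WHERE u.id IS NULL
    ORDER BY r.id
""")]

print(with_user)     # [10, 12]
print(without_user)  # [11]
```

To see the SQL Django actually generates for a queryset, print(queryset.query) is the usual way to check.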
Queries based on joins are usually quicker than subqueries, so this tends to be quicker than the options available from Django 3 onwards for finding rows without a joining row:
from django.db.models import Exists, OuterRef

Report.objects.filter(~Exists(User.objects.filter(report=OuterRef('pk'))))
This creates a WHERE NOT EXISTS (SELECT .. FROM User..), so it involves a potentially large intermediate result set (thanks @Tomasz Gandor).
For Django < 3, where filter() can't be passed subqueries, annotate with the subquery and then filter on the annotation. This also uses a subquery, so it is slower too:
Report.objects.annotate(
    no_users=~Exists(User.objects.filter(report=OuterRef('pk')))
).filter(no_users=True)
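The join-based and subquery-based approaches should select the same rows and differ only in how the database gets there. A minimal sqlite3 sketch of that equivalence follows. The table names and sample data are hypothetical, and the SQL is hand-written instead of taken from Django. It says nothing about speed, which depends on the database and its query planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE app_report (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO app_user (id, email) VALUES (1, 'a@example.com'), (2, 'b@example.com');
    INSERT INTO app_report (id, user_id) VALUES (10, 1), (11, NULL), (12, 2), (13, NULL);
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

# Join-based: keep every report, then pick those with no matching user
join_version = ids("""
    SELECT r.id FROM app_report r
    LEFT OUTER JOIN app_user u ON r.user_id = u.id
    WHERE u.id IS NULL
    ORDER BY r.id
""")

# Subquery-based: a correlated NOT EXISTS evaluated against each report
exists_version = ids("""
    SELECT r.id FROM app_report r
    WHERE NOT EXISTS (SELECT 1 FROM app_user u WHERE u.id = r.user_id)
    ORDER BY r.id
""")

print(join_version)                    # [11, 13]
print(join_version == exists_version)  # True
```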
The isnull lookup can be combined with subqueries. In this example, a Textbook has a number of Versions (ie, version has textbook_id), and a version has a number of Pages (ie, page has version_id). The subquery gets the latest version of each textbook that has pages associated:
from django.db.models import OuterRef, Subquery

subquery = (
    Version.objects
    .filter(
        # OuterRef joins to Version.textbook in outer query below
        textbook=OuterRef('textbook'),
        # excludes rows with no joined Page records
        page__isnull=False)
    # ordered so [:1] below gets highest (ie, latest) version number
    .order_by('-number').distinct()
)
# Only the Version.ids of the latest versions that have pages returned by the subquery
books = Version.objects.filter(pk=Subquery(subquery.values('pk')[:1])).distinct()
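To make the intent of that query concrete, here is a minimal sqlite3 sketch of a correlated subquery of the same shape. The table names, column names, and sample rows are assumptions for illustration, and the SQL is hand-written, so Django's generated query may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE textbook (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE version (id INTEGER PRIMARY KEY, textbook_id INTEGER, number INTEGER);
    CREATE TABLE page (id INTEGER PRIMARY KEY, version_id INTEGER);
    INSERT INTO textbook (id, title) VALUES (1, 'Algebra'), (2, 'Biology');
    INSERT INTO version (id, textbook_id, number) VALUES
        (101, 1, 1), (102, 1, 2), (103, 1, 3), (201, 2, 1);
    INSERT INTO page (id, version_id) VALUES (1, 101), (2, 102), (3, 102);
""")
# Algebra: versions 1 and 2 have pages, version 3 (the newest) has none.
# Biology: its only version has no pages at all.

latest_with_pages = [row[0] for row in conn.execute("""
    SELECT v_out.id FROM version AS v_out
    WHERE v_out.id = (
        -- correlated on textbook_id, like OuterRef('textbook')
        SELECT v_in.id FROM version AS v_in
        -- the INNER JOIN drops versions with no pages, like page__isnull=False
        INNER JOIN page ON page.version_id = v_in.id
        WHERE v_in.textbook_id = v_out.textbook_id
        -- highest version number first, like order_by('-number')[:1]
        ORDER BY v_in.number DESC
        LIMIT 1
    )
    ORDER BY v_out.id
""")]

print(latest_with_pages)  # [102]
```

Version 103 is skipped because it has no pages, and Biology contributes nothing because its subquery returns no row.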
To return rows that have a join to one or both of two tables, use Q objects (Page and TextMarkup both have nullable foreign keys joining to File):
from django.db.models import Q
File.objects.filter(Q(page__isnull=False) | Q(textmarkup__isnull=False)).distinct()
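The distinct() call matters here because joining to two child tables can return the same File more than once. A minimal sqlite3 sketch of the effect, with hypothetical table names and data and hand-written SQL that only approximates what the ORM would produce:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE file (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE page (id INTEGER PRIMARY KEY, file_id INTEGER);
    CREATE TABLE textmarkup (id INTEGER PRIMARY KEY, file_id INTEGER);
    INSERT INTO file (id, name) VALUES (1, 'a.png'), (2, 'b.png'), (3, 'c.png');
    INSERT INTO page (id, file_id) VALUES (1, 1), (2, 1), (3, NULL);
    INSERT INTO textmarkup (id, file_id) VALUES (1, 1), (2, 2);
""")

# Files joined to a page OR a textmarkup row (or both)
query = """
    SELECT {distinct} file.id FROM file
    LEFT OUTER JOIN page ON page.file_id = file.id
    LEFT OUTER JOIN textmarkup ON textmarkup.file_id = file.id
    WHERE page.id IS NOT NULL OR textmarkup.id IS NOT NULL
    ORDER BY file.id
"""

# File 1 has two pages, so without DISTINCT it appears twice
with_duplicates = [row[0] for row in conn.execute(query.format(distinct=""))]
deduplicated = [row[0] for row in conn.execute(query.format(distinct="DISTINCT"))]

print(with_duplicates)  # [1, 1, 2]
print(deduplicated)     # [1, 2]
```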