Django Q Queries & on the same field?

Submitted by 孤者浪人 on 2019-12-13 16:15:04

Question


So here are my models:

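from django.contrib.auth.models import AbstractUser
from django.db import models

class User(AbstractUser):  # defined first so that Event can reference it
    email = models.CharField(max_length=50, null=False, blank=False, unique=True)

class Event(models.Model):
    # on_delete is required since Django 2.0; CASCADE was the implicit default before
    user = models.ForeignKey(User, blank=True, null=True, db_index=True, on_delete=models.CASCADE)
    name = models.CharField(max_length=200, db_index=True)
    platform = models.CharField(choices=(("ios", "ios"), ("android", "android")), max_length=50)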

Event is like an analytics event, so it's very possible that I could have multiple events for one user, some with platform=ios and some with platform=android, if a user has logged in on multiple devices. I want to query to see how many users have both ios and android devices. So I wrote a query like this:

User.objects.filter(Q(event__platform="ios") & Q(event__platform="android")).count()

Which returns 0 results. I know this isn't correct. I then thought I would try to just query for iOS users:

User.objects.filter(Q(event__platform="ios")).count()

Which returned 6,717,622 results, which is unexpected because I only have 39,294 users. I'm guessing it's not counting the Users, but counting the Event instances, which seems like incorrect behavior to me. Does anyone have any insights into this problem?


Answer 1:


You can use annotations instead:

from django.db.models import Count

User.objects.all().annotate(events_count=Count('event')).filter(events_count=2)

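This keeps only the users that have exactly two events. Note that it counts events regardless of platform, so it matches the question's goal only if every user has at most one event per platform.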
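To make the difference between counting events and counting platforms concrete, here is a minimal sketch using the standard-library sqlite3 module rather than the ORM. The table layout, the sample rows, and the hand-written SQL are assumptions made for illustration; they only approximate what Django would generate. In the ORM, the second query would roughly correspond to annotating with Count('event__platform', distinct=True).

```python
import sqlite3

# Hypothetical in-memory tables standing in for the User and Event models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, user_id INTEGER, platform TEXT);
    INSERT INTO user (id, email) VALUES
        (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO event (user_id, platform) VALUES
        (1, 'ios'), (1, 'ios'), (1, 'android'),   -- user 1: both platforms
        (2, 'ios'), (2, 'ios'),                   -- user 2: two events, ios only
        (3, 'android');                           -- user 3: android only
""")

# Counting all events: selects users with exactly two events of any platform.
by_event_count = [row[0] for row in conn.execute("""
    SELECT user.id FROM user
    JOIN event ON event.user_id = user.id
    GROUP BY user.id
    HAVING COUNT(event.id) = 2
    ORDER BY user.id
""")]

# Counting distinct platforms: selects users seen on both platforms.
by_platform_count = [row[0] for row in conn.execute("""
    SELECT user.id FROM user
    JOIN event ON event.user_id = user.id
    WHERE event.platform IN ('ios', 'android')
    GROUP BY user.id
    HAVING COUNT(DISTINCT event.platform) = 2
    ORDER BY user.id
""")]

print(by_event_count)     # [2]
print(by_platform_count)  # [1]
```

With this sample data the event count picks the ios-only user 2, while the distinct platform count picks user 1, the only user seen on both platforms.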

You can also use chained filters:

User.objects.filter(event__platform='android').filter(event__platform='ios')

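The first filter selects all users that have an android event, and the second narrows those to users that also have an ios event. Each filter() call on a multi-valued relation adds its own join, so the two conditions can be satisfied by different events. The joins return one row per matching pair of events, so add .distinct() before calling .count().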
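The behaviour in the question (0 results for the combined Q objects, and a count far larger than the number of users) can be reproduced with plain SQL. The sketch below uses the standard-library sqlite3 module; the tables, rows, and hand-written queries are assumptions for illustration and only approximate the SQL the ORM emits.

```python
import sqlite3

# Hypothetical in-memory tables standing in for the User and Event models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, user_id INTEGER, platform TEXT);
    INSERT INTO user (id, email) VALUES
        (1, 'a@example.com'), (2, 'b@example.com'), (3, 'c@example.com');
    INSERT INTO event (user_id, platform) VALUES
        (1, 'ios'), (1, 'ios'), (1, 'android'),
        (2, 'ios'),
        (3, 'android');
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

# One join, both conditions in one filter(): a single event row would have to
# be 'ios' and 'android' at the same time, which never happens.
single_join = scalar("""
    SELECT COUNT(*) FROM user
    JOIN event ON event.user_id = user.id
    WHERE event.platform = 'ios' AND event.platform = 'android'
""")

# One join, ios only: COUNT(*) counts joined rows (one per event), not users.
ios_rows = scalar("""
    SELECT COUNT(*) FROM user
    JOIN event ON event.user_id = user.id
    WHERE event.platform = 'ios'
""")
ios_users = scalar("""
    SELECT COUNT(DISTINCT user.id) FROM user
    JOIN event ON event.user_id = user.id
    WHERE event.platform = 'ios'
""")

# Two joins, as with chained filter() calls: each condition gets its own
# event row, so a user with events on both platforms matches.
both_users = scalar("""
    SELECT COUNT(DISTINCT user.id) FROM user
    JOIN event e1 ON e1.user_id = user.id
    JOIN event e2 ON e2.user_id = user.id
    WHERE e1.platform = 'android' AND e2.platform = 'ios'
""")

print(single_join, ios_rows, ios_users, both_users)  # 0 3 2 1
```

The gap between ios_rows and ios_users is the same effect as the 6,717,622 versus 39,294 discrepancy in the question: the join yields one row per event, and only a distinct count collapses it back to users.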




Answer 2:


This is a general answer for querysets with two or more conditions on related child objects.

Solution: A simple solution with two subqueries is possible, even without any join:

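from django.db.models import Q

# Distinct user ids taken from Event; order_by() clears any default ordering.
base_subq = Event.objects.values('user_id').order_by().distinct()
user_qs = User.objects.filter(
    Q(pk__in=base_subq.filter(platform="android")) &
    Q(pk__in=base_subq.filter(platform="ios"))
)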

The .order_by() call is important if the Event model has a default ordering, because ordering fields are added to the selected columns and would interfere with distinct() (see the docs for the distinct() method).


Notes:

Verify the single SQL query that will be executed (simplified by removing the "app_" prefix):

>>> print(str(user_qs.query))
SELECT user.id, user.email FROM user WHERE (
    user.id IN (SELECT DISTINCT U0.user_id FROM event U0 WHERE U0.platform = 'android')
    AND
    user.id IN (SELECT DISTINCT U0.user_id FROM event U0 WHERE U0.platform = 'ios')
)
  • The function Q() is used because the same keyword argument (pk__in) cannot be repeated in one filter() call, but chained filters could be used instead: .filter(...).filter(...). (The order of the filter conditions is not important; the database's query optimizer decides the execution order anyway.)
  • The temporary variable base_subq is an "alias" queryset that only avoids repeating the shared part of the expression. It is never evaluated on its own.
  • One join between User (parent) and Event (child) would not be a problem, and a solution with one subquery is also possible, but a join of Event with Event (a join with a repeated child object or with two child objects) should always be avoided by using a subquery. Two subqueries are also good for readability, because they show the symmetry of the two filter conditions.

Another solution, with two nested subqueries: This asymmetric solution can be faster if we know that one subquery (the one we put innermost) has a much more restrictive filter than the other subquery, which would return a huge result set (for example, if the number of Android users is huge).

ios_user_ids = (Event.objects.filter(platform="ios")
                .values('user_id').order_by().distinct())
user_ids = (Event.objects.filter(platform="android", user_id__in=ios_user_ids)
            .values('user_id').order_by().distinct())
user_qs = User.objects.filter(pk__in=user_ids)

Verify how it is compiled to SQL (simplified again by removing the "app_" prefix and the double quotes):

>>> print(str(user_qs.query))
SELECT user.id, user.email FROM user 
WHERE user.id IN (
    SELECT DISTINCT V0.user_id FROM event V0
    WHERE V0.platform = 'android' AND V0.user_id IN (
        SELECT DISTINCT U0.user_id FROM event U0
        WHERE U0.platform = 'ios'
    )
)
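As a quick sanity check that the symmetric form and the nested form select the same users, both kinds of statement can be run against a small in-memory SQLite database. This is only a sketch: the tables and sample rows are assumptions for illustration, and the SQL is written by hand in the shape of the simplified output above, not produced by Django.

```python
import sqlite3

# Hypothetical in-memory tables standing in for the User and Event models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE event (id INTEGER PRIMARY KEY, user_id INTEGER, platform TEXT);
    INSERT INTO user (id, email) VALUES
        (1, 'a@example.com'), (2, 'b@example.com'),
        (3, 'c@example.com'), (4, 'd@example.com');
    INSERT INTO event (user_id, platform) VALUES
        (1, 'ios'), (1, 'android'), (1, 'android'),
        (2, 'ios'),
        (3, 'android'),
        (4, 'android'), (4, 'ios'),
        (NULL, 'ios');   -- the foreign key is nullable in the question's model
""")

# Symmetric form: two independent IN subqueries combined with AND.
symmetric = [row[0] for row in conn.execute("""
    SELECT user.id FROM user WHERE (
        user.id IN (SELECT DISTINCT U0.user_id FROM event U0 WHERE U0.platform = 'android')
        AND
        user.id IN (SELECT DISTINCT U0.user_id FROM event U0 WHERE U0.platform = 'ios')
    )
    ORDER BY user.id
""")]

# Nested form: the ios subquery is innermost and restricts the android one.
nested = [row[0] for row in conn.execute("""
    SELECT user.id FROM user
    WHERE user.id IN (
        SELECT DISTINCT V0.user_id FROM event V0
        WHERE V0.platform = 'android' AND V0.user_id IN (
            SELECT DISTINCT U0.user_id FROM event U0
            WHERE U0.platform = 'ios'
        )
    )
    ORDER BY user.id
""")]

print(symmetric)  # [1, 4]
print(nested)     # [1, 4]
```

Both forms return each matching user once, so no extra distinct() is needed on the outer query. Which one runs faster depends on the data and the database, so it is worth measuring on the real tables.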

(These solutions also work in old Django versions, e.g. 1.8. A dedicated Subquery() expression has existed since Django 1.11 for more complicated cases, but it is not needed for this simple question.)



Source: https://stackoverflow.com/questions/55400746/django-q-queries-on-the-same-field
