Django: Duplicated logic between properties and queryset annotations

后端 未结 5 965
我寻月下人不归
我寻月下人不归 2020-12-31 10:48

When I want to define my business logic, I\'m struggling finding the right way to do this, because I often both need a property AND a custom queryset to get the same info. I

5条回答
  •  执念已碎
    2020-12-31 11:43

    TL;DR

    • Do you need to filter the "annotated field" results?

      • If Yes, "Keep" the manager and use it when required. In any other situation, use property logic
      • If No, remove the manager/annotation process and stick with property implementation, unless your table is small (~1000 entries) and not growing over the period.
    • The only advantage of annotation process I am seeing here is the filtering capability on the database level of the data


    I have conducted a few tests to reach the conclusion, here they are

    Environment

    • Django 3.0.7
    • Python 3.8
    • PostgreSQL 10.14

    Model Structure

    For the sake of simplicity and simulation, I am following the below model representation

    class ReporterManager(models.Manager):
        def article_count_qs(self):
            return self.get_queryset().annotate(
                annotate_article_count=models.Count('articles__id', distinct=True))
    
    
    class Reporter(models.Model):
        objects = models.Manager()
        counter_manager = ReporterManager()
        name = models.CharField(max_length=30)
    
        @property
        def article_count(self):
            return self.articles.distinct().count()
    
        def __str__(self):
            return self.name
    
    
    class Article(models.Model):
        headline = models.CharField(max_length=100)
        reporter = models.ForeignKey(Reporter, on_delete=models.CASCADE,
                                     related_name="articles")
    
        def __str__(self):
            return self.headline

    I have populated my database, both Reporter and Article model with random strings.

    • Reporter rows ~220K (220514)
    • Article rows ~1M (997311)

    Test Cases

    1. Random picking of Reporter instance and retrieves the article count. We usually do this in the Detail View
    2. A paginated result. We slice the queryset and iterates over the sliced queryset.
    3. Filtering

    I am using the %timeit-(ipython doc) command of Ipython shell to calculate the execution time

    Test Case 1

    For this, I have created these functions, which randomly pick instances from the database

    import random
    
    MAX_REPORTER = 220514
    
    
    def test_manager_random_picking():
        pos = random.randint(1, MAX_REPORTER)
        return Reporter.counter_manager.article_count_qs()[pos].annotate_article_count
    
    
    def test_property_random_picking():
        pos = random.randint(1, MAX_REPORTER)
        return Reporter.objects.all()[pos].article_count

    Results

    In [2]: %timeit test_manager_random_picking()
    8.78 s ± 6.1 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    In [3]: %timeit test_property_random_picking()
    6.36 ms ± 221 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

    Test Case 2

    I have created another two functions,

    import random
    
    PAGINATE_SIZE = 50
    
    
    def test_manager_paginate_iteration():
        start = random.randint(1, MAX_REPORTER - PAGINATE_SIZE)
        end = start + PAGINATE_SIZE
        qs = Reporter.counter_manager.article_count_qs()[start:end]
        for reporter in qs:
            reporter.annotate_article_count
    
    
    def test_property_paginate_iteration():
        start = random.randint(1, MAX_REPORTER - PAGINATE_SIZE)
        end = start + PAGINATE_SIZE
        qs = Reporter.objects.all()[start:end]
        for reporter in qs:
            reporter.article_count

    Results

    In [8]: %timeit test_manager_paginate_iteration()
    4.99 s ± 312 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    
    In [9]: %timeit test_property_paginate_iteration()
    47 ms ± 1.16 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

    Test Case 3

    undoubtedly, annotation is the only way here

    Here you can see, the annotation process takes a huge amount of time as compared to the property implementation.

提交回复
热议问题