Python elasticsearch-dsl django pagination

长发绾君心 2021-01-02 04:35

How can I use Django pagination with elasticsearch-dsl? My code:

query = MultiMatch(query=q, fields=['title', 'body'], fuzziness='AUTO')

s = Search(usin         


        
4 answers
  • 2021-01-02 05:17

    A very simple solution is to use MultipleObjectMixin and extract your Elastic results in get_queryset() by overriding it. In this case Django will take care of the pagination itself if you add the paginate_by attribute.

    It should look like this:

    class MyView(MultipleObjectMixin, ListView):
        paginate_by = 10
    
        def get_queryset(self):
            # Query Elasticsearch here and collect the response data in `object_list`.
            # To add filters to the query, read them from self.request.GET.
            object_list = []
            return object_list
    

    Note: The code above is a generic outline and differs from my own case, so I cannot guarantee it works as-is. I used a similar solution, inheriting other mixins, overriding get_queryset(), and taking advantage of Django's built-in pagination, and it worked great for me. As it was an easy fix, I decided to post it here with a similar example.

  • 2021-01-02 05:19

    I found this paginator on this link:

    from django.core.paginator import Paginator, Page
    
    class DSEPaginator(Paginator):
        """
        Override Django's built-in Paginator class to take in a count/total number of items;
        Elasticsearch provides the total as a part of the query results, so we can minimize hits.
        """
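        @property
        def count(self):
            # Newer Django versions compute `count` as a cached property and ignore
            # `_count`, so override `count` itself. On Elasticsearch 7+ `hits.total`
            # is an object holding the number in `.value` rather than a plain integer.
            total = self.object_list.hits.total
            return getattr(total, 'value', total)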
    
        def page(self, number):
            # this is overridden to prevent any slicing of the object_list - Elasticsearch has
            # returned the sliced data already.
            number = self.validate_number(number)
            return Page(self.object_list, number, self)
    

    Then in the view I use:

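        q = request.GET.get('q', None)
        page = int(request.GET.get('page', '1'))
        # Slice with the same page size that the paginator uses below.
        start = (page - 1) * settings.POSTS_PER_PAGE
        end = start + settings.POSTS_PER_PAGE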
    
        query = MultiMatch(query=q, fields=['title', 'body'], fuzziness='AUTO')
        s = Search(using=elastic_client, index='post').query(query)[start:end]
        response = s.execute()
    
        paginator = DSEPaginator(response, settings.POSTS_PER_PAGE)
        try:
            posts = paginator.page(page)
        except PageNotAnInteger:
            posts = paginator.page(1)
        except EmptyPage:
            posts = paginator.page(paginator.num_pages)
    

    This way it works perfectly.
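
    The slice arithmetic used in the view can be isolated into a small helper. This is a minimal sketch in plain Python: `page_bounds` and `total_pages` are hypothetical names (not part of Django or elasticsearch-dsl), and treating page numbers below 1 as the first page is an assumption.

```python
import math


def page_bounds(page, per_page):
    """Return the (start, end) offsets for a 1-based page number."""
    page = max(1, page)  # assumption: anything below 1 means the first page
    start = (page - 1) * per_page
    return start, start + per_page


def total_pages(total_hits, per_page):
    """Number of pages needed to show `total_hits` results (at least one)."""
    return max(1, math.ceil(total_hits / per_page))


start, end = page_bounds(3, 10)
print(start, end)            # offsets to use as Search(...)[start:end]
print(total_pages(95, 10))   # page count for 95 hits at 10 per page
```

    Keeping the page size in one place avoids the slice and the paginator disagreeing about how many items a page holds.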

  • 2021-01-02 05:24

    Another way forward is to create a proxy between the Paginator and the Elasticsearch query. Paginator requires two things, __len__ (or count) and __getitem__ (that takes a slice). A rough version of the proxy works like this:

    class ResultsProxy(object):
        """
        A proxy object for returning Elasticsearch results that is able to be
        passed to a Paginator.
        """
    
        def __init__(self, es, index=None, body=None):
            self.es = es
            self.index = index
            self.body = body
    
        def __len__(self):
            result = self.es.count(index=self.index,
                                   body=self.body)
            return result['count']
    
        def __getitem__(self, item):
            assert isinstance(item, slice)
    
            results = self.es.search(
                index=self.index,
                body=self.body,
                from_=item.start,
                size=item.stop - item.start,
            )
    
            return results['hits']['hits']
    

    A proxy instance can be passed to Paginator and will make requests to ES as needed.
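
    Django's Paginator asks for `object_list[bottom:top]` with both bounds set, so the proxy works there, but an open-ended slice such as `proxy[:10]` would pass `from_=None` and fail on `item.stop - item.start`. A minimal sketch of a helper that turns any simple slice into `from_`/`size` values follows; `slice_to_from_size` is a hypothetical name, and the fallback of 10 hits for an open-ended slice is an assumption.

```python
def slice_to_from_size(item, default_size=10):
    """Translate a slice into (from_, size) values for a search request."""
    if not isinstance(item, slice):
        raise TypeError("only slices are supported")
    if item.step not in (None, 1):
        raise ValueError("stepped slices are not supported")
    start = item.start or 0
    if start < 0 or (item.stop is not None and item.stop < 0):
        raise ValueError("negative indexes are not supported")
    # Assumption: an open-ended slice falls back to `default_size` hits.
    size = default_size if item.stop is None else max(0, item.stop - start)
    return start, size


print(slice_to_from_size(slice(10, 20)))
print(slice_to_from_size(slice(None, 5)))
```

    Inside `__getitem__`, the returned pair could replace the direct use of `item.start` and `item.stop`.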

  • 2021-01-02 05:28

    Following the advice from Danielle Madeley, I also created a proxy to search results which works well with the latest version of django-elasticsearch-dsl==0.4.4.

    from django.utils.functional import LazyObject
    
    class SearchResults(LazyObject):
        def __init__(self, search_object):
            self._wrapped = search_object
    
        def __len__(self):
            return self._wrapped.count()
    
        def __getitem__(self, index):
            search_results = self._wrapped[index]
            if isinstance(index, slice):
                search_results = list(search_results)
            return search_results
    

    Then you can use it in your search view like this:

    from django.core.paginator import EmptyPage, PageNotAnInteger, Paginator

    paginate_by = 20
    search = MyModelDocument.search()
    # ... do some filtering ...
    search_results = SearchResults(search)
    
    paginator = Paginator(search_results, paginate_by)
    page_number = request.GET.get("page")
    try:
        page = paginator.page(page_number)
    except PageNotAnInteger:
        # If page parameter is not an integer, show first page.
        page = paginator.page(1)
    except EmptyPage:
        # If page parameter is out of range, show last existing page.
        page = paginator.page(paginator.num_pages)
    

    Django's LazyObject proxies all attributes and methods from the object assigned to the _wrapped attribute. I am overriding a couple of methods that are required by Django's paginator, but don't work out of the box with the Search() instances.
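
    To see why those two methods are enough, here is a self-contained sketch with no Django or Elasticsearch dependency. Everything in it is a stand-in: `FakeSearch` is a hypothetical in-memory replacement for a Search object (slicing returns another lazy object, `count()` returns the total), `SearchResultsSketch` repeats the two overrides with plain delegation instead of LazyObject, and `get_page` only imitates the `len()` call and the slice that a paginator performs.

```python
class FakeSearch:
    """Hypothetical stand-in for a Search object, backed by a list."""

    def __init__(self, docs, window=None):
        self._docs = docs
        self._window = window or slice(None)

    def count(self):
        return len(self._docs)

    def __getitem__(self, index):
        if isinstance(index, slice):
            return FakeSearch(self._docs, index)  # still lazy, nothing fetched
        return self._docs[index]

    def __iter__(self):
        return iter(self._docs[self._window])


class SearchResultsSketch:
    """The same two overrides as SearchResults, without LazyObject."""

    def __init__(self, search_object):
        self._wrapped = search_object

    def __len__(self):
        return self._wrapped.count()

    def __getitem__(self, index):
        results = self._wrapped[index]
        if isinstance(index, slice):
            results = list(results)  # materialize the lazy slice
        return results


def get_page(object_list, number, per_page):
    """Imitates a paginator: one len() call, then one slice."""
    bottom = (number - 1) * per_page
    top = min(bottom + per_page, len(object_list))
    return object_list[bottom:top]


results = SearchResultsSketch(FakeSearch([f"post-{i}" for i in range(45)]))
print(get_page(results, 3, 20))
```

    The last page holds only the five remaining items, because the upper bound is capped at the total reported by `__len__`.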
