Improving OFFSET performance in PostgreSQL

Asked by 失恋的感觉, 2021-01-29 20:46

I have a table I'm doing an ORDER BY on before a LIMIT and OFFSET in order to paginate.

Adding an index on the ORDER BY column makes a massive difference to performance.
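
For illustration, here is a minimal JDBC sketch of this pattern; the items table, the created_at ordering column, and the connection settings are placeholders of my own, not from the question:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;

    public class OffsetPaginationExample {
        public static void main(String[] args) throws Exception {
            // Requires the PostgreSQL JDBC driver; connection settings are placeholders.
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://localhost:5432/mydb", "user", "password")) {

                // Index the ORDER BY column so the planner can walk the index
                // instead of sorting the whole table for every page.
                try (PreparedStatement ddl = conn.prepareStatement(
                        "CREATE INDEX IF NOT EXISTS idx_items_created_at ON items (created_at)")) {
                    ddl.execute();
                }

                int pageSize = 50;
                int page = 200; // deep pages still pay the cost of skipping OFFSET rows
                try (PreparedStatement ps = conn.prepareStatement(
                        "SELECT * FROM items ORDER BY created_at LIMIT ? OFFSET ?")) {
                    ps.setInt(1, pageSize);
                    ps.setInt(2, page * pageSize);
                    try (ResultSet rs = ps.executeQuery()) {
                        while (rs.next()) {
                            System.out.println(rs.getLong("id"));
                        }
                    }
                }
            }
        }
    }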

5 Answers
  •  谎友^, 2021-01-29 21:18

    I recently worked on a problem like this and wrote a blog post about how I approached it; I hope it is helpful to someone. I use a lazy-list approach with partial acquisition, replacing the LIMIT and OFFSET pagination of the query with manual pagination. In my example, the select returns 10 million records; I fetch them and insert them into a "temporary table":

    -- Load the full result set once into tmp_table, numbering each row with a
    -- temporary sequence so later pages can be read by range instead of OFFSET.
    -- (tmp_table is assumed to already exist; its first column holds counterrow.)
    create or replace function load_records ()
    returns void as $$
    BEGIN
      drop sequence if exists temp_seq;
      create temp sequence temp_seq;
      insert into tmp_table
      SELECT linea.*
      FROM
      (
        -- the sequence value becomes the counterrow column used for pagination
        select nextval('temp_seq') as counterrow, *
        from table1 t1
        join table2 t2 on (t2.fieldpk = t1.fieldpk)
        join table3 t3 on (t3.fieldpk = t2.fieldpk)
      ) linea;
    END;
    $$ language plpgsql;
    

    After that, I can paginate without counting rows, using the assigned sequence instead:

    select * from tmp_table where counterrow >= 9000000 and counterrow <= 9025000
    

    From the Java side, I implemented this pagination through partial acquisition with a lazy list, that is, a list that extends AbstractList and implements the get() method. The get method uses a data access interface to fetch the next chunk of data and release the previous one from the heap:

    @Override
    public E get(int index) {
      // When the requested index falls outside the chunk currently held in memory,
      // drop that chunk and ask the DAO for the next one.
      if (bufferParcial.size() <= (index - lastIndexRoulette)) {
        lastIndexRoulette = index;
        bufferParcial = new ArrayList<>();
        bufferParcial.addAll(daoInterface.getBufferParcial());
        if (bufferParcial.isEmpty()) {
          return null;
        }
      }
      // Translate the absolute index into an offset within the current chunk.
      return bufferParcial.get(index - lastIndexRoulette);
    }

    On the other hand, the data access interface uses a query like the one above to paginate, and implements a method that iterates progressively in chunks of 25,000 records until the whole set is consumed; a possible shape of that method is sketched below.
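
    This is only a sketch of that data access method, assuming a plain JDBC connection: the getBufferParcial() name, the 25,000-record chunk size, and the tmp_table/counterrow query come from the answer, while the class name, cursor field, and row mapping are illustrative.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class PartialBufferDao {

        // Chunk size used in the answer: 25,000 records per fetch.
        private static final int CHUNK_SIZE = 25000;

        private final Connection connection;
        private long nextCounterrow = 1; // next sequence value to read from tmp_table

        public PartialBufferDao(Connection connection) {
            this.connection = connection;
        }

        // Returns the next chunk of up to 25,000 rows from tmp_table, keyed on the
        // counterrow value assigned by the temp sequence. An empty list signals
        // that the whole result set has been consumed.
        public List<String> getBufferParcial() throws SQLException {
            List<String> chunk = new ArrayList<>();
            String sql = "SELECT * FROM tmp_table "
                       + "WHERE counterrow >= ? AND counterrow < ? "
                       + "ORDER BY counterrow";
            try (PreparedStatement ps = connection.prepareStatement(sql)) {
                ps.setLong(1, nextCounterrow);
                ps.setLong(2, nextCounterrow + CHUNK_SIZE);
                try (ResultSet rs = ps.executeQuery()) {
                    while (rs.next()) {
                        // In real code, map the row to whatever element type the lazy
                        // list holds; a single column is read here for brevity.
                        chunk.add(rs.getString(2));
                    }
                }
            }
            nextCounterrow += CHUNK_SIZE;
            return chunk;
        }
    }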

    Results for this approach can be seen here: http://www.arquitecturaysoftware.co/2013/10/laboratorio-1-iterar-millones-de.html
