PostgreSQL ORDER BY issue - natural sort

日久生厌 2020-12-01 16:38

I've got a Postgres ORDER BY issue with the following table:

em_code  name
EM001    AAA
EM999    BBB
EM1000   CCC

To insert a

7 Answers
  • 2020-12-01 16:50

    I wrote about this in detail in this related question:

    Humanized or natural number sorting of mixed word-and-number strings

    (I'm posting this answer as a useful cross-reference only, so it's community wiki).

  • 2020-12-01 16:51

    This always comes up in questions and in my own development, and I finally got tired of tricky ways of doing it. I broke down and implemented it as a PostgreSQL extension:

    https://github.com/Bjond/pg_natural_sort_order

    It's free to use, MIT license.

    Basically, it just normalizes the numbers within strings by zero-padding them, so that you can create an indexed column for full-speed natural sorting. The readme explains.

    The advantage is that a trigger can do the work instead of your application code. The value is computed on the PostgreSQL server, and migrations that add columns become simple and fast.

  • 2020-12-01 16:56

    The reason is that the string sorts alphabetically (instead of numerically, as you would want), so 1 sorts before 9. You could solve it like this:

    SELECT * FROM employees ORDER BY substring(em_code, 3)::int DESC
    

    It would be more efficient to drop the redundant 'EM' from your em_code, if you can, and store an integer to begin with.
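
    As a small self-contained check of the cast-based ordering above, here is a sketch that inlines the three rows from the question as a VALUES list, so it runs without an existing table. The em_number alias is made up for illustration, and the query sorts ascending rather than DESC to make the result easier to read.

    ```sql
    -- Rows from the question, inlined so no employees table is required.
    WITH employees(em_code, name) AS (
        VALUES ('EM001', 'AAA'), ('EM999', 'BBB'), ('EM1000', 'CCC')
    )
    SELECT em_code,
           name,
           substring(em_code, 3)::int AS em_number  -- drop 'EM', cast the rest
    FROM employees
    ORDER BY em_number;
    ```

    This returns EM001, EM999, EM1000 in that order. Ordering by em_code itself would put EM1000 before EM999.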


    Additional answer to a question in the comments

    To strip any and all non-digits from a string:

    SELECT regexp_replace(em_code, E'\\D','','g')
    FROM employees
    

    \D is the regular expression class shorthand for "non-digits".
    'g' as the 4th parameter is the "global" switch, which applies the replacement to every occurrence in the string, not just the first.

    So I replace every non-digit with the empty string, leaving only the digits from the string.
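
    To sort on the extracted digits, the result still has to be cast to a number. A minimal sketch, again with the question's rows inlined; the NULLIF guard and the bigint cast are additions that are not part of the original answer. They are there because casting an empty string (a code with no digits at all) to a number raises an error, and because long digit runs can exceed the int range.

    ```sql
    SELECT em_code,
           regexp_replace(em_code, E'\\D', '', 'g') AS digits_only
    FROM (VALUES ('EM1000'), ('EM001'), ('EM999')) AS employees(em_code)
    -- NULLIF turns '' into NULL so the cast cannot fail on digit-free codes.
    ORDER BY NULLIF(regexp_replace(em_code, E'\\D', '', 'g'), '')::bigint;
    ```

    Note that this concatenates all digit runs in the string, so a code such as 'A1B2' would sort as 12.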

  • 2020-12-01 16:57

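    You can use just this line: "ORDER BY length(substring(em_code FROM '[0-9]+')), em_code". It orders by the number of digits first and then by the text itself, which works as long as every code has the same prefix and leading zeros do not make a smaller number longer than a larger one.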
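
    A quick way to try this ordering on the question's data, sketched with the rows inlined in scrambled order:

    ```sql
    SELECT em_code
    FROM (VALUES ('EM1000'), ('EM999'), ('EM001')) AS employees(em_code)
    -- First key: how many digits the first digit run has; second key: the text.
    ORDER BY length(substring(em_code FROM '[0-9]+')), em_code;
    ```

    For these three rows the result is EM001, EM999, EM1000.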

  • 2020-12-01 17:08

    One approach you can take is to create a naturalsort function for this. Here's an example, written by Postgres legend RhodiumToad.

    create or replace function naturalsort(text)
        returns bytea language sql immutable strict as $f$
        select string_agg(convert_to(coalesce(r[2], length(length(r[1])::text) || length(r[1])::text || r[1]), 'SQL_ASCII'),'\x00')
        from regexp_matches($1, '0*([0-9]+)|([^0-9]+)', 'g') r;
    $f$;
    

    Source: http://www.rhodiumtoad.org.uk/junk/naturalsort.sql

    To use it, simply call the function in your ORDER BY:

    SELECT * FROM employees ORDER BY naturalsort(em_code) DESC
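
    Because the function is declared immutable, it can also back an expression index, so the sort key does not have to be recomputed for every row at query time. A sketch, assuming the employees table and the naturalsort function above already exist; the index name is hypothetical, and whether the planner actually uses the index depends on the query and the table statistics.

    ```sql
    -- Hypothetical index name; indexes the computed natural-sort key.
    CREATE INDEX employees_em_code_naturalsort_idx
        ON employees (naturalsort(em_code));

    -- The ORDER BY expression must match the indexed expression.
    SELECT * FROM employees ORDER BY naturalsort(em_code) DESC;
    ```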
    
  • 2020-12-01 17:10

    I came up with something slightly different.

    The basic idea is to create an array of tuples (integer, string) and then order by these. The magic number 2147483647 is the maximum value of a 32-bit signed integer, used so that strings are sorted after numbers.

      ORDER BY ARRAY(
        SELECT ROW(
          CAST(COALESCE(NULLIF(match[1], ''), '2147483647') AS INTEGER),
          match[2]
        )
        FROM REGEXP_MATCHES(col_to_sort_by, '(\d*)|(\D*)', 'g')
        AS match
      )
    