Alphanumeric case-insensitive sorting in PostgreSQL

青春惊慌失措 2020-12-17 09:33

I am new to PostgreSQL and want to sort varchar columns. Let me explain the problem with the example below:

table name: testsorting

   order             


        
6 Answers
  • 2020-12-17 10:11

    For my part, I used the PostgreSQL module citext and the data type CITEXT instead of TEXT. This makes both sorting and searching on these columns case-insensitive.

    The module can be installed with the SQL command CREATE EXTENSION IF NOT EXISTS citext;
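A minimal sketch of the citext approach described above. The table name testsorting_ci is hypothetical and only serves as a demo; it is not part of the original question.

```sql
-- Requires the citext extension (part of PostgreSQL's contrib modules).
CREATE EXTENSION IF NOT EXISTS citext;

-- Hypothetical demo table; the name testsorting_ci is an assumption.
CREATE TABLE testsorting_ci (name citext);

INSERT INTO testsorting_ci (name)
VALUES ('b'), ('B'), ('a'), ('A'), ('a1'), ('a11'), ('a2');

-- Case is ignored when comparing, so 'a'/'A' and 'b'/'B' are ties.
SELECT name FROM testsorting_ci ORDER BY name;

-- Case-insensitive search: matches both 'a' and 'A'.
SELECT name FROM testsorting_ci WHERE name = 'A';
```

Note that citext only removes case from the comparison. Digits are still compared character by character, so 'a11' will typically sort before 'a2'; numeric ordering needs one of the other approaches in this thread.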

  • 2020-12-17 10:13

    This answer is strongly inspired by this one.
    Wrapping the logic in a function keeps things clean if you need it in several queries.

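    -- Returns a sort key built from the leading non-digit prefix and the
    -- first run of digits. Names without digits get -1, so they sort before
    -- numbered names with the same prefix. The 2000000 offset pads the
    -- number to a fixed width of seven digits.
    CREATE OR REPLACE FUNCTION alphanum(str anyelement)
       RETURNS anyelement AS $$
    BEGIN
       RETURN (SUBSTRING(str, '^[^0-9]*'),
          COALESCE(SUBSTRING(str, '[0-9]+')::INT, -1) + 2000000);
    END;
    $$ LANGUAGE plpgsql IMMUTABLE;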
    

    Then you could use it this way:

    SELECT name FROM testsorting ORDER BY alphanum(name);
    

    Test:

    WITH x(name) AS (VALUES ('b'), ('B'), ('a'), ('a1'),
       ('a11'), ('a2'), ('a20'), ('A'), ('a19'))
    SELECT name, alphanum(name) FROM x ORDER BY alphanum(name);
    
     name |  alphanum   
    ------+-------------
     a    | (a,1999999)
     A    | (A,1999999)
     a1   | (a,2000001)
     a2   | (a,2000002)
     a11  | (a,2000011)
     a19  | (a,2000019)
     a20  | (a,2000020)
     b    | (b,1999999)
     B    | (B,1999999)
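
In the test above, whether 'a' and 'A' end up next to each other appears to depend on the database collation, since the prefix keeps its original case. A possible variant, sketched here as an assumption rather than as part of the original answer, folds the case explicitly and compares the digits as a number:

```sql
SELECT name
FROM testsorting
ORDER BY
    lower(substring(name from '^[^0-9]*')),            -- prefix, case folded
    COALESCE(substring(name from '[0-9]+')::int, -1),  -- first digit run as a number
    name;                                              -- tie-breaker for 'a' vs 'A'
```

Like the function, this only looks at the first run of digits, so names with several numeric parts would need a more elaborate key.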
    
  • 2020-12-17 10:15

    If the name always consists of one letter followed by n digits, then:

    select name
    from testsorting
    order by
        upper(left(name, 1)),
        (substring(name from 2) || '0')::integer
    
  • 2020-12-17 10:32

    My PostgreSQL sorts the way you want. How PostgreSQL compares strings is determined by the locale and collation. When you create a database with createdb, the -l option sets the locale. You can also check how your environment is configured with psql -l:

    [postgres@test]$ psql -l
    List of databases
     Name    |  Owner   | Encoding |  Collate   |   Ctype    |   Access privileges
    ---------+----------+----------+------------+------------+-----------------------
     mn_test | postgres | UTF8     | pl_PL.UTF8 | pl_PL.UTF8 |
    

    As you can see, my database uses the Polish collation.
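
The same information can also be read from inside a SQL session. A small sketch using the pg_database system catalog:

```sql
-- Collation and character classification settings of every database,
-- i.e. the Collate and Ctype columns that psql -l prints.
SELECT datname, datcollate, datctype
FROM pg_database
ORDER BY datname;
```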

    If your database was created with a different collation, you can specify another collation in the query, like this:

    SELECT * FROM sort_test ORDER BY name COLLATE "C";
    SELECT * FROM sort_test ORDER BY name COLLATE "default";
    SELECT * FROM sort_test ORDER BY name COLLATE "pl_PL";
    

    You can list the available collations with:

    SELECT * FROM pg_collation;
    

    EDITED:

    Oh, I missed that 'a11' must be before 'a2'.

    I don't think a standard collation can solve alphanumeric sorting. For that you have to split the string into parts, as in Clodoaldo Neto's answer. Another option, useful if you frequently order this way, is to split the name field into two columns. You can create a trigger on INSERT and UPDATE that splits name into name_1 and name_2, and then:

    SELECT name FROM sort_test ORDER BY name_1 COLLATE "en_US", name_2;
    

    (I changed the collation from Polish to English; you should use your native collation to sort letters like aącć etc.)
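
The trigger mentioned above could look roughly like the following sketch. The column types (text and integer), the function name and the trigger name are assumptions; only sort_test, name, name_1 and name_2 come from the answer.

```sql
-- Assumed column types: name_1 holds the letters, name_2 the number.
ALTER TABLE sort_test
    ADD COLUMN name_1 text,
    ADD COLUMN name_2 integer;

-- Hypothetical trigger function that splits name on every write.
CREATE OR REPLACE FUNCTION sort_test_split_name() RETURNS trigger AS $$
BEGIN
    -- Leading non-digit part, e.g. 'a' from 'a11'.
    NEW.name_1 := substring(NEW.name from '^[^0-9]*');
    -- First run of digits as an integer; NULL when the name has no digits.
    NEW.name_2 := substring(NEW.name from '[0-9]+')::integer;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sort_test_split_name_trg
    BEFORE INSERT OR UPDATE ON sort_test
    FOR EACH ROW EXECUTE PROCEDURE sort_test_split_name();

-- Backfill existing rows by firing the trigger once for each of them.
UPDATE sort_test SET name = name;
```

Because name_2 is NULL for names without digits, and NULLs sort last in ascending order by default, adding NULLS FIRST after name_2 in the ORDER BY would place 'a' before 'a1'.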

  • 2020-12-17 10:36

    PostgreSQL uses the C library's locale facilities for sorting strings. The C library is provided by the host operating system. On Mac OS X or a BSD-family operating system, the UTF-8 locale definitions are broken, and hence the results are the same as with collation "C".

    Image attached for collation results with Ubuntu 15.04 as the host OS.

    Check the FAQ on the PostgreSQL wiki for more details: https://wiki.postgresql.org/wiki/FAQ
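
A quick way to check how a given system behaves is to sort the same small set of values under both collations. This is only an illustrative sketch; the values are made up for the test.

```sql
-- Byte-order comparison: uppercase ASCII letters sort before lowercase ones.
SELECT name
FROM (VALUES ('a'), ('B'), ('A'), ('b')) AS t(name)
ORDER BY name COLLATE "C";
-- A, B, a, b

-- The database's own collation; the result depends on the OS locale data.
SELECT name
FROM (VALUES ('a'), ('B'), ('A'), ('b')) AS t(name)
ORDER BY name COLLATE "default";
```

If both queries return the same order, the database is effectively sorting as with collation "C".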

  • 2020-12-17 10:36

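    I agree with Clodoaldo Neto's answer, but don't forget to add an index as well. The index expressions should match the ORDER BY expressions exactly so that the planner can use the index, and the cast expression needs its own pair of parentheses:

    CREATE INDEX testsorting_name ON testsorting (upper(left(name, 1)), ((substring(name from 2) || '0')::integer));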
    