PostgreSQL ignores dashes when ordering

Posted by 你离开我真会死。 on 2019-12-22 09:09:59

Question


I have a PostgreSQL 8.4 database that is created with the da_DK.utf8 locale.

dbname=> show lc_collate;
 lc_collate
------------
 da_DK.utf8
(1 row)

When I select from a table and order by a character varying column, I get what seems to me strange behaviour: PostgreSQL ignores a dash that prefixes the value when ordering the result, e.g.:

 select name from mytable order by name asc;

May return something like

 name
 ----------------
 Ad...
 Ae...
 Ag...
 - Ak....
 At....

The dash prefix seems to be ignored.

I can fix this issue by converting the column to latin1 when ordering:

 select name from mytable order by convert_to(name, 'latin1') asc;

Then I get the expected result:

 name
 ----------------
 - Ak....
 Ad...
 Ae...
 Ag...
 At....

Why does the dash prefix get ignored by default? Can that behavior be changed?


Answer 1:


This is because the da_DK.utf8 locale defines it this way. Locale-aware Linux utilities, such as sort, behave the same way.

Your convert_to(name, 'latin1') will break if it encounters a character that is not in the Latin-1 character set, so it isn't a good workaround.

You can use order by convert_to(name, 'SQL_ASCII') instead, which ignores the locale-defined sort order and simply compares byte values.
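As a rough sketch of how this looks in a full query, reusing mytable and name from the question: the first statement should work on 8.4, while the second relies on the per-expression COLLATE clause, which only exists from PostgreSQL 9.1 onward and is therefore an assumption that does not hold for the 8.4 server in the question.

```sql
-- Byte-wise ordering via bytea.
-- convert_to() returns bytea, which is compared byte by byte.
SELECT name
FROM mytable
ORDER BY convert_to(name, 'SQL_ASCII') ASC;

-- Assumption: PostgreSQL 9.1 or later (not the 8.4 server from the question).
-- The "C" collation compares by byte values instead of the locale rules.
SELECT name
FROM mytable
ORDER BY name COLLATE "C" ASC;
```

Either way the dash (byte 0x2D) sorts before all ASCII letters. The trade-off is that byte order is case-sensitive (uppercase before lowercase) and no longer follows Danish alphabetical order for æ, ø and å.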


Ugly hack edit:

order by
  (
    ascii(name) between ascii('a') and ascii('z')
    or ascii(name) between ascii('A') and ascii('Z')
    or ascii(name)>127
  ),
  name;

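This sorts anything that starts with a non-letter ASCII character first, because the boolean expression is false for those rows and false sorts before true. It is very ugly, because characters further into the string are still sorted by the locale rules and can behave strangely, but it may be good enough for you.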




Answer 2:


A workaround that will work in my specific case is to replace dashes with exclamation points. I happen to know that I will never get exclamation points in the data, and they are sorted before any letters or digits.

select name from mytable order by translate(name, '-', '!') asc

It will certainly affect performance so I may look into creating a special column for sorting but I really don't like that either...
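One way to address the performance concern without a separate column might be an expression index. This is only a sketch: the index name mytable_name_sort_idx is hypothetical, and whether the planner actually uses the index depends on the table size and the rest of the query.

```sql
-- Hypothetical index name; mytable and name come from the question.
-- translate() is immutable, so it can be used in an expression index.
CREATE INDEX mytable_name_sort_idx
    ON mytable (translate(name, '-', '!'));

-- The ORDER BY expression must match the indexed expression exactly
-- for the planner to be able to consider the index.
SELECT name
FROM mytable
ORDER BY translate(name, '-', '!') ASC;
```

Running the query under EXPLAIN shows whether the index is picked up or a sort step is still performed.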




Answer 3:


I don't know what the ordering rules are for Danish, but in Polish, special characters such as spaces and dashes are not "counted" when sorting in most dictionaries. Some good sort routines do the same and ignore such special characters. Danish probably has a similar rule, and that rule is implemented by the locale-aware sort function on Ubuntu.



Source: https://stackoverflow.com/questions/4955386/postgresql-ignores-dashes-when-ordering
