问题
I need to select data from table and sort them with ORDER BY clause. The problem is the column contains text data with czech diacritic. I cannot use COLLATE, because the DB is part of postgres cluster which was created with lc_collate = en_US.UTF-8 and I cannot afford downtime caused by recreating the cluster with correct lc_collate.
Sample data:
CREATE TABLE test (
id serial PRIMARY key,
name text
);
INSERT INTO test (name) VALUES ('Žoo'), ('Zoo'), ('ŽOO'), ('ZOO'),
('ŽoA'), ('ŽóA'), ('ŽoÁ'), ('ŽóÁ');
Ideal output:
SELECT * FROM test ORDER BY name COLLATE "cs_CZ.utf8";
id | name
----+------
2 | Zoo
4 | ZOO
5 | ŽoA
7 | ŽoÁ
6 | ŽóA
8 | ŽóÁ
1 | Žoo
3 | ŽOO
(8 rows)
Here I found kind of solution:
SELECT * FROM test ORDER BY name USING ~<~;
id | name
----+------
4 | ZOO
2 | Zoo
3 | ŽOO
5 | ŽoA
1 | Žoo
7 | ŽoÁ
6 | ŽóA
8 | ŽóÁ
(8 rows)
The result is close enough (for my usage) - the caroned letters are AFTER the non-caroned.
My slightly off-topic Postgresql anabasis with ~<~
operator
edit: turned into new question.
Back to the question: Is there other solution to get the ideal order besides recreating the postgres cluster with correct locale?
Also some insght on the ~<~
operator would be nice.
回答1:
Im not sure if I understand the question, because look like you already found a solution. The only think I can suggest is you can add a new field czechName
with the correct collate
http://www.postgresql.org/docs/current/static/sql-altertable.html
ADD [ COLUMN ] column_name data_type [ COLLATE collation ] [ column_constraint [ ... ] ]
回答2:
Similar to what Juan Carlos Oropeza suggests, you can try to change the collation of the column with:
ALTER TABLE test ALTER COLUMN "name" TYPE text COLLATE 'cs_CZ.utf8';
Reference: http://www.postgresql.org/docs/current/static/sql-altertable.html
回答3:
As @Igor pointed out in his comment, there is no need to recreate the postgres cluster with different lc_collate and deal with the caused downtime.
Exact steps that solved the problem were:
add/uncomment line
cs_CZ.UTF-8 UTF-8
in/etc/locale.gen
generate the new locale:
# locale-gen
define new collation in postgres:
CREATE COLLATION "cs_CZ.utf8" ( locale = 'cs_CZ.UTF-8' );
来源:https://stackoverflow.com/questions/35629454/order-by-with-diacritic-in-postgres