How do I find all rows of a PostgreSQL table that contain characters in some Unicode range, such as Cyrillic characters?
Figured it out! For Cyrillic:
SELECT * FROM "items" WHERE (title SIMILAR TO '%[\u0410-\u044f]%')
I got the range from http://symbolcodes.tlt.psu.edu/bylanguage/cyrillicchart.html. The characters have hex entities &#x0410; (А) to &#x044F; (я), which are the same numbers used in the pattern above.
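To sanity-check what that range covers before running it against the table, here is a small sketch in Python. Python is my choice for the check and is not part of the original answer, and `has_basic_cyrillic` is a hypothetical helper name. It uses the same code points as the SQL pattern, U+0410 through U+044F.

```python
import re

# Same code point range as the SQL pattern: U+0410 (А) through U+044F (я).
BASIC_CYRILLIC = re.compile("[\u0410-\u044f]")


def has_basic_cyrillic(text: str) -> bool:
    """Return True if text contains a character in U+0410..U+044F."""
    return BASIC_CYRILLIC.search(text) is not None


# The range endpoints are the letters А and я.
print(hex(ord("А")), hex(ord("я")))

print(has_basic_cyrillic("Привет"))  # letters inside the range
print(has_basic_cyrillic("hello"))   # Latin only
print(has_basic_cyrillic("ё"))       # U+0451 lies outside U+0410..U+044F
```

The last check shows that the range is the basic Russian alphabet only. Letters such as Ё (U+0401) and ё (U+0451) fall outside it, so if those matter, widening the range to the whole Cyrillic block, U+0400 through U+04FF, may be worth considering.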
If you install the pgpcre extension, you can use this expression:
SELECT * FROM items WHERE title ~ pcre '\p{Cyrillic}';