问题
I am new in Android and I\'m working on a query in SQLite. My problem is that when I use accent in strings e.g.
- ÁÁÁ
- ááá
- ÀÀÀ
- ààà
- aaa
- AAA
If I do:
SELECT * FROM TB_MOVIE WHERE MOVIE_NAME LIKE \'%a%\' ORDER BY MOVIE_NAME;
It\'s return:
- AAA
- aaa (It\'s ignoring the others)
But if I do:
SELECT * FROM TB_MOVIE WHERE MOVIE_NAME LIKE \'%à%\' ORDER BY MOVIE_NAME;
It\'s return:
- ààà (ignoring the title \"ÀÀÀ\")
I want to select strings in a SQLite DB without caring for the accents and the case. Please help.
回答1:
Generally, string comparisons in SQL are controlled by column or expression COLLATE
rules. In Android, only three collation sequences are pre-defined: BINARY (default), LOCALIZED and UNICODE. None of them is ideal for your use case, and the C API for installing new collation functions is unfortunately not exposed in the Java API.
To work around this:
- Add another column to your table, for example
MOVIE_NAME_ASCII
Store values into this column with the accent marks removed. You can remove accents by normalizing your strings to Unicode Normal Form D (NFD) and removing non-ASCII code points since NFD represents accented characters roughly as plain ASCII + combining accent markers:
String asciiName = Normalizer.normalize(unicodeName, Normalizer.Form.NFD) .replaceAll("[^\\p{ASCII}]", "");
Do your text searches on this ASCII-normalized column but display data from the original unicode column.
回答2:
In Android sqlite, LIKE
and GLOB
ignore both COLLATE LOCALIZED
and COLLATE UNICODE
(they only work for ORDER BY
). However, there is a solution without having to add extra columns to your table. As @asat explains in this answer, you can use GLOB
with a pattern that will replace each letter with all the available alternatives of that letter. In Java:
public static String addTildeOptions(String searchText) {
return searchText.toLowerCase()
.replaceAll("[aáàäâã]", "\\[aáàäâã\\]")
.replaceAll("[eéèëê]", "\\[eéèëê\\]")
.replaceAll("[iíìî]", "\\[iíìî\\]")
.replaceAll("[oóòöôõ]", "\\[oóòöôõ\\]")
.replaceAll("[uúùüû]", "\\[uúùüû\\]")
.replace("*", "[*]")
.replace("?", "[?]");
}
And then (not literally like this, of course):
SELECT * from table WHERE lower(column) GLOB "*addTildeOptions(searchText)*"
This way, for example in Spanish, a user searching for either mas or más will get the search converted into m[aáàäâã]s, returning both results.
It is important to notice that GLOB
ignores COLLATE NOCASE
, that's why I converted everything to lower case both in the function and in the query. Notice also that the lower()
function in sqlite doesn't work on non-ASCII characters - but again those are probably the ones that you are already replacing!
The function also replaces both GLOB
wildcards, *
and ?
, with "escaped" versions.
回答3:
You can use Android NDK to recompile the SQLite source including the desired ICU (International Components for Unicode). Explained in russian here: http://habrahabr.ru/post/122408/
The process of compiling the SQLilte with source with ICU explained here:
How to compile sqlite with ICU?
Unfortunately you will end up with different APKs for different CPUs.
回答4:
You need to look at these, not as accented characters, but as entirely different characters. You might as well be looking for a, b, or c. That being said, I would try using a regex for it. It would look something like:
SELECT * from TB_MOVIE WHERE MOVIE_NAME REGEXP '.*[aAàÀ].*' ORDER BY MOVIE_NAME;
来源:https://stackoverflow.com/questions/16282083/how-to-ignore-accent-in-sqlite-query-android