accent-insensitive

Regex for accent insensitive replacement in python

ⅰ亾dé卋堺 submitted on 2020-11-28 07:43:23
Question: In Python 3, I'd like to be able to use re.sub() in an "accent-insensitive" way, as we can do with the re.I flag for case-insensitive substitution. It could be something like a re.IGNOREACCENTS flag: original_text = "¿It's 80°C, I'm drinking a café in a cafe with Chloë。" accent_regex = r'a café' re.sub(accent_regex, 'X', original_text, flags=re.IGNOREACCENTS) This would lead to "¿It's 80°C, I'm drinking X in X with Chloë。" (note that there's still an accent on "Chloë") instead of "¿It's 80°C, I
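The standard `re` module has no `re.IGNOREACCENTS` flag, so the behaviour asked for above has to be emulated. The following is only a minimal sketch under stated assumptions: the helper name `accent_insensitive_sub` is hypothetical, the search term is treated as a literal string rather than a full regex, and "accent" is taken to mean the combining marks in the range U+0300 to U+036F. It decomposes the text (NFD), lets every pattern character be followed by optional combining marks, and recomposes the result (NFC).

```python
import re
import unicodedata

# Zero or more combining diacritical marks (assumed range U+0300..U+036F).
COMBINING = r"[\u0300-\u036f]*"


def accent_insensitive_sub(literal, repl, text, flags=0):
    """Replace `literal` in `text`, ignoring accents on matched characters.

    Hypothetical helper: `literal` is a plain string, not a regex.
    """
    # Remove accents from the search term itself.
    base = "".join(
        ch
        for ch in unicodedata.normalize("NFD", literal)
        if not unicodedata.combining(ch)
    )
    # Allow each base character to carry any number of combining marks.
    pattern = "".join(re.escape(ch) + COMBINING for ch in base)
    # Match against the decomposed text, then recompose what is left.
    decomposed = unicodedata.normalize("NFD", text)
    result = re.sub(pattern, repl, decomposed, flags=flags)
    return unicodedata.normalize("NFC", result)


original_text = "¿It's 80°C, I'm drinking a café in a cafe with Chloë。"
print(accent_insensitive_sub("a café", "X", original_text))
```

With the sample sentence from the question, both "a café" and "a cafe" are replaced while "Chloë" keeps its accent, because only the matched span is touched. One side effect to be aware of: the returned string is NFC-normalized, which may differ from the input if the input was not already in NFC.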

SOLR and accented characters

蹲街弑〆低调 submitted on 2019-12-25 09:05:14
Question: I have an index for occupations (identifier + occupation): <field name="occ_id" type="int" indexed="true" stored="true" required="true" /> <field name="occ_tx_name" type="text_es" indexed="true" stored="true" multiValued="false" /> <!-- Spanish --> <fieldType name="text_es" class="solr.TextField" positionIncrementGap="100"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" ignoreCase="true" words=

regex to also match accented characters

只愿长相守 submitted on 2019-12-22 08:11:04
Question: I have the following PHP code: $search = "foo bar que"; $search_string = str_replace(" ", "|", $search); $text = "This is my foo text with qué and other accented characters."; $text = preg_replace("/$search_string/i", "<b>$0</b>", $text); echo $text; Obviously, "que" does not match "qué". How can I change that? Is there a way to make preg_replace ignore all accents? The characters that have to match (Spanish): á,Á,é,É,í,Í,ó,Ó,ú,Ú,ñ,Ñ I don't want to replace all accented characters before

MongoDB diacriticInSensitive search not showing all accented (words with diacritic mark) rows as expected and vice-versa

与世无争的帅哥 submitted on 2019-12-19 09:20:33
Question: I have a document collection with the following structure: uid, name. It has an index: db.Collection.createIndex({name: "text"}). It contains the following data: 1, iphone; 2, iphóne; 3, iphonë; 4, iphónë. When I do a text search for iphone, I get only two records, which is unexpected. Actual output: 1, iphone; 2, iphóne. If I search for iphonë with db.Collection.find( { $text: { $search: "iphonë"} } ); I get: 3, iphonë; 4, iphónë. But actually I am expecting the following

MySQL REGEXP query - accent insensitive search

旧街凉风 submitted on 2019-12-18 05:52:48
Question: I'm looking to query a database of wine names, many of which contain accents (but not in a uniform way, so similar wines may be entered with or without accents). The basic query looks like this: SELECT * FROM `table` WHERE `wine_name` REGEXP '[[:<:]]Faugères[[:>:]]' which will return entries with 'Faugères' in the title, but not 'Faugeres'. SELECT * FROM `table` WHERE `wine_name` REGEXP '[[:<:]]Faugeres[[:>:]]' does the opposite. I had thought something like: SELECT * FROM `table` WHERE

How to search string using Entity Framework with .Contains and with accent-insensitive

喜夏-厌秋 submitted on 2019-12-12 15:33:50
Question: In my database, I have a table that stores cities. Some cities have accents, like "Foz do Iguaçu". In my MVC application, I have a JSON that returns a list of cities based on a word; however, some users don't use accents when searching for a city, for example "Foz do Iguacu". In my database I have "Foz do Igua Ç u", but users search for "Foz do Igua C u". How can I search records in my table, ignoring accents? Here is my code: using (ServiciliEntities db = new ServiciliEntities()) { List
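The idea behind an accent-insensitive "contains" can be illustrated outside the database. The sketch below is in Python and is not Entity Framework code: it works on an in-memory list, the names `fold` and `search_cities` are hypothetical, and the extra city names are made up for the example. It folds both the stored value and the search term by decomposing them (NFD), dropping combining marks, and case-folding before the substring test.

```python
import unicodedata


def fold(s):
    """Return `s` without combining marks and case-folded (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def search_cities(term, names):
    """Return the names that contain `term`, ignoring accents and case."""
    needle = fold(term)
    return [name for name in names if needle in fold(name)]


# Illustrative data only; in the question the cities live in a database table.
cities = ["Foz do Iguaçu", "São Paulo", "Curitiba"]
print(search_cities("Foz do Iguacu", cities))
```

Here "ç" decomposes into "c" plus a combining cedilla, so the unaccented search term finds the accented city. In a real database the comparison would normally be pushed to the server, for example through an accent-insensitive collation, rather than done in application memory.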

How do I perform an accent insensitive compare in SQL Server for 1250 codepage

北战南征 submitted on 2019-12-11 04:37:57
Question: There are already several questions and solutions on accent-insensitive search on Stack Overflow, but none of them works for codepage 1250 (Central European and Eastern European languages).
How do I perform an accent insensitive compare (e with è, é, ê and ë) in SQL Server?
LINQ Where Ignore Accentuation and Case
Ignoring accents in SQL Server using LINQ to SQL
Modify search to make it Accent Insensitive in SQL Server
Questions about accent insensitivity in SQL Server (Latin1_General_CI_AS)
The

regex to also match accented characters

為{幸葍}努か submitted on 2019-12-05 14:52:47
I have the following PHP code: $search = "foo bar que"; $search_string = str_replace(" ", "|", $search); $text = "This is my foo text with qué and other accented characters."; $text = preg_replace("/$search_string/i", "<b>$0</b>", $text); echo $text; Obviously, "que" does not match "qué". How can I change that? Is there a way to make preg_replace ignore all accents? The characters that have to match (Spanish): á,Á,é,É,í,Í,ó,Ó,ú,Ú,ñ,Ñ I don't want to replace all accented characters before applying the regex, because the characters in the text should stay the same: "This is my foo text with qué
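One way to meet the requirement that the text itself stays unchanged is to widen the pattern instead of stripping accents from the text. The sketch below shows the idea in Python rather than PHP; the function name `highlight` and the `ACCENT_CLASSES` table are hypothetical, and the table covers only the Spanish characters listed in the question. Each plain vowel (and "n") in a search word is replaced by a character class that also accepts its accented form, and case is handled by `re.IGNORECASE`.

```python
import re

# Plain letter -> character class that also accepts the accented variant.
# Assumption: only the Spanish characters named in the question are covered.
ACCENT_CLASSES = {
    "a": "[aá]",
    "e": "[eé]",
    "i": "[ií]",
    "o": "[oó]",
    "u": "[uú]",
    "n": "[nñ]",
}


def highlight(search, text):
    """Wrap every occurrence of any search word in <b>...</b>."""
    words = search.split()
    if not words:
        return text
    alternatives = [
        "".join(ACCENT_CLASSES.get(ch.lower(), re.escape(ch)) for ch in word)
        for word in words
    ]
    pattern = re.compile("|".join(alternatives), re.IGNORECASE)
    # m.group(0) is the text as written, so its accents are preserved.
    return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", text)


text = "This is my foo text with qué and other accented characters."
print(highlight("foo bar que", text))
```

For the sample sentence this wraps "foo" and "qué" while leaving the accent on "qué" intact. The same construction should carry over to `preg_replace` with the `u` modifier, though that is not tested here. Note that an accented letter typed in the search term matches only itself in this sketch.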