diacritics

Printing accented characters in Python 2.7

落爺英雄遲暮 submitted on 2019-12-01 17:19:34
I'm new to Python. I'm trying to print accented characters, like this:

    # -*- coding: utf-8 -*-
    print 'éàÇÃãéèï'

But when I execute this code, I get:

    >> ├®├á├ç├â├ú├®├¿├»

I'm using 64-bit Windows 7 and Python 2.7.5. I have the code in file.py and execute it with python file.py.

As Wooble mentioned, if you change

    print 'éàÇÃãéèï'

to

    print u'éàÇÃãéèï'

it should work. Here is a good intro to Unicode in Python (both for 2.x and 3): The updated guide to unicode

Source: https://stackoverflow.com/questions/18445655/printing-accented-characters-in-python-2-7
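The garbled output is consistent with the UTF-8 bytes of the source file being shown by a console that uses a different code page. The sketch below (written for Python 3, not the 2.7 interpreter from the question) reproduces the effect. The function name simulate_console is hypothetical, and code page 850 is an assumption: the question does not name the console encoding, but cp850 yields exactly the characters shown.

```python
# Python 3 sketch: reproduce the mojibake from the question.
# Assumption: the Windows console uses code page 850 (not stated in the question).

def simulate_console(text, source_encoding="utf-8", console_encoding="cp850"):
    """Encode text the way the source file stores it, then decode it the way
    a console with a different code page would display the raw bytes."""
    raw = text.encode(source_encoding)      # bytes held by a Python 2 'str' literal
    return raw.decode(console_encoding)     # what the console renders


garbled = simulate_console("éàÇÃãéèï")
```

With a u'...' literal, Python 2 holds code points instead of raw bytes and encodes them for the console when printing, which is why the suggested change helps.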

How to know if a string contains accents

依然范特西╮ submitted on 2019-12-01 16:58:48
How to know if a string contains accents?

    if (Pattern.matches(".*[éèàù].*", input)) { .... }

Add whatever accents you want to that list.

Jack

I think the best thing you can do is to use a normalizer that splits Unicode characters with accents into two separate characters. Java includes this in the class Normalizer, see here. This, for example, will split U+00C1 LATIN CAPITAL LETTER A WITH ACUTE into U+0041 LATIN CAPITAL LETTER A and U+0301 COMBINING ACUTE ACCENT, and will do this for every character that has accents or other diacritical marks (http://en.wikipedia.org/wiki/Diacritic). Then you can check
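The normalizer approach can be sketched in Python rather than Java, using the standard unicodedata module in place of java.text.Normalizer. This is a minimal illustration, not the answerer's code; the function name has_accents is hypothetical.

```python
import unicodedata


def has_accents(text):
    """Return True if any character carries a combining mark after NFD
    decomposition (e.g. U+00C1 becomes U+0041 followed by U+0301)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return any(unicodedata.combining(ch) for ch in decomposed)


accented = has_accents("\u00c1")   # LATIN CAPITAL LETTER A WITH ACUTE
plain = has_accents("A")
```

Unlike the regular-expression version, this does not need an explicit list of accented letters, but it only detects marks that Unicode decomposes; letters such as "ø" or "ß" have no canonical decomposition and are reported as unaccented.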

fgetcsv is eating the first letter of a String if it's an Umlaut

余生颓废 submitted on 2019-12-01 15:21:59
Question: I am importing contents from an Excel-generated CSV file into an XML document like:

    $csv = fopen($csvfile, 'r');
    $words = array();
    while (($pair = fgetcsv($csv)) !== FALSE) {
        array_push($words, array('en' => $pair[0], 'de' => $pair[1]));
    }

The inserted data are English/German expressions. I insert these values into an XML structure and output the XML as follows:

    $dictionary = new SimpleXMLElement('<dictionary></dictionary>');
    //do things
    $dom = dom_import_simplexml($dictionary) ->

How to know if a string contains accents

≯℡__Kan透↙ submitted on 2019-12-01 15:11:11
Question: How to know if a string contains accents?

Answer 1:

    if (Pattern.matches(".*[éèàù].*", input)) { .... }

Add whatever accents you want to that list.

Answer 2: I think the best thing you can do is to use a normalizer that splits Unicode characters with accents into two separate characters. Java includes this in the class Normalizer, see here. This, for example, will split U+00C1 LATIN CAPITAL LETTER A WITH ACUTE into U+0041 LATIN CAPITAL LETTER A and U+0301 COMBINING ACUTE ACCENT, and will do this for every

Javascript removing accents

故事扮演 submitted on 2019-12-01 12:21:51
Question: I want to use Evan Elliott's code (below) to remove accents in strings, but it returns an "a" instead of the respective vanilla version of each character. I declare <meta charset="utf-8"> at the top of my page.

    function NormalizeString(s){
        var r=s.toLowerCase();
        r = r.replace(new RegExp("\\s", 'g'),"");
        r = r.replace(new RegExp("[àáâãäå]", 'g'),"a");
        r = r.replace(new RegExp("æ", 'g'),"ae");
        r = r.replace(new RegExp("ç", 'g'),"c");
        r = r.replace(new RegExp("[èéêë]", 'g'

MySQL DB selects records with and without umlauts. e.g: '.. where something = FÖÖ'

☆樱花仙子☆ submitted on 2019-12-01 05:49:37
My table collation is "utf8_general_ci". If I run a query like:

    SELECT * FROM mytable WHERE myfield = "FÖÖ"

I get results where:

    ... myfield = "FÖÖ"
    ... myfield = "FOO"

Is this the default for "utf8_general_ci"? What collation should I use to only get records where myfield = "FÖÖ"?

Gunni

    SELECT * FROM table WHERE some_field LIKE ('%ö%' COLLATE utf8_bin)

A list of the collations offered by MySQL for Unicode character sets can be found here: http://dev.mysql.com/doc/refman/5.0/en/charset-unicode-sets.html

If you want to go all-out and require strings to be absolutely identical in order to test

GWT: Character encoding umlauts

ぃ、小莉子 submitted on 2019-12-01 03:49:44
I want to set a text in a label:

    labelDemnaechst.setText(" Demnächst fällig:");

In the application's output, the "ä" characters are displayed wrong. How can I display them correctly?

Well, you have to encode your special characters as Unicode escapes. You can find a list of the corresponding Unicode characters here. Your example would look like this:

    labelDemnaechst.setText("Demn\u00E4chst f\u00E4llig:");

Hope this helps, if no one has a better solution.

Appendix: Thanks, Thomas, for your tip. You really have to change the format in which Eclipse saves its source files. By default it uses something
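Writing such escapes by hand is error-prone, so a small helper can generate them. The following is a sketch in Python, not part of the original answer; the function name to_java_escapes is hypothetical. It leaves ASCII untouched and emits one \uXXXX escape per UTF-16 code unit, which is the form Java string literals expect.

```python
def to_java_escapes(text):
    """Replace every non-ASCII character with Java-style \\uXXXX escapes."""
    pieces = []
    for ch in text:
        if ord(ch) < 128:
            pieces.append(ch)
            continue
        # Encode as UTF-16 (big-endian) so characters outside the Basic
        # Multilingual Plane become a surrogate pair of two escapes.
        data = ch.encode("utf-16-be")
        for i in range(0, len(data), 2):
            pieces.append("\\u{:02X}{:02X}".format(data[i], data[i + 1]))
    return "".join(pieces)


escaped = to_java_escapes("Demnächst fällig:")
```

The escaped form is independent of the encoding the editor uses to save the source file, which is why it works even when the file encoding is misconfigured.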

How to remove accent in Python 3.5 and get a string with unicodedata or other solutions?

对着背影说爱祢 submitted on 2019-12-01 01:32:48
Question: I am trying to get a string to use in the Google Geocoding API. I've checked a lot of threads, but I am still facing a problem and I don't understand how to solve it. I need addresse1 to be a string without any special characters. addresse1 is, for example: "32 rue d'Athènes Paris France".

    addresse1= collect.replace(' ','+').replace('\n','')
    addresse1=unicodedata.normalize('NFKD', addresse1).encode('utf-8','ignore')

Here I got a string without any accent... Oh no... It is not a string but a bytes object. So
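One common way to end up with a str rather than bytes is to encode to ASCII (dropping the combining marks that NFKD separates out) and then decode again. The sketch below assumes Python 3; the function name strip_accents is hypothetical and this is not taken from an answer in the thread.

```python
import unicodedata


def strip_accents(text):
    """Return text as a str with accents removed.

    NFKD splits 'è' into 'e' + U+0300; encoding to ASCII with
    errors='ignore' drops the combining mark, and decode() turns the
    resulting bytes back into a str.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


address = strip_accents("32 rue d'Athènes Paris France")
```

Note that encoding to 'utf-8' instead of 'ascii' would keep the combining marks, since UTF-8 can represent them; any non-ASCII character without a decomposition is silently dropped by this approach.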

GWT: Character encoding umlauts

余生颓废 submitted on 2019-12-01 00:59:02
Question: I want to set a text in a label:

    labelDemnaechst.setText(" Demnächst fällig:");

In the application's output, the "ä" characters are displayed wrong. How can I display them correctly?

Answer 1: Well, you have to encode your special characters as Unicode escapes. You can find a list of the corresponding Unicode characters here. Your example would look like this:

    labelDemnaechst.setText("Demn\u00E4chst f\u00E4llig:");

Hope this helps, if no one has a better solution.

Appendix: Thanks, Thomas, for your tip, you

ASP MVC3 FileResult with accents + IE8 - bugged?

萝らか妹 submitted on 2019-11-30 20:57:31
If the file name contains accents, it works as expected in Opera, FF, Chrome and IE9. But in IE8 the file type is "unknown file type", and it shows "file" as the file name (actually the last part of the URL). Does anyone know a workaround, other than replacing the "special" characters in the file name?

The test code (file | new project | add controller):

    public class FileController : Controller
    {
        public ActionResult Index(bool? Accents)
        {
            byte[] content = new byte[] { 1, 2, 3, 4 };
            return File(content, "application/octet-stream", true.Equals(Accents) ? "dsaé.txt" : "dsae.txt");
        }
    }

Test it like this: