utf-8

Java: Detect non-displayable chars for a given Character Encoding

ε祈祈猫儿з submitted on 2019-12-30 10:53:28
Question: I'm currently working on an application to validate and parse CSV files. The CSV files have to be encoded in UTF-8, although sometimes we get files in the wrong encoding. The CSV files most likely contain special characters of the German alphabet (Ä, Ö, Ü, ß), as most of the text within them is in German. For the validator, I need to make sure the file is UTF-8 encoded. As long as there are no special characters present, there is most likely no problem with
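One possible approach to the validation step, sketched here in Python rather than Java: decode the raw bytes strictly and treat any decoding error as "not UTF-8". The function name is_valid_utf8 and the sample strings are assumptions made for illustration; they do not come from the question. In Java, a comparable strict check can be built on java.nio.charset.CharsetDecoder with CodingErrorAction.REPORT.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form well-formed UTF-8 (hypothetical helper)."""
    try:
        # Strict decoding raises on any malformed byte sequence.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Illustrative sample data: the same German text in two encodings.
sample_text = "\u00c4, \u00d6, \u00dc, \u00df"  # Ä, Ö, Ü, ß
utf8_bytes = sample_text.encode("utf-8")
latin1_bytes = sample_text.encode("iso-8859-1")

print(is_valid_utf8(utf8_bytes))
print(is_valid_utf8(latin1_bytes))
```

Pure ASCII input passes this check under any ASCII-compatible encoding, which fits the observation that files without special characters rarely cause trouble. A passing check only shows that the bytes are well-formed UTF-8, not that the text is what the sender intended.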

Why doesn't CFStringEncodings have UTF8 in Swift?

拜拜、爱过 submitted on 2019-12-30 10:36:23
Question: I am trying to create a percent-encoded string in Swift so I can safely send text as a GET request. I found some Objective-C code which I am trying to convert to Swift. I've written the following Swift code:

    CFURLCreateStringByAddingPercentEscapes(nil, CFStringRef(encodedString), nil, CFStringRef("/%&=?$#+-~@<>|\\*,.()[]{}^!"), kCFStringEncodingUTF8)

There is no kCFStringEncodingUTF8 in Swift ... If you right-click the CFStringEncodings source you see there are a million things in there but no

fstream::open() Unicode or Non-Ascii characters don't work (with std::ios::out) on Windows

此生再无相见时 submitted on 2019-12-30 10:26:40
Question: In a C++ project, I want to open a file (fstream::open()), which seems to be a major problem. The Windows build of my program fails miserably. File "ä" (UTF-8 0xC3 0xA4):

    std::string s = ...; //Convert s
    std::fstream f;
    f.open(s.c_str(), std::ios::binary | std::ios::in); //Works (f.is_open() == true)
    f.close();
    f.open(s.c_str(), std::ios::binary | std::ios::in | std::ios::out); //Doesn't work

The string s is UTF-8 encoded, but then converted from UTF-8 to Latin1 (0xE4). I'm using Qt, so

std::string and UTF-8 encoded unicode

99封情书 submitted on 2019-12-30 09:00:18
Question: If I understand correctly, it is possible to use both string and wstring to store UTF-8 text. With char, ASCII characters take a single byte, some Chinese characters take 3 or 4, etc., which means that str[3] doesn't necessarily point to the 4th character. With wchar_t it is the same thing, but the minimum number of bytes used per character is always 2 (instead of 1 for char), and a 3- or 4-byte-wide character will take 2 wchar_t. Right? So, what if I want to use string::find_first_of() or string::compare
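The difference between bytes, code units, and characters can be checked with a small experiment. This is a sketch in Python, not C++, and the sample string is an assumption chosen to contain characters of 1, 2, 3, and 4 UTF-8 bytes. UTF-16 is used for the second count because that is the encoding commonly held in wchar_t on Windows.

```python
# Sample string (assumed for illustration): 'a', 'ä', a CJK character, an emoji.
text = "a\u00e4\u4e2d\U0001F600"

utf8 = text.encode("utf-8")        # variable width: 1 to 4 bytes per character
utf16 = text.encode("utf-16-le")   # variable width: 1 or 2 16-bit units per character

code_points = len(text)            # number of Unicode code points
utf8_bytes = len(utf8)             # number of char-sized units
utf16_units = len(utf16) // 2      # number of 16-bit units

print(code_points, utf8_bytes, utf16_units)

# Byte index 3 is not the 4th character: it is the lead byte of the
# third character, because 'ä' already occupies two bytes.
print(hex(utf8[3]))
```

In this sample only the 4-byte UTF-8 character needs two UTF-16 units; the 3-byte one fits in a single unit. Indexing by byte or code unit therefore diverges from indexing by character in both encodings.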

Redirect to UTF-8 URL with ColdFusion

橙三吉。 submitted on 2019-12-30 08:58:05
Question: I'm working on a system that uses UTF-8 characters in folder names for URLs. There's been no problem in navigating to these URLs and everything works as expected, except when issuing a redirect to another page on the site, whereupon the browser seems to encode the extended characters. To give an example, I'm attempting to redirect to the following relative URL: /geschäft/käfer/ If I visit that URL directly in the address bar, there's no problem. However, if I change the location header to
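If the aim is to control exactly how the non-ASCII path appears in the Location header, one option is to percent-encode the UTF-8 bytes of the path before sending it. The following is a sketch in Python, not ColdFusion; the helper name encode_redirect_path is hypothetical, and whether this resolves the browser behaviour described in the question is an assumption.

```python
from urllib.parse import quote, unquote


def encode_redirect_path(path: str) -> str:
    """Percent-encode non-ASCII characters as UTF-8, keeping '/' intact (hypothetical helper)."""
    return quote(path, safe="/")


encoded = encode_redirect_path("/gesch\u00e4ft/k\u00e4fer/")
print(encoded)

# Decoding the percent-escapes as UTF-8 restores the original path.
print(unquote(encoded))
```

The encoded form contains only ASCII, so it does not depend on how the client or any intermediary interprets raw header bytes.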

Problems displaying French accented characters in UTF-8

两盒软妹~` submitted on 2019-12-30 08:23:09
Question: I'm working on a French-language site built in CakePHP. I have tried multiple functions to convert the text into UTF-8 and display it properly, but have had no success so far: any accented letters are displayed as a black diamond with a question mark. They do display correctly when I change the character set in the browser to ISO-8859-1, but I'd like to make the whole site UTF-8 compliant. I have used: html_entity_decode($string, ENT_QUOTES, 'UTF-8'); htmlspecialchars($string, ENT_QUOTES,
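The symptom described (a black diamond with a question mark that disappears when the browser is switched to ISO-8859-1) is what one would expect if the stored bytes are ISO-8859-1 but are being read as UTF-8. A minimal sketch of that effect in Python, not PHP, with a sample word assumed for illustration:

```python
# "café" stored as ISO-8859-1: the 'é' is the single byte 0xE9.
latin1_bytes = "caf\u00e9".encode("iso-8859-1")

# Reading those bytes as UTF-8 fails on 0xE9; with errors="replace" the
# decoder substitutes U+FFFD, the replacement character browsers draw
# as a diamond with a question mark.
shown = latin1_bytes.decode("utf-8", errors="replace")

# Transcoding: decode with the encoding the bytes really use, then
# re-encode as UTF-8.
fixed_bytes = latin1_bytes.decode("iso-8859-1").encode("utf-8")

print(repr(shown))
print(fixed_bytes)
```

Under this assumption the data itself needs transcoding (or the source that produces it needs to emit UTF-8); escaping functions that only handle HTML entities would not change the underlying bytes.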

Why CONCAT() does not default to default charset in MySQL?

被刻印的时光 ゝ submitted on 2019-12-30 08:03:12
Question: Why does MySQL, when using CONCAT() in a pure UTF-8 environment, still treat the concatenated string as some other charset (probably Latin-1) when one of the columns in the expression is, for example, an int or a date? MySQL environment as seen from the client (\s):

    Server characterset: utf8
    Db     characterset: utf8
    Client characterset: utf8
    Conn.  characterset: utf8

Test dataset:

    CREATE TABLE `utf8_test` (
      `id` int(10) unsigned NOT NULL auto_increment,
      `title` varchar(50) collate utf8_estonian_ci default NULL,

What is the range for Arabic-Indic Digits (Hindu–Arabic) numeral utf8 from 0 to 9

馋奶兔 submitted on 2019-12-30 07:51:07
Question: What is the range of the Arabic-Indic digits (Hindu–Arabic numerals) 0 to 9 in UTF-8, for use in regular expressions?

Answer 1: U+06F0–U+06F9, as can easily be seen by checking a Unicode code point chart or the Character Map.

Answer 2: Finally I found the answer. For numbers only: \x{0660}-\x{0669}. For numbers and letters: \x{0600}-\x{06ff} (Unicode 4.0 / ISO 10646 Plane 0). Thank you all.

Source: https://stackoverflow.com/questions/14834846/what-is-the-range-for-arabic-indic-digits-hindu
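The two answers name different ranges because Unicode has two sets of these digits: U+0660 to U+0669 are the Arabic-Indic digits, and U+06F0 to U+06F9 are the Extended Arabic-Indic digits used for Persian and Urdu. A short sketch in Python that matches each set; the pattern names and the classify helper are assumptions for illustration, and Python writes the escapes as \uXXXX instead of the \x{XXXX} form quoted in Answer 2.

```python
import re

# Character classes for the two digit ranges (names are illustrative).
ARABIC_INDIC_DIGITS = re.compile(r"[\u0660-\u0669]+")
EXTENDED_ARABIC_INDIC_DIGITS = re.compile(r"[\u06F0-\u06F9]+")

# Build the ten digits of each set from their code points.
arabic = "".join(chr(cp) for cp in range(0x0660, 0x066A))
extended = "".join(chr(cp) for cp in range(0x06F0, 0x06FA))


def classify(text: str) -> str:
    """Report which digit set a string consists of entirely (hypothetical helper)."""
    if ARABIC_INDIC_DIGITS.fullmatch(text):
        return "arabic-indic"
    if EXTENDED_ARABIC_INDIC_DIGITS.fullmatch(text):
        return "extended-arabic-indic"
    return "other"


print(classify(arabic), classify(extended), classify("0123"))

# Python's int() accepts any Unicode decimal digits, including these.
print(int(arabic))
```

To accept digits from either set in a single pattern, the two ranges can be combined in one character class.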

How is this website fixing the encoding?

倖福魔咒の submitted on 2019-12-30 07:42:38
Question: I am trying to turn this text:

×וויר. העתיד של רשתות חברתיות והתקשורת ×©×œ× ×•

into this text:

אוויר. העתיד של רשתות חברתיות והתקשורת שלנו

Somehow, this website can do it: http://www.pixiesoft.com/flip/ I would like to know how I might be able to do it myself (with whatever programming language or software). Just saving the file as UTF-8 won't do it. My motivation for this question is that I have a friend's exported XML file with the garbled text which I want
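Garbling of this kind typically arises when UTF-8 bytes are decoded with a single-byte encoding, and in that case it can be reversed by re-encoding with the wrong encoding and decoding as UTF-8. The sketch below, in Python, assumes ISO-8859-1 (latin-1) as the wrong encoding because it maps every byte value and so round-trips losslessly; the function names and the sample word are illustrative, and how the linked website actually works is not known here.

```python
def garble(text: str, wrong_encoding: str = "latin-1") -> str:
    """Simulate the damage: UTF-8 bytes read with the wrong encoding."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "latin-1") -> str:
    """Reverse the damage: recover the bytes, then decode them as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


original = "\u05e9\u05dc\u05e0\u05d5"  # the Hebrew word at the end of the sample
mojibake = garble(original)

print(mojibake == original)
print(repair(mojibake) == original)
```

If the wrong encoding was a Windows code page instead, some byte values have no character assigned and may have been dropped when the text was saved, in which case the affected letters cannot be recovered this way.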