utf

Working with UTF8

痴心易碎 posted on 2019-12-10 17:43:29
Question: Working with std::string and UTF-8 seems rather complicated, and I cannot find a good explanation of the do's and don'ts. How can I properly work with UTF-8 in C++? It is rather confusing. I've found boost::locale and set the global locale: std::locale::global(boost::locale::generator()("")); After this, what do I need to think about, and when can I run into problems? Will writing to and reading from files work as expected, string comparisons, etc.? So far I'm aware of the following: std:

MySql UTF encoding

試著忘記壹切 posted on 2019-12-10 16:35:50
Question: java.sql.SQLException: Incorrect string value: '\xAC\xED\x00\x05sr...' for column 'xxxx'. The column is a longtext in MySQL with the utf8 charset and utf8_general_ci collation. What is wrong? Answer 1: It's a bit late, but you might want to know that \xAC\xED\x00\x05sr... is the magic number for Java serialization. Apparently your parameter is being serialized instead of being passed as a string. Answer 2: Assuming that those are hexadecimal escape codes, the text \xAC\xED\x00\x05sr... is not a valid UTF-8
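The point made in the second answer can be checked directly. The following is a minimal Python sketch, assuming only the leading bytes quoted in the error message; the helper name is_valid_utf8 is illustrative and not part of any library.

```python
# Leading bytes quoted in the error message: the Java serialization
# stream magic 0xACED, the stream version 0x0005, then the bytes "sr".
payload = b"\xac\xed\x00\x05sr"


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(is_valid_utf8(payload))                 # serialized-object header
print(is_valid_utf8("text".encode("utf-8")))  # an ordinary string
```

The first byte, 0xAC, has the bit pattern of a UTF-8 continuation byte, so it cannot start a character, which is consistent with the column rejecting the value.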

Handling UTF filenames in Python

爷,独闯天下 posted on 2019-12-10 09:43:21
Question: I've read quite a bit on the topic already, including what seems to be the definitive guide: http://docs.python.org/howto/unicode.html For a more experienced developer, that guide may be enough. In my case, however, I'm more confused than when I started and still haven't resolved my issue. I am trying to read filenames using os.walk() and to obtain certain information about the files (such as file size) before writing that information to a text file. This works as
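The task described (walk a tree, collect file sizes, write them to a text file) might look roughly like this. It is only a sketch, assuming Python 3, where os.walk() returns str filenames; the function name write_file_sizes and the report layout are made up for illustration, and the excerpt does not say which error the asker actually hit.

```python
import os


def write_file_sizes(root: str, report_path: str) -> int:
    """Write one 'size<TAB>path' line per file under root; return the file count."""
    count = 0
    # An explicit encoding keeps the report independent of the platform default.
    # backslashreplace escapes names that could not be decoded from the file system.
    with open(report_path, "w", encoding="utf-8", errors="backslashreplace") as report:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                size = os.path.getsize(full_path)
                report.write(f"{size}\t{full_path}\n")
                count += 1
    return count
```

Keeping the report file outside root avoids listing the report itself while it is being written.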

How to print the degree symbol in a window using Qt 5 (QtQuick 2.1) and above

孤街醉人 posted on 2019-12-09 18:51:38
Question: Up to Qt 4.8 (Qt Quick 1.1), I could print the degree symbol in the GUI with \260, but after upgrading to Qt 5 and above this stopped working. I searched the net and found many relevant links, such as http://www.fileformat.info/info/unicode/char/00b0/index.htm, and tried them, but nothing helped. Do I need to include some library to use a UTF format, or is the problem something else? Could someone please help? Revised: here is a description of what is being done. First I am
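One plausible explanation, which the excerpt does not confirm, is that the byte written as \260 is now interpreted as UTF-8 rather than Latin-1. The Python sketch below shows only the encoding arithmetic behind that guess; it is not Qt code.

```python
# "\260" is an octal escape: 0o260 == 0xB0 == 176, the degree sign U+00B0.
degree = "\260"
print(degree == "\u00b0")

# Latin-1 stores U+00B0 as the single byte 0xB0 ...
print(degree.encode("latin-1"))
# ... while UTF-8 needs the two bytes 0xC2 0xB0.
print(degree.encode("utf-8"))

# The lone byte 0xB0 is therefore not valid UTF-8 on its own.
try:
    b"\xb0".decode("utf-8")
    lone_byte_ok = True
except UnicodeDecodeError:
    lone_byte_ok = False
print(lone_byte_ok)
```

If that is the cause, writing the character as a Unicode escape (U+00B0) or as a literal ° in a UTF-8 source file would sidestep the single-byte form.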

MSBuild.exe output encoding

和自甴很熟 posted on 2019-12-09 17:56:39
Question: I use MSBuild.exe to build a solution on a Russian-language machine, but in the TeamCity build log all Russian characters appear in the wrong encoding. How do I set up MSBuild.exe to produce correctly encoded output (UTF-8, for example)? Answer 1: Check the /fileloggerparameters command-line parameter here. It should be the same for the console logger. For example, to write a MyLog.log file with diagnostic verbosity using UTF-8 encoding: /fileLoggerParameters:LogFile=MyLog.log;Encoding=UTF-8;Verbosity=diagnostic Source: https://stackoverflow.com/questions

What is the difference between UTF-32 and UCS-4?

别来无恙 posted on 2019-12-09 05:07:41
Question: What is the difference between UTF-32 and UCS-4? Isn't UTF-32 supposed to be a fixed-width encoding? Answer 1: UTF-32 started as a subset of UCS-4. Now it is identical, except that the UTF-32 standard has additional Unicode semantics. See the details on Wikipedia: The original ISO 10646 standard defines a 31-bit encoding form called UCS-4, in which each encoded character in the Universal Character Set (UCS) is represented by a 32-bit friendly code value in the code space of integers between 0
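The fixed-width property the question asks about is easy to observe. A small Python sketch, with sample characters chosen arbitrarily for illustration, compares the encoded length of one code point in each encoding form:

```python
# One code point from each UTF-8 length class: ASCII, Latin, CJK, emoji.
samples = ["A", "\u00e9", "\u6211", "\U0001F600"]

for ch in samples:
    utf8_len = len(ch.encode("utf-8"))
    utf16_len = len(ch.encode("utf-16-le"))  # "-le" variants emit no byte order mark
    utf32_len = len(ch.encode("utf-32-le"))
    print(f"U+{ord(ch):04X}: utf-8={utf8_len} utf-16={utf16_len} utf-32={utf32_len}")

# Collect the distinct UTF-32 widths seen across all samples.
utf32_widths = {len(ch.encode("utf-32-le")) for ch in samples}
print(utf32_widths)
```

UTF-8 and UTF-16 vary in length per code point, whereas UTF-32 always uses four bytes.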

What characters do not directly map from Cp1252 to UTF-8?

青春壹個敷衍的年華 posted on 2019-12-09 04:21:16
Question: I've read in several Stack Overflow answers that some characters do not map directly (or are even "unmappable") when converting from Cp1252 (aka Windows-1252; they're the same, aren't they?) to UTF-8, e.g. here: https://stackoverflow.com/a/23399926/2018047 Can someone please shed some more light on this? Does that mean that if I batch-convert source code from Cp1252 to UTF-8, some characters will end up as garbage? Answer 1: This is what the Windows-1252 code page looks like. As you can
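One way to see which byte values are affected is to ask a converter. The sketch below uses Python's built-in cp1252 codec as the reference, which is an assumption: other converters may treat the same bytes differently. The function name undefined_cp1252_bytes is illustrative.

```python
def undefined_cp1252_bytes() -> list:
    """Return the byte values that Python's cp1252 codec refuses to decode."""
    undefined = []
    for value in range(256):
        try:
            bytes([value]).decode("cp1252")
        except UnicodeDecodeError:
            undefined.append(value)
    return undefined


print([hex(b) for b in undefined_cp1252_bytes()])

# Bytes that are defined convert cleanly; 0x80 is the euro sign in cp1252.
euro_utf8 = b"\x80".decode("cp1252").encode("utf-8")
print(euro_utf8)
```

Source files that never contain the reported byte values should convert without loss under this codec.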

AJAX: post method with UTF-8

痞子三分冷 posted on 2019-12-09 03:54:27
Question: I'm trying to send data as UTF-8 over Ajax, but some of the data is being changed into Unicode. I'll explain with two short examples. A simple POST (without Ajax): <form accept-charset="UTF-8" method="POST" action="test2.php"> <input type="text" class="" name="text"> <input type="submit" class="button" value="Submit"> </form> The meta tag and PHP header are always set: <meta charset="utf-8"> header("Content-Type: text/html; charset=utf-8"); If I submit an Arabic letter ( ب ) and use strlen(), it will return
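PHP's strlen() counts bytes rather than characters, so the number it reports for a UTF-8 string depends on how many bytes each character occupies. A short Python sketch of the same distinction, using the letter from the question (the variable names are illustrative):

```python
letter = "\u0628"  # the Arabic letter beh used in the question

char_count = len(letter)                  # number of code points
byte_count = len(letter.encode("utf-8"))  # what a byte-counting function sees
print(char_count, byte_count)
print(letter.encode("utf-8"))
```

Comparing the byte count seen on the server in the two scenarios is a quick way to tell whether the Ajax request sends the same UTF-8 bytes as the plain form post.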

How to correctly read url content with utf8 chars?

匆匆过客 posted on 2019-12-09 02:02:23
Question: public class URLReader { public static byte[] read(String from, String to, String string){ try { String text = "http://translate.google.com/translate_a/t?"+ "client=o&text="+URLEncoder.encode(string, "UTF-8")+ "&hl=en&sl="+from+"&tl="+to+""; URL url = new URL(text); BufferedReader in = new BufferedReader( new InputStreamReader(url.openStream(), "UTF-8")); String json = in.readLine(); byte[] bytes = json.getBytes("UTF-8"); in.close(); return bytes; //return text.getBytes(); } catch (Exception
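The two encoding steps in the snippet above (percent-encoding the query text as UTF-8, then decoding the response bytes as UTF-8) can be sketched in Python as follows. The base URL https://example.com/translate and the helper names build_url and decode_body are placeholders for illustration, not the service or API from the question, and no network request is made.

```python
from urllib.parse import urlencode


def build_url(base: str, text: str, source: str, target: str) -> str:
    """Percent-encode the query parameters as UTF-8 and append them to base."""
    query = urlencode({"text": text, "sl": source, "tl": target}, encoding="utf-8")
    return f"{base}?{query}"


def decode_body(raw: bytes) -> str:
    """Decode a response body explicitly as UTF-8 instead of a platform default."""
    return raw.decode("utf-8")


url = build_url("https://example.com/translate", "\u6211", "zh", "en")
print(url)
print(decode_body(b"\xe6\x88\x91") == "\u6211")
```

Decoding the whole body in one step, rather than line by line, avoids losing content when a response spans several lines.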

Where are the Unicode characters on the disk and what's the mapping process?

大憨熊 posted on 2019-12-08 05:13:14
Question: Several Unicode-related questions have been confusing me for some time. For the following reasons, I think the Unicode characters exist on disk. Executing echo "\u6211" in a terminal prints the glyph corresponding to the Unicode code point U+6211. There is the concept of the UCD (Unicode Character Database), and we can download its latest version (UCD latest). Some characters from newer Unicode versions, such as the latest emojis, cannot be displayed on my Mac until I upgrade macOS. So if the
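The pieces mentioned in the question can be separated in a few lines of Python. This is a sketch of the general model, not an answer taken from the source: the escape names a code point (a number), an encoding turns that number into bytes, and character properties come from a copy of the UCD bundled with the runtime. The glyph itself is drawn from a font, which this example does not touch.

```python
import unicodedata

ch = "\u6211"  # the same escape used with echo in the question

print(f"U+{ord(ch):04X}")           # the code point is just an integer
print(unicodedata.name(ch))         # name looked up in the bundled UCD data
print(ch.encode("utf-8"))           # bytes that would be stored or transmitted
print(unicodedata.unidata_version)  # UCD version shipped with this Python
```

An older bundled UCD version, or a font without the glyph, would be consistent with newer emojis failing to display until the system is upgraded.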