utf-8

Possible to force CMake/MSVC to use UTF-8 encoding for source files without a BOM? C4819

半城伤御伤魂 submitted on 2020-12-28 20:03:45
Question: All our source code is valid UTF-8; however, some users on Windows cannot build it because their system is configured for a different encoding. Without adding a BOM to the source files, is it possible to tell MSVC to treat all source as UTF-8, irrespective of the user's system encoding? See MSDN's link regarding this topic (requires adding BOM header).
Answer 1: You can try: add_compile_options("$<$<C_COMPILER_ID:MSVC>:/utf-8>") add_compile_options("$<$<CXX_COMPILER_ID:MSVC>:/utf-8>") By default,

c++ can't get “wcout” to print unicode, and leave “cout” working

送分小仙女□ submitted on 2020-12-25 01:14:25
Question: I can't get "wcout" to print a Unicode string in multiple code pages while leaving "cout" working. Please help me get these 3 lines to work together: std::wcout<<"abc "<<L'\u240d'<<" defg "<<L'א'<<" hijk"<<std::endl; std::cout<<"hello world from cout! \n"; std::wcout<<"hello world from wcout! \n"; Output: abc hello world from cout! I tried: #include <io.h> #include <fcntl.h> _setmode(_fileno(stdout), _O_U8TEXT); Problem: "cout" failed. I also tried: std::locale mylocale(""); std::wcout.imbue

Transform UTF8 string to UCS-2 with replace invalid characters in java

懵懂的女人 submitted on 2020-12-15 04:55:48
Question: I have a string in UTF-8: "Red🌹🌹Röses". I need it converted to valid UCS-2 (or fixed-size UTF-16BE without a BOM; they are the same thing), so the output will be: "Red Röses", as "🌹" is out of range for UCS-2. What I have tried: @Test public void testEncodeProblem() throws CharacterCodingException { String in = "Red\uD83C\uDF39\uD83C\uDF39Röses"; ByteBuffer input = ByteBuffer.wrap(in.getBytes()); CharsetDecoder utf8Decoder = StandardCharsets.UTF_16BE.newDecoder(); utf8Decoder
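
The conversion described in this question can be sketched in a few lines. This is a Python illustration rather than the Java the question uses; the helper name `to_ucs2_be` and the single-space replacement are assumptions made for the example. It replaces every code point above U+FFFF and then encodes as UTF-16BE without a BOM, which for BMP-only text is byte-for-byte the same as UCS-2.

```python
def to_ucs2_be(text: str, replacement: str = " ") -> bytes:
    """Encode text as UCS-2 big-endian, replacing code points outside the BMP.

    Hypothetical helper for illustration; not taken from the original question.
    """
    # Code points above U+FFFF need surrogate pairs in UTF-16, which UCS-2 lacks.
    bmp_only = "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)
    # 'utf-16-be' writes no BOM; with BMP-only input every character is 2 bytes.
    return bmp_only.encode("utf-16-be")


encoded = to_ucs2_be("Red\U0001F339\U0001F339R\u00f6ses")
print(encoded.decode("utf-16-be"))
print(len(encoded))
```

Each replaced character yields one replacement character, so the two roses become two spaces here; collapsing them into one would be a separate design choice.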

pyodbc doesn't correctly deal with unicode data

假如想象 submitted on 2020-12-13 07:30:47
Question: I successfully connected to a MySQL database with pyodbc, and it works well with ASCII-encoded data, but when I print data encoded in Unicode (UTF-8), it raises an error: UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-8: ordinal not in range(128) So I checked the string in the row: >>>row[3] '\xe7\xae\xa1\xe7\x90\u2020\xe5\u2018\u02dc' I found instructions about Unicode in the pyodbc GitHub wiki: These databases tend to use a single encoding and do not differentiate between
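
The value of row[3] shown above looks like UTF-8 bytes that were decoded as Windows-1252. The following Python sketch reverses that step to show what the text was meant to be. It is only a diagnostic under that assumption, not the pyodbc configuration fix, and the helper name `undo_cp1252_mojibake` is hypothetical.

```python
def undo_cp1252_mojibake(garbled: str) -> str:
    """Reverse a UTF-8-read-as-Windows-1252 mix-up (hypothetical helper)."""
    raw = bytearray()
    for ch in garbled:
        try:
            # Characters such as U+2020 map back to a single cp1252 byte (0x86).
            raw += ch.encode("cp1252")
        except UnicodeEncodeError:
            # Bytes that cp1252 leaves undefined (e.g. 0x90) pass through as
            # U+0090; latin-1 maps them straight back to the original byte.
            raw += ch.encode("latin-1")
    return raw.decode("utf-8")


garbled = "\xe7\xae\xa1\xe7\x90\u2020\xe5\u2018\u02dc"
print(undo_cp1252_mojibake(garbled))
```

If the repaired string is readable, the data in the database is most likely fine and the decoding applied on the client side is what needs to change.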

How is const std::wstring encoded and how to change to UTF-16

不羁的心 submitted on 2020-12-12 09:41:58
Question: I created this minimal working C++ example to compare the bytes (by their hex representation) in a std::string and a std::wstring when defining a string with German non-ASCII characters in either type. #include <iostream> #include <iomanip> #include <string> int main(int, char**) { std::wstring wstr = L"äöüß"; std::string str = "äöüß"; for ( unsigned char c : str ) { std::cout << std::setw(2) << std::setfill('0') << std::hex << static_cast<unsigned short>(c) << ' '; } std::cout << std:
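
For comparison with whatever the C++ program prints, the byte sequences of the same four characters in the common encodings can be listed with a short Python sketch. The helper name `hex_bytes` is an assumption for this example. As background, wchar_t is 2 bytes (UTF-16) with MSVC on Windows and typically 4 bytes (UTF-32) with GCC or Clang on Linux, so both wide forms are shown.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of text as space-separated hex (hypothetical helper)."""
    return text.encode(encoding).hex(" ")


text = "\u00e4\u00f6\u00fc\u00df"  # the characters ä ö ü ß
for encoding in ("latin-1", "utf-8", "utf-16-le", "utf-32-le"):
    print(f"{encoding:10}", hex_bytes(text, encoding))
```

Matching the program's output against these rows shows which encoding the compiler actually used for each literal.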

Arabic characters in URL while sharing on Twitter

荒凉一梦 submitted on 2020-12-08 05:34:12
Question: I'm facing an issue trying to share a URL that includes Arabic characters on Twitter: http://example.com/قرعة-تصفيات-أفريقيا-مصر-تواجه-نيجيريا/ When I click on "share", the same URL is shown in the tweet box, but when I actually tweet, it just links to http://example.com , and the rest of the URL is lost. I tried using urlencode() , but the generated URL is too long and impossible to tweet. How could I solve this?
Answer 1: If you are the owner of the website, you can write an htaccess RewriteRule for
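
Why the encoded URL grows so much can be seen in a small Python sketch. Here `urllib.parse.quote` stands in for PHP's urlencode() as an assumption; both percent-encode the UTF-8 bytes of non-ASCII characters. Each Arabic letter takes 2 bytes in UTF-8, and every byte becomes 3 ASCII characters.

```python
from urllib.parse import quote, unquote

slug = "\u0642\u0631\u0639\u0629"  # first word of the path in the question
encoded = quote(slug)

print(encoded)
print(len(slug), "characters ->", len(encoded), "characters")
print(unquote(encoded) == slug)  # the encoding is reversible
```

A sixfold expansion on a long slug explains why the result no longer fits in a tweet.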

Difference between encoding utf-8 and utf8 in Python 3.5

女生的网名这么多〃 submitted on 2020-12-08 05:22:26
Question: What is the difference between encoding utf-8 and utf8 (if there is any)? Given the following example: u = u'€' print('utf-8', u.encode('utf-8')) print('utf8 ', u.encode('utf8')) It produces the following output: utf-8 b'\xe2\x82\xac' utf8 b'\xe2\x82\xac'
Answer 1: There's no difference. See the table of standard encodings. Specifically for 'utf_8' , the following are all valid aliases: 'U8', 'UTF', 'utf8' Also note the statement in the first paragraph: Notice that spelling alternatives that only
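
The alias claim in the answer can be checked directly with the standard library. This is a minimal sketch using `codecs.lookup`, which resolves a spelling to its codec; the helper name `canonical_name` is an assumption for the example.

```python
import codecs


def canonical_name(spelling: str) -> str:
    """Return the canonical codec name Python resolves a spelling to."""
    return codecs.lookup(spelling).name


for spelling in ("utf-8", "utf8", "UTF_8", "U8", "UTF"):
    print(f"{spelling:6} -> {canonical_name(spelling)}")
```

All five spellings resolve to the same codec, which is why the two encode() calls in the question produce identical bytes.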

Sending MIME-encoded email attachments with utf-8 filenames

自作多情 submitted on 2020-12-06 13:52:09
Question: Hello dear people, I spent the last 3 days searching the web for an answer and couldn't find one. I found plenty of "almost" cases, but none was exactly what I was looking for. I am able to get the subject and the message body in Hebrew, but I can't get the attached file's name in Hebrew. By the way, I'm not interested in third-party libraries like PHPMailer, etc. This is what I get: W_W(W'W_W_.pdf This is what I want to get: שלום.pdf Here is my code, very simple: $boundary = uniqid("HTMLEMAIL");