codecvt

Deprecated header <codecvt> replacement

戏子无情, posted on 2020-04-18 05:47:58
Question: A bit of background: my task required converting a UTF-8 XML file to UTF-16 (with a proper header, of course). So I searched for the usual ways of converting UTF-8 to UTF-16 and found that one should use the templates from <codecvt>. But now that it is deprecated, I wonder what the new common way of doing the same task is. (I don't mind using Boost at all, but other than that I prefer to stay as close to the standard library as possible.)
Answer 1: The std::codecvt template from <locale> itself isn't
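One way to stay within the standard library is to hand-roll the conversion. The following is a minimal sketch of that idea, not the approach from the truncated answer above. The function names `utf8_to_utf16` and `utf16le_with_bom` are hypothetical, and the sketch assumes the "proper header" in the question means a little-endian byte order mark.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Decode UTF-8 into UTF-16 code units. Throws std::runtime_error on
// truncated, overlong, surrogate, or out-of-range sequences.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        char32_t cp = 0;      // code point being assembled
        std::size_t len = 0;  // total bytes in this sequence
        char32_t min = 0;     // smallest code point allowed for this length
        if (b0 < 0x80)                { cp = b0;        len = 1; min = 0; }
        else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; len = 2; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; len = 3; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; len = 4; min = 0x10000; }
        else throw std::runtime_error("invalid UTF-8 lead byte");
        if (i + len > in.size())
            throw std::runtime_error("truncated UTF-8 sequence");
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80)
                throw std::runtime_error("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::runtime_error("invalid code point in UTF-8 input");
        assert(cp <= 0x10FFFF);
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Split into a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}

// Serialize UTF-16 code units as little-endian bytes, prefixed with the
// byte order mark FF FE (assumed here to be the wanted file header).
std::string utf16le_with_bom(const std::u16string& s) {
    std::string bytes = "\xFF\xFE";
    for (char16_t u : s) {
        bytes.push_back(static_cast<char>(u & 0xFF));
        bytes.push_back(static_cast<char>((u >> 8) & 0xFF));
    }
    return bytes;
}
```

Writing `utf16le_with_bom(utf8_to_utf16(text))` to a stream opened in binary mode would produce the UTF-16 file. An XML encoding declaration inside the document would still need to be updated separately.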

std::codecvt::do_in method overloading vs the rest of base methods

假装没事ソ, posted on 2019-12-13 04:39:00
Question: I have overloaded the do_in method of std::codecvt:

    #include <iostream>
    #include <locale>
    #include <string>

    class codecvt_to_upper : public std::codecvt<char, char, std::mbstate_t> {
    public:
        explicit codecvt_to_upper(size_t r = 0) : std::codecvt<char, char, std::mbstate_t>(r) {}
    protected:
        result do_in(state_type& state, const extern_type* from, const extern_type* from_end, const extern_type*& from_next, intern_type* to, intern_type* to_end, intern_type*& to_next) const;
        result do_out(state_type
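For comparison, here is a self-contained sketch of an upper-casing facet along the same lines. The class name `upper_codecvt` and the helper `to_upper_via_facet` are hypothetical. The sketch also overrides `do_always_noconv()` to return false, on the assumption that a file stream may skip `do_in` entirely when the facet reports that no conversion is needed, which is the default for `std::codecvt<char, char, std::mbstate_t>`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cwchar>   // std::mbstate_t
#include <locale>
#include <string>

class upper_codecvt : public std::codecvt<char, char, std::mbstate_t> {
public:
    explicit upper_codecvt(std::size_t refs = 0)
        : std::codecvt<char, char, std::mbstate_t>(refs) {}

protected:
    // Copy external bytes to the internal buffer, upper-casing each one.
    result do_in(state_type&, const extern_type* from,
                 const extern_type* from_end, const extern_type*& from_next,
                 intern_type* to, intern_type* to_end,
                 intern_type*& to_next) const override {
        from_next = from;
        to_next = to;
        while (from_next != from_end && to_next != to_end) {
            *to_next++ = static_cast<char>(
                std::toupper(static_cast<unsigned char>(*from_next++)));
        }
        return from_next == from_end ? ok : partial;
    }

    // Report that this facet does convert, so callers do not bypass do_in.
    bool do_always_noconv() const noexcept override { return false; }
};

// Drive the facet directly through its public in() member.
std::string to_upper_via_facet(const std::string& s) {
    upper_codecvt cvt(1);  // refs = 1: lifetime is managed here, not by a locale
    std::mbstate_t state{};
    std::string out(s.size(), '\0');
    const char* from_next = nullptr;
    char* to_next = nullptr;
    const std::codecvt_base::result r =
        cvt.in(state, s.data(), s.data() + s.size(), from_next,
               &out[0], &out[0] + out.size(), to_next);
    assert(r != std::codecvt_base::error);
    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

Calling `in()` directly, as the helper does, exercises `do_in` without depending on how a particular stream buffer decides whether to invoke the facet.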

Using ICU to implement my own codecvt facet

孤街醉人, posted on 2019-12-12 03:06:38
Question: I want to implement a codecvt facet using ICU to convert from any character encoding (that ICU supports) to UTF-8 internally. I'm aware that codecvt_byname exists and that it can be used to do part of what I want, as shown in this example. The problems with that example are that it (1) uses wide character streams (I want to use "regular", byte-oriented streams) and (2) requires two streams to perform the conversion. Instead, I want a single stream like:

    locale loc( locale(), new icu_codecvt(

UTF-16 codecvt facet

こ雲淡風輕ζ, posted on 2019-12-10 18:46:06
Question: Extending from this question about locales, and as described in this question: what I really wanted to do was install a codecvt facet into the locale that understands UTF-16 files. I could write my own. But I am not a UTF expert, and as such I am sure I would get it nearly correct, but it would break at the most inconvenient time. So I was wondering if there are any resources (on the web) of pre-built codecvt (or other) facets that can be used from C++ and that are peer reviewed and tested? The

parsing strings with value modifiers ('-', '%') at the end

核能气质少年, posted on 2019-12-10 18:22:21
Question: I am trying to get to grips with parsing. I have some data that comes in a de-de format with additional information at the end of the string. I managed to get the de-de part correct, but I struggle to get the - and % parsed correctly. I read up on codecvt but I do not understand the topic. Here is a reflection of what I understand so far and an example of what I need to do.

    #include <string>
    #include <locale>
    #include <iostream>
    #include <sstream>
    using namespace std;
    #define EXPECT_EQ(actual,

How to read utf-16 file into utf-8 std::string line by line

◇◆丶佛笑我妖孽, posted on 2019-12-10 15:42:07
Question: I'm working with code that expects UTF-8-encoded std::string variables. I want to be able to handle a user-supplied file that potentially has UTF-16 encoding (I don't know the encoding at design time, but eventually want to be able to deal with UTF-8/16/32), read it line by line, and forward each line to the rest of the code as a UTF-8-encoded std::string. I have C++11 (really, the current MSVC subset of C++11) and Boost 1.55.0 to work with. I'll need the code to work on both Linux and Windows
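One possible shape for this task, using only the standard library, is to read the file as raw bytes, convert the whole buffer to UTF-8, and split it into lines afterwards. The sketch below illustrates that idea and is not taken from an answer to the question. The names `append_utf8`, `utf16_bytes_to_utf8`, and `utf8_lines` are hypothetical, and the choice to treat input without a byte order mark as little-endian is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Append one Unicode code point to a UTF-8 string.
void append_utf8(std::string& out, char32_t cp) {
    assert(cp <= 0x10FFFF);
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Convert raw UTF-16 file bytes to UTF-8. A leading BOM selects the byte
// order; without one, little-endian is assumed (an assumption of this sketch).
std::string utf16_bytes_to_utf8(const std::string& bytes) {
    if (bytes.size() % 2 != 0)
        throw std::runtime_error("odd number of bytes in UTF-16 input");
    bool big_endian = false;
    std::size_t i = 0;
    if (bytes.size() >= 2) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[0]);
        const unsigned char b1 = static_cast<unsigned char>(bytes[1]);
        if (b0 == 0xFE && b1 == 0xFF) { big_endian = true; i = 2; }
        else if (b0 == 0xFF && b1 == 0xFE) { i = 2; }
    }
    // Read the 16-bit code unit stored at byte offset pos.
    auto unit = [&](std::size_t pos) -> char32_t {
        const unsigned char a = static_cast<unsigned char>(bytes[pos]);
        const unsigned char b = static_cast<unsigned char>(bytes[pos + 1]);
        return static_cast<char32_t>(big_endian ? (a << 8) | b : (b << 8) | a);
    };
    std::string out;
    while (i < bytes.size()) {
        char32_t u = unit(i);
        i += 2;
        if (u >= 0xD800 && u <= 0xDBFF) {
            if (i >= bytes.size())
                throw std::runtime_error("unpaired high surrogate");
            const char32_t lo = unit(i);
            i += 2;
            if (lo < 0xDC00 || lo > 0xDFFF)
                throw std::runtime_error("invalid low surrogate");
            u = 0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00);
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            throw std::runtime_error("unpaired low surrogate");
        }
        append_utf8(out, u);
    }
    return out;
}

// Split UTF-8 text into lines, dropping a trailing '\r' from each line.
std::vector<std::string> utf8_lines(const std::string& utf8) {
    std::vector<std::string> lines;
    std::istringstream in(utf8);
    for (std::string line; std::getline(in, line);) {
        if (!line.empty() && line.back() == '\r') line.pop_back();
        lines.push_back(line);
    }
    return lines;
}
```

Splitting after conversion is safe because no byte of a multi-byte UTF-8 sequence can equal '\n'. Converting the whole buffer at once trades memory for simplicity; a streaming variant would have to carry partial code units and surrogate pairs across reads.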

Reading/writing/printing UTF-8 in C++11

蓝咒, posted on 2019-12-08 16:25:38
Question: I have been exploring C++11's new Unicode functionality, and while other C++11 encoding questions have been very helpful, I have a question about the following code snippet from cppreference. The code writes and then immediately reads a text file saved with UTF-8 encoding.

    // Write
    std::ofstream("text.txt") << u8"z\u6c34\U0001d10b";
    // Read
    std::wifstream file1("text.txt");
    file1.imbue(std::locale("en_US.UTF8"));
    std::cout << "Normal read from file (using default UTF-8/UTF-32 codecvt)\n";
    for

trouble with std::codecvt_utf8 facet

倖福魔咒の, posted on 2019-12-05 01:51:15
Question: Here is a snippet of code that uses the std::codecvt_utf8<> facet to convert from wchar_t to UTF-8. With Visual Studio 2012, my expectations are not met (see the condition at the end of the code). Are my expectations wrong? Why? Or is this a Visual Studio 2012 library issue?

    #include <locale>
    #include <codecvt>
    #include <cstdlib>

    int main () {
        std::mbstate_t state = std::mbstate_t ();
        std::locale loc (std::locale (), new std::codecvt_utf8<wchar_t>);
        typedef std::codecvt<wchar_t, char, std:

How do I write a std::codecvt facet?

点点圈, posted on 2019-12-03 18:01:51
Question: How do I write a std::codecvt facet? I'd like to write ones that convert from UTF-16 to UTF-8, from UTF-16 to the system's current code page (Windows, so CP_ACP), and to the system's OEM code page (Windows, so CP_OEM). Cross-platform is preferred, but MSVC on Windows is fine too. Are there any tutorials or anything of that nature on how to use this class correctly?
Answer 1: I've written one based on iconv. It can be used on Windows or on any POSIX OS. (You will need to link with iconv

trouble with std::codecvt_utf8 facet

為{幸葍}努か, posted on 2019-12-03 16:24:14
Here is a snippet of code that uses the std::codecvt_utf8<> facet to convert from wchar_t to UTF-8. With Visual Studio 2012, my expectations are not met (see the condition at the end of the code). Are my expectations wrong? Why? Or is this a Visual Studio 2012 library issue?

    #include <locale>
    #include <codecvt>
    #include <cstdlib>

    int main () {
        std::mbstate_t state = std::mbstate_t ();
        std::locale loc (std::locale (), new std::codecvt_utf8<wchar_t>);
        typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;
        codecvt_type const & cvt = std::use_facet<codecvt_type> (loc);
        wchar_t ch = L'