Question
Since P1423R1 adds deleted ostream inserters for char8_t, char16_t, and char32_t, we are for the moment left needing to write custom operators if we wish to stream these types to ostreams. I attempted this with MSVC 2019 16.2.0 Preview 2.0:
#include <iostream>
#include <string>
#include <string_view>

using namespace std::literals;

// Generic inserter for UTF-8 string views.
// Placeholder body: the string itself is not written.
template<typename Tostream>
Tostream&
operator<<( Tostream& os, std::u8string_view string ) {
    return os;
}

// Generic inserter for UTF-8 C strings; forwards to the overload above.
template<typename Tostream>
Tostream&
operator<<( Tostream& os, char8_t const* string ) {
    return os << std::u8string_view( string );
}

/// this must be uncommented for the cout line to compile
//std::ostream&
//operator<<( std::ostream& os, char8_t const* string ) {
//    return os << std::u8string_view( string );
//}

int
main() {
    std::cout << u8"utf-8";
    std::wcout << u8"utf-8";
}
I find that my templated attempt succeeds for wcout but won't compile for cout unless I uncomment the non-templated operator<< for char8_t const*. The error is:
error C2280: 'std::basic_ostream<char,std::char_traits<char>> &std::operator <<<std::char_traits<char>>(std::basic_ostream<char,std::char_traits<char>> &,const char8_t *)': attempting to reference a deleted function
So the question is: which behavior is right? Is it right not to compile for cout, or is it wrong to compile for wcout? Either way, this appears to be a bug.
Answer 1:
P1423 hasn't been accepted for C++20 yet (though it did pass LEWG review in Kona), so it is interesting that Microsoft has already implemented (part of) it.
The exhibited behavior matches what is specified in P1423R1. During a recent LWG review, it was requested that the char8_t, char16_t, and char32_t related overloads also be deleted for wide streams. P1423R2 includes that change, so compilation of the example code will also fail for std::wcout when/if that is implemented. That revision hasn't been published in a mailing yet, but can be previewed at https://rawgit.com/sg16-unicode/sg16/master/papers/p1423r2.html.
As @Nicol mentioned, we don't yet have consensus on what the behavior of the deleted overloads should be. Should they implicitly transcode? If so, how are errors in transcoding handled? Or should they just stream bytes? But then what happens if a codecvt facet is attached (it will expect the execution encoding)? Should there be a std::u8out? Or should we provide better transcoding facilities and require that they be explicitly invoked? SG16 will be working to answer these questions for C++23.
Answer 2:
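When candidates are otherwise equally good matches, overload resolution prefers a non-template function over a template, and a more specialized template over a less specialized one. The deleted std::operator<<(std::basic_ostream<char, Traits>&, const char8_t*) is itself a template (as the error message shows), but it is more specialized than your fully generic Tostream& versions, so for cout it will win over them. Your own non-template overload for std::ostream, once uncommented, wins over the deleted one in turn.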
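The ranking described here can be sketched with stand-in types. Everything named below (BasicStream, Utf8Unit, put, and the chosen_for_* helpers) is hypothetical and only mimics the shape of the real overloads; in particular, overload (2) returns a tag instead of being deleted so that the selection can be observed. A minimal sketch, assuming all candidates are otherwise equally good matches:

```cpp
#include <string>

// Hypothetical stand-ins for std::basic_ostream and char8_t, so the sketch
// also builds with compilers that predate C++20.
template <typename CharT, typename Traits>
struct BasicStream {};
struct DefaultTraits {};
struct CustomTraits {};
struct Utf8Unit {};

using Narrow       = BasicStream<char, DefaultTraits>;
using NarrowCustom = BasicStream<char, CustomTraits>;
using Wide         = BasicStream<wchar_t, DefaultTraits>;

// (1) Fully generic template, like the question's Tostream& overloads.
template <typename Stream>
std::string put(Stream&, const Utf8Unit*) { return "generic template"; }

// (2) Template restricted to narrow streams, shaped like the overload that
//     P1423R1 deletes. Here it returns a tag instead of being deleted.
template <typename Traits>
std::string put(BasicStream<char, Traits>&, const Utf8Unit*) {
    return "narrow-stream template";
}

// (3) Non-template overload for one concrete stream type, like the
//     question's commented-out std::ostream overload.
std::string put(Narrow&, const Utf8Unit*) { return "non-template"; }

// Only (1) is viable for a wide stream.
std::string chosen_for_wide() { Wide s; Utf8Unit u; return put(s, &u); }

// (1) and (2) are viable; (2) is more specialized.
std::string chosen_for_narrow_custom() { NarrowCustom s; Utf8Unit u; return put(s, &u); }

// All three are viable; the non-template (3) is preferred.
std::string chosen_for_narrow() { Narrow s; Utf8Unit u; return put(s, &u); }
```

Under these assumptions, the wide stream falls through to the generic template, which mirrors why the wcout line compiles under P1423R1, while a narrow stream reaches the restricted template unless a non-template overload exists for its exact type.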
Also, the reason those functions were deleted is that it is unclear what behavior they should have (or more specifically, the committee isn't ready to make Unicode a real thing). If your goal is to just write the bytes of a UTF-8-encoded string to a byte stream, then you should do that specifically on your end, by explicitly converting the u8 string into a byte (char) pointer, and then printing that:
std::cout << reinterpret_cast<const char*>(u8"utf-8");
Don't try to force the standard library to do something it explicitly does not want to do, especially in this case, when C++23 may come along and provide implementations of these functions.
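If the cast is needed in more than one place, it can be wrapped in a named helper rather than an operator<< overload, which keeps the byte-copying intent explicit at each call site. The sketch below is one possible shape, not a standard facility: write_utf8_bytes and demo_utf8_bytes are hypothetical names, and the helper assumes the caller wants the code units copied verbatim with no transcoding.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of the standard library): copies the raw
// code units of a UTF-8 buffer to a narrow stream without transcoding.
// CharT is deduced as char8_t in C++20 and as char in earlier standards,
// where u8 literals still have type const char[].
template <typename CharT>
std::ostream& write_utf8_bytes(std::ostream& os, const CharT* data, std::size_t size) {
    static_assert(sizeof(CharT) == 1, "expects a one-byte code unit type");
    return os.write(reinterpret_cast<const char*>(data),
                    static_cast<std::streamsize>(size));
}

// Convenience overload for string literals; drops the trailing '\0'.
template <typename CharT, std::size_t N>
std::ostream& write_utf8_bytes(std::ostream& os, const CharT (&literal)[N]) {
    return write_utf8_bytes(os, literal, N - 1);
}

// Small demonstration: writes a u8 literal into a string stream and
// returns the bytes that arrived.
std::string demo_utf8_bytes() {
    std::ostringstream out;
    write_utf8_bytes(out, u8"utf-8");
    return out.str();
}
```

Using os.write with an explicit length, instead of inserting a char pointer, also covers buffers that are not null-terminated, such as the contents of a std::u8string_view.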
Source: https://stackoverflow.com/questions/56613226/outputting-char8-t-const-to-cout-and-wcout-one-compiles-one-not