Here are some excerpts from my copy of the 2014 draft standard N4140
22.5 Standard code conversion facets [locale.stdcvt]
3 For the facet codecvt_utf8:
— The facet shall convert between UTF-8 multibyte sequences and UCS2 or UCS4 (depending on the size of Elem) within the program.
[…]
Let us differentiate between wchar_t and string literals built using the L prefix.
wchar_t is just an integer type, which may be larger than char.
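For example, sizeof(wchar_t) is commonly 4 on Linux with glibc and 2 on Windows; those sizes are typical, not mandated. A quick check:

```cpp
#include <iostream>

int main() {
    // sizeof(char) is 1 by definition; sizeof(wchar_t) is
    // implementation-defined (commonly 4 on Linux/glibc, 2 on Windows).
    std::cout << "sizeof(char)    = " << sizeof(char)    << '\n';
    std::cout << "sizeof(wchar_t) = " << sizeof(wchar_t) << '\n';
}
```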
String literals with the L prefix produce strings of wchar_t characters. Exactly how those characters are encoded is implementation-defined; there is no requirement that such literals use any particular encoding. They might use UTF-16, UTF-32, or something else that has nothing to do with Unicode at all.
So if you want a string literal that is guaranteed to be encoded in a Unicode format across all platforms, use the u8, u, or U prefix.
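A minimal C++14 sketch of the difference (the string contents are arbitrary; note that in C++14 a u8 literal is an array of plain char):

```cpp
// Encodings guaranteed by the standard on every platform:
const char*     s8  = u8"text"; // UTF-8 (array of char in C++14)
const char16_t* s16 = u"text";  // UTF-16
const char32_t* s32 = U"text";  // UTF-32

// No such guarantee here:
const wchar_t*  sw  = L"text";  // implementation-defined wide encoding
```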
One interpretation of these two paragraphs is that wchar_t must be encoded as either UCS2 or UCS4.
No, that is not a valid interpretation. wchar_t has no encoding; it is just a type. Encoding is a property of data, not of types. A string literal prefixed with L may or may not be encoded as UCS2 or UCS4.
If you provide codecvt_utf8 with a string of wchar_t values encoded as UCS2 or UCS4 (as appropriate to sizeof(wchar_t)), then it will work. But not because of wchar_t; it works only because the data you provide is correctly encoded.
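As a sketch of that point: on an implementation where wide literals store ISO 10646 code point values (so the wchar_t data genuinely is UCS2 or UCS4), codecvt_utf8 does the right thing. This uses &lt;codecvt&gt; as specified in C++14 (the header was deprecated later, in C++17):

```cpp
#include <codecvt>
#include <iostream>
#include <locale>
#include <string>

int main() {
    // Assumes the implementation stores L"\u00e9" as the code point
    // U+00E9 in a single wchar_t, i.e. the data really is UCS2/UCS4.
    std::wstring ws = L"caf\u00e9";

    // Locale-independent conversion: UCS2 or UCS4 (depending on
    // sizeof(wchar_t)) to UTF-8.
    std::wstring_convert<std::codecvt_utf8<wchar_t>> conv;
    std::string u8 = conv.to_bytes(ws);

    std::cout << u8.size() << " bytes\n"; // 5: 'c' 'a' 'f' 0xC3 0xA9
}
```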
If 4.1 said "The facet shall convert between UTF-8 multibyte sequences and UCS2 or UCS4 or whatever encoding is imposed on wchar_t by the current global locale", there would be no problem.
The whole point of the codecvt_* facets is to perform locale-independent conversions. If you want locale-dependent conversions, you shouldn't use them; you should instead use the codecvt facet of the global locale.
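By contrast, a locale-dependent conversion goes through the codecvt facet that the locale itself provides. A rough sketch (the helper name narrow and the locale name "en_US.UTF-8" are illustrative assumptions, and error handling is omitted):

```cpp
#include <cwchar>
#include <locale>
#include <string>

// Convert using whatever narrow encoding the given locale imposes,
// via that locale's own codecvt facet.
std::string narrow(const std::wstring& ws, const std::locale& loc) {
    using cvt_t = std::codecvt<wchar_t, char, std::mbstate_t>;
    const cvt_t& cvt = std::use_facet<cvt_t>(loc);

    std::mbstate_t state{};
    std::string out(ws.size() * cvt.max_length(), '\0');
    const wchar_t* from_next = nullptr;
    char* to_next = nullptr;

    cvt.out(state, ws.data(), ws.data() + ws.size(), from_next,
            &out[0], &out[0] + out.size(), to_next);
    out.resize(to_next - &out[0]);
    return out;
}

int main() {
    // Throws if the named locale is not installed on the system.
    std::string s = narrow(L"caf\u00e9", std::locale("en_US.UTF-8"));
}
```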