wcstombs: character encoding?

轻奢々 2021-01-14 14:36

The wcstombs documentation says that it "converts the sequence of wide-character codes to multibyte string", but it never says what a "wide character" is.

Is it implici

4 Answers
  •  天涯浪人
    2021-01-14 14:49

    According to the C standard, wchar_t must be able to represent distinct codes for every member of the largest extended character set among the supported locales. The standard doesn't say what the encoding for wchar_t is. In fact, the only limits it places on WCHAR_MIN and WCHAR_MAX are minimums: the range must cover at least [0, 255] if wchar_t is unsigned, or at least [-127, 127] if it is signed.
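
    To see what a particular implementation actually chose, something like the following sketch could be used. The helper names wchar_is_signed and print_wchar_limits are made up for illustration; the output differs between platforms, so none is shown here.

```c
#include <assert.h>
#include <stdint.h>  /* WCHAR_MIN, WCHAR_MAX, intmax_t, uintmax_t */
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if wchar_t is a signed type here, else 0. */
static int wchar_is_signed(void)
{
    return (wchar_t)-1 < (wchar_t)0;
}

/* Hypothetical helper: checks the minimum range the standard requires,
   then prints the size and the actual range of wchar_t. */
static void print_wchar_limits(void)
{
    printf("sizeof(wchar_t) = %zu\n", sizeof(wchar_t));

    if (wchar_is_signed()) {
        /* Signed: the range must cover at least [-127, 127]. */
        assert((intmax_t)WCHAR_MIN <= -127 && (intmax_t)WCHAR_MAX >= 127);
        printf("signed, range [%jd, %jd]\n",
               (intmax_t)WCHAR_MIN, (intmax_t)WCHAR_MAX);
    } else {
        /* Unsigned: WCHAR_MIN is 0 and WCHAR_MAX must be at least 255. */
        assert((uintmax_t)WCHAR_MIN == 0 && (uintmax_t)WCHAR_MAX >= 255);
        printf("unsigned, range [%ju, %ju]\n",
               (uintmax_t)WCHAR_MIN, (uintmax_t)WCHAR_MAX);
    }
}
```

    Calling print_wchar_limits() from main shows the width and signedness on the machine at hand. Neither tells you the encoding, only how much room there is.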

    A multibyte character can use more than one byte. A multibyte string is made of one or more multibyte characters. Within a multibyte string, the characters need not all occupy the same number of bytes (UTF-8 is an example). An object of type wchar_t, by contrast, has a fixed size (within a given implementation, of course).
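
    The difference can be sketched in code. This is only an illustration: to_multibyte and encoded_length are hypothetical helper names, and the result depends entirely on the current locale. It uses the standard wcsrtombs with a null destination to measure the output first, because the return value of wcstombs alone does not say whether the terminating null byte was written.

```c
#include <assert.h>
#include <limits.h>  /* MB_LEN_MAX */
#include <stdlib.h>  /* wcstombs */
#include <string.h>  /* memset */
#include <wchar.h>   /* wcsrtombs, wcrtomb, mbstate_t */

/* Hypothetical helper: converts ws into buf using the current locale's
   multibyte encoding. Returns the number of bytes written (excluding the
   terminating null), or (size_t)-1 if a character is not representable
   or the result plus terminator does not fit in bufsize bytes. */
static size_t to_multibyte(const wchar_t *ws, char *buf, size_t bufsize)
{
    assert(ws != NULL && buf != NULL);

    /* Measure first: with a null destination, wcsrtombs only counts. */
    const wchar_t *src = ws;
    mbstate_t state;
    memset(&state, 0, sizeof state);  /* initial conversion state */
    size_t need = wcsrtombs(NULL, &src, 0, &state);
    if (need == (size_t)-1 || need >= bufsize)
        return (size_t)-1;

    return wcstombs(buf, ws, bufsize);
}

/* Hypothetical helper: number of bytes one wide character occupies in the
   current locale's encoding, or -1 if it cannot be represented. */
static int encoded_length(wchar_t wc)
{
    char tmp[MB_LEN_MAX];
    mbstate_t state;
    memset(&state, 0, sizeof state);
    size_t n = wcrtomb(tmp, wc, &state);
    return n == (size_t)-1 ? -1 : (int)n;
}
```

    In the default "C" locale, encoded_length(L'a') is 1. Assuming a UTF-8 locale has been selected with setlocale and that wchar_t holds Unicode values on the platform, L'\u00e9' would take 2 bytes and L'\u20ac' would take 3, while each of them still occupies exactly one wchar_t.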

    As an aside, I can also find the following in my copy of the C99 draft:

    __STDC_ISO_10646__ An integer constant of the form yyyymmL (for example, 199712L). If this symbol is defined, then every character in the Unicode required set, when stored in an object of type wchar_t, has the same value as the short identifier of that character. The Unicode required set consists of all the characters that are defined by ISO/IEC 10646, along with all amendments and technical corrigenda, as of the specified year and month.

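    So, if I understand correctly, when __STDC_ISO_10646__ is defined, the value stored in a wchar_t is the character's Unicode (ISO/IEC 10646) code point, at least for every character in the Unicode required set.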
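
    A quick way to check this on a given compiler is to test the macro directly. The sketch below is an illustration only; iso10646_version and report_wchar_encoding are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: returns the yyyymm value of __STDC_ISO_10646__,
   or 0 if the implementation does not define the macro. */
static long iso10646_version(void)
{
#ifdef __STDC_ISO_10646__
    return __STDC_ISO_10646__;
#else
    return 0L;
#endif
}

/* Hypothetical helper: reports what can be concluded about wchar_t. */
static void report_wchar_encoding(void)
{
    long version = iso10646_version();
    if (version != 0) {
        /* With the macro defined, a wide character holds the short
           identifier of the character, so U+0041 must be stored as 0x41. */
        assert(L'A' == 0x41);
        printf("wchar_t values follow ISO/IEC 10646 as of %ld\n", version);
    } else {
        printf("__STDC_ISO_10646__ is not defined; "
               "the standard does not say how wchar_t is encoded here\n");
    }
}
```

    If the macro is missing, wchar_t may still happen to hold Unicode values, but the standard makes no promise about it.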
