wchar_t is defined in wchar.h.
Currently, if developers want to use only wchar_t, they cannot do this without getting
Note that wint_t was introduced because wchar_t might be a type subject to 'default promotion' rules when passed to printf() et al. This matters, for example, when calling printf():
wchar_t wc = …;
printf("%lc", wc);
The value of wc might be converted to wint_t. If you're writing a function like printf() which needs to use the va_arg() macro from <stdarg.h>, then you should use the type wint_t to get the value.
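As a minimal sketch of that advice, the variadic function below fetches each wide-character argument with va_arg() using wint_t rather than wchar_t. The function name count_matches and its parameters are hypothetical, chosen only for illustration.

```c
#include <stdarg.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: counts how many of the `count` wide-character
 * arguments that follow are equal to `target`. */
static size_t count_matches(wchar_t target, size_t count, ...)
{
    va_list ap;
    size_t hits = 0;

    va_start(ap, count);
    for (size_t i = 0; i < count; i++) {
        /* Callers pass wchar_t values, but variadic arguments undergo
         * the default argument promotions, so fetch them as wint_t. */
        wint_t wi = va_arg(ap, wint_t);
        if ((wchar_t)wi == target)
            hits++;
    }
    va_end(ap);
    return hits;
}
```

A call such as count_matches(L'a', 3, L'a', L'b', L'a') passes three wide-character constants through the ellipsis and should count two matches.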
The standard notes that wint_t might be the same type as wchar_t, but if wchar_t is a (16-bit) short (or unsigned short), wint_t might be (32-bit) int. To a first approximation, wint_t only matters when wchar_t is a 16-bit type. The full rules are, of course, more complex. For example, int could be a 16-bit type, but this is rarely a problem.
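One way to check which case applies on a given platform is to ask the compiler whether the integer promotions change the type. The sketch below assumes a C11 compiler (for _Generic); the macro and function names are hypothetical. Unary plus applies the integer promotions, which for integer types coincide with the default argument promotions.

```c
#include <wchar.h>

/* Evaluates to 1 if integer promotion changes type T, otherwise 0.
 * Hypothetical macro; requires C11 _Generic. */
#define CHANGED_BY_PROMOTION(T) _Generic(+(T)0, T: 0, default: 1)

/* 1 when wchar_t is narrower than int (e.g. a 16-bit short), else 0. */
static int wchar_changed_by_promotion(void)
{
    return CHANGED_BY_PROMOTION(wchar_t);
}

/* The standard requires wint_t to be unchanged, so this should be 0. */
static int wint_changed_by_promotion(void)
{
    return CHANGED_BY_PROMOTION(wint_t);
}
```

On a platform where wchar_t is a 32-bit int, both functions should return 0, which is the situation where wint_t adds little beyond providing room for WEOF.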
7.29 Extended multibyte and wide character utilities <wchar.h>
7.29.1 Introduction
¶1 The header <wchar.h> defines four macros, and declares four data types, one tag, and many functions.326)
¶2 The types declared are wchar_t and size_t (both described in 7.19); mbstate_t which is a complete object type other than an array type that can hold the conversion state information necessary to convert between sequences of multibyte characters and wide characters; wint_t which is an integer type unchanged by default argument promotions that can hold any value corresponding to members of the extended character set, as well as at least one value that does not correspond to any member of the extended character set (see WEOF below);327) …
326) See "future library directions" (7.31.16).
327) wchar_t and wint_t can be the same integer type.
§7.19 Common definitions <stddef.h>
¶2 … and wchar_t which is an integer type whose range of values can represent distinct codes for all members of the largest extended character set specified among the supported locales; the null character shall have the code value zero. Each member of the basic character set shall have a code value equal to its value when used as the lone character in an integer character constant if an implementation does not define __STDC_MB_MIGHT_NEQ_WC__.
See "Why the argument type of putchar(), fputc(), and putc() is not char" for one place where the 'default promotion' rules from the C standard are quoted. There are probably other questions where the information is available too.
If we need to avoid type conversion warnings when the -Wconversion compiler option is used, we need to change wint_t to wchar_t in the prototypes of all library functions, and put '#define WEOF (-1)' at the beginning of wchar.h and wctype.h.
For wchar.h the command is:
sudo perl -i -pe 'print qq(#define WEOF (-1)\n) if $.==1; next unless /Copy SRC to DEST\./..eof; s/\bwint_t\b/wchar_t/g' /usr/include/wchar.h
For wctype.h the command is:
sudo perl -i -pe 'print qq(#define WEOF (-1)\n) if $.==1; next unless /Wide-character classification functions/..eof; s/\bwint_t\b/wchar_t/g' /usr/include/wctype.h
Similarly, if you use other header files which use wint_t, simply change wint_t to wchar_t in the prototypes in those header files.
Explanation follows.
Some Unix systems define wchar_t as a 16-bit type and thereby follow Unicode very strictly. This definition is perfectly fine with the standard, but it also means that to represent all characters from Unicode and ISO 10646 one has to use UTF-16 surrogate characters, which is in fact a multi-wide-character encoding. But resorting to multi-wide-character encoding contradicts the purpose of the wchar_t type.
Now, the only encoding to survive for data exchange is UTF-8, and the maximum number of data bits that it can hold, in its original six-byte form, is 31 (RFC 3629 has since limited UTF-8 to four bytes, i.e. 21 data bits):
1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx
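The bit count can be checked by adding up the x positions: a lead byte of an n-byte sequence (n ≥ 2) carries 7 − n data bits and each continuation byte carries 6. The function below is a small sketch of that arithmetic; its name is hypothetical, and it covers the original six-byte scheme shown above, not only the four-byte limit of RFC 3629.

```c
/* Number of data bits carried by a UTF-8 sequence of `nbytes` bytes
 * under the original 1..6 byte scheme; returns -1 for invalid lengths.
 * Hypothetical helper for illustration only. */
static int utf8_payload_bits(int nbytes)
{
    if (nbytes < 1 || nbytes > 6)
        return -1;
    if (nbytes == 1)
        return 7;                      /* 0xxxxxxx */
    /* Lead byte: n one-bits, a zero bit, then (7 - n) data bits.
     * Each of the (n - 1) continuation bytes is 10xxxxxx: 6 data bits. */
    return (7 - nbytes) + 6 * (nbytes - 1);
}
```

For six bytes this gives 1 + 5 × 6 = 31 bits, and for four bytes 3 + 3 × 6 = 21 bits, so either way the value fits in a 32-bit type.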
So, you see that in practice it is not necessary to have wint_t as a separate type (because 4-byte, i.e. 32-bit, data types are used to store Unicode code points anyway). Maybe it has some applications for "backward compatibility" or something, but in new code it is pointless. Once again, this is because it defeats the purpose of having wide characters at all (and there is no point in using wide characters nowadays if they cannot handle UTF-8).
Notice that, de facto, wint_t is not used anyway. For example, see the example in man mbstowcs. There, a variable of type wchar_t is passed to iswlower() and other functions from wctype.h, which take wint_t.
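The pattern looks roughly like the sketch below, which is not the man page's code but a minimal illustration of the same idea: a wchar_t value is handed straight to iswlower(), which is declared to take wint_t, so the compiler converts it implicitly. The function name count_lower is hypothetical.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: counts lowercase letters in a wide string. */
static size_t count_lower(const wchar_t *s)
{
    size_t n = 0;

    for (; *s != L'\0'; s++) {
        /* *s has type wchar_t; iswlower() takes wint_t, so an implicit
         * conversion happens here. This is the kind of call that
         * -Wconversion may flag when the two types differ. */
        if (iswlower(*s))
            n++;
    }
    return n;
}
```

In the default "C" locale, count_lower(L"aBc") should return 2.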