WinAPI Unicode and ANSI functions

Submitted by 不羁的心 on 2019-11-28 01:34:21

The simplest rule to follow is this: only use the ANSI variants on systems that do not have the Unicode variants. That means Windows 95, 98 and ME, which are the versions of Windows that implement Win32 but do not support Unicode.

These days it is exceptionally unlikely that you will be targeting such versions, so in all probability you should always use the Unicode variants.

dxiv

Just as (rare) exceptions to the posted comments/answers...

One may choose to use the ANSI calls in cases where UTF-8 is expected and supported. For example, calling WriteConsoleA with UTF-8 strings in a console that is set to use a TrueType font and is running under chcp 65001.

Another oddball exception is functions that are primarily implemented as ANSI, where the Unicode "W" variant simply converts to a narrow string in the active codepage and calls the "A" counterpart. For such a function, when a narrow string is already available, calling the "A" variant directly avoids a redundant double conversion. A case in point is OutputDebugString, which fell into this category until Windows 10. (I just noticed https://msdn.microsoft.com/en-us/library/windows/desktop/aa363362.aspx, which mentions that a call to WaitForDebugEventEx, only available since Windows 10, enables true Unicode output for OutputDebugStringW.)
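To make the "W converts and calls A" pattern concrete, here is a minimal portable sketch. DemoEmitA and DemoEmitW are hypothetical names, not real WinAPI functions, and the standard wcstombs stands in for whatever codepage conversion the real wrapper performs. It is only meant to show where the extra conversion comes from.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical "native" narrow implementation.
   strlen stands in for the real work (e.g. emitting the text). */
static int DemoEmitA(const char *text) {
    return (int)strlen(text);
}

/* Hypothetical wide wrapper: converts to a narrow string in the
   current locale, then forwards to the A variant. */
static int DemoEmitW(const wchar_t *text) {
    char narrow[256];
    size_t n = wcstombs(narrow, text, sizeof narrow - 1);
    if (n == (size_t)-1) {
        return -1;              /* character not representable */
    }
    narrow[n] = '\0';           /* wcstombs may not terminate a full buffer */
    return DemoEmitA(narrow);
}
```

A caller that starts from a narrow string and goes through DemoEmitW would first have to widen it, only for the wrapper to narrow it again. Calling DemoEmitA directly skips both steps.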

Then there are APIs which, even though they deal with strings, are natively ANSI. For example, GetProcAddress only exists in the ANSI variant, which takes an LPCSTR argument, since names in the export tables are narrow strings.
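A short sketch of what that looks like in practice: the module name can be passed as a wide string, but the export name is always narrow. The helper name find_get_tick_count and the choice of GetTickCount as the export to look up are just for illustration. The _WIN32 guard is only there so the snippet does not break a build on other platforms.

```c
#ifdef _WIN32
#include <windows.h>

typedef DWORD (WINAPI *TickCountFn)(void);

/* Illustrative helper: returns the address of kernel32's GetTickCount,
   or NULL if the lookup fails. */
static TickCountFn find_get_tick_count(void) {
    /* The module name goes through the W variant as usual. */
    HMODULE kernel32 = GetModuleHandleW(L"kernel32.dll");
    if (kernel32 == NULL) {
        return NULL;
    }
    /* The export name is a narrow string: there is no GetProcAddressW. */
    return (TickCountFn)GetProcAddress(kernel32, "GetTickCount");
}
#endif
```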

That said, by and large most string-related APIs are natively Unicode and one is encouraged to use the "W" variants. Not all the newer APIs even have an "A" variant any longer (e.g. CommandLineToArgvW). From the horse's mouth, https://msdn.microsoft.com/en-us/library/windows/desktop/ff381407.aspx:

Windows natively supports Unicode strings for UI elements, file names, and so forth. Unicode is the preferred character encoding, because it supports all character sets and languages. Windows represents Unicode characters using UTF-16 encoding, in which each character is encoded as a 16-bit value. UTF-16 characters are called wide characters, to distinguish them from 8-bit ANSI characters.

[...] When Microsoft introduced Unicode support to Windows, it eased the transition by providing two parallel sets of APIs, one for ANSI strings and the other for Unicode strings.

[...] Internally, the ANSI version translates the string to Unicode. The Windows headers also define a macro that resolves to the Unicode version when the preprocessor symbol UNICODE is defined or the ANSI version otherwise.

[...] Most newer APIs in Windows have just a Unicode version, with no corresponding ANSI version.
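The macro selection described in the quote can be sketched roughly as follows. DemoLength, DEMO_TEXT and DEMO_TCHAR are made-up names that loosely mirror the pattern the Windows headers use. The real headers are more involved, so treat this as an illustration of the mechanism rather than their actual contents.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical A/W pair, named in the WinAPI style. */
static size_t DemoLengthA(const char *s)    { return strlen(s); }
static size_t DemoLengthW(const wchar_t *s) { return wcslen(s); }

/* The generic name resolves to one of the two at preprocessing time. */
#ifdef UNICODE
#  define DemoLength   DemoLengthW
#  define DEMO_TEXT(s) L##s
typedef wchar_t DEMO_TCHAR;
#else
#  define DemoLength   DemoLengthA
#  define DEMO_TEXT(s) s
typedef char DEMO_TCHAR;
#endif

/* Code written against the generic names compiles either way. */
static size_t demo_length_of_hello(void) {
    const DEMO_TCHAR *text = DEMO_TEXT("hello");
    return DemoLength(text);
}
```

Calling the "W" function by its explicit name sidesteps the macro entirely, so the behavior no longer depends on whether UNICODE happens to be defined for a given translation unit.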

[ NOTE ]  The post was edited to add the last two paragraphs.
