How can I use wstring(s) in Linux APIs?

淺唱寂寞╮ 提交于 2019-12-10 15:49:19

问题


I want to develope an application in Linux. I want to use wstring beacuse my application should supports unicode and I don't want to use UTF-8 strings.

In Windows OS, using wstring is easy. beacuse any ANSI API has a unicode form. for example there are two CreateProcess API, first API is CreateProcessA and second API is CreateProcessW.

wstring app = L"C:\\test.exe";
CreateProcess
(
  app.c_str(), // EASY!
  ....
);

But it seems working with wstring in Linux is complicated! for example there is an API in Linux called parport_open (It just an example).

and I don't know how to send my wstring to this API (or APIs like parport_open that accept a string parameter).

wstring name = L"myname";
parport_open
(
  0, // or a valid number. It is not important in this question.
  name.c_str(), // Error: because type of this parameter is char* not wchat_t*
  ....
);

My question is how can I use wstring(s) in Linux APIs?

Note: I don't want to use UTF-8 strings.

Thanks


回答1:


Linux APIs (on recent kernels and with correct locale setting) on almost every distribution use UTF-8 strings by default1. You too should use them inside your code. Resistance is futile.

The wchar_t (and thus wstring) on Windows were convenient only when Unicode was limited to 65536 characters (i.e. wchar_t were used for UCS-2), now that the 16-bit Windows wchar_t are used for UTF-16 the advantage of 1 wchar_t=1 Unicode character is long gone, so you have the same disadvantages of using UTF-8. Nowadays IMHO the Linux approach is the most correct. (Another answer of mine on UTF-16 and why Windows and Java use it)

By the way, both string and wstring aren't encoding-aware, so you can't reliably use any of these two to manipulate Unicode code points. I heard that wxString from the wxWidgets toolkit handles UTF-8 nicely, but I never did extensive research about it.


  1. actually, as pointed out below, the kernel aims to be encoding-agnostic, i.e. it treats the strings as opaque sequences of (NUL-terminated?) bytes (and that's why encodings that use "larger" character types like UTF-16 cannot be used). On the other hand, wherever actual string manipulation is done, the current locale setting is used, and by default on almost any modern Linux distribution it is set to UTF-8 (which is a reasonable default to me).



回答2:


I don't want to use UTF-8 strings.

Well, you will need to overcome that reluctance, at least when calling the APIs. Linux uses single byte string encodings, invariably UTF-8. Clearly you should use a single byte string type since you obviously can't pass wide characters to a function that expects char*. Use string rather than wstring.



来源:https://stackoverflow.com/questions/7299760/how-can-i-use-wstrings-in-linux-apis

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!