Cross-platform iteration of Unicode string (counting Graphemes using ICU)

南旧 2020-12-02 19:20

I want to iterate over each character of a Unicode string, treating each surrogate pair and each combining character sequence as a single unit (one grapheme).

3 answers
  •  孤街浪徒
    2020-12-02 20:01

    Glib's ustring class (Glib::ustring, from glibmm) gives you UTF-8 strings, if using UTF-8 is OK for you. It is designed to be similar to std::string. Since UTF-8 is the native encoding on Linux, your task is quite easy:

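    #include <glibmm/ustring.h>
    #include <iostream>

    int main()
    {
        // Narrow UTF-8 literal (source saved as UTF-8); ustring has no wchar_t constructor.
        Glib::ustring s = "नमस्ते";
        // size() counts Unicode code points, not bytes and not graphemes.
        std::cout << s.size() << '\n';
    }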
    

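    You can also iterate over the string's characters as usual with Glib::ustring::iterator. Each step yields one Unicode code point (a gunichar), so a combining character sequence still spans several iterations.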
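
    To make concrete what code-point iteration over a UTF-8 string yields, here is a minimal sketch in plain standard C++, without glibmm. It is not Glib's implementation: decode_utf8 and kNamaste are hypothetical names introduced for illustration, and the decoder assumes well-formed input and performs no validation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// "नमस्ते" written as explicit UTF-8 bytes, so the result does not depend
// on how this source file is encoded. (Hypothetical name, for illustration.)
const std::string kNamaste =
    "\xE0\xA4\xA8\xE0\xA4\xAE\xE0\xA4\xB8\xE0\xA5\x8D\xE0\xA4\xA4\xE0\xA5\x87";

// Decode a UTF-8 byte string into Unicode code points.
// Assumption: the input is well-formed UTF-8; nothing is validated here.
std::vector<char32_t> decode_utf8(const std::string& bytes)
{
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const unsigned char lead = static_cast<unsigned char>(bytes[i]);
        std::size_t len = 1;   // sequence length implied by the lead byte
        char32_t cp = lead;    // ASCII: the byte is the code point
        if (lead >= 0xF0)      { len = 4; cp = lead & 0x07; }
        else if (lead >= 0xE0) { len = 3; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { len = 2; cp = lead & 0x1F; }
        // Each continuation byte contributes its low 6 bits.
        for (std::size_t k = 1; k < len && i + k < bytes.size(); ++k) {
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i + k]) & 0x3F);
        }
        out.push_back(cp);
        i += len;
    }
    return out;
}
```

    For kNamaste, kNamaste.size() is 18 (bytes) while decode_utf8(kNamaste).size() is 6 (code points). A character outside the BMP, which needs a surrogate pair in UTF-16, comes out as a single code point here. The virama and the vowel sign are still separate entries, though, so the number of graphemes is smaller than 6; finding those boundaries needs a grapheme-aware segmenter such as ICU's character BreakIterator.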
