Unicode string indexing in C++

刺人心 2020-12-30 15:05

I come from Python, where you can use 'string[10]' to access a character in a sequence. If the string is encoded in Unicode, it gives me the expected result. However when

5 Answers

清酒与你 (OP)

2020-12-30 15:50

In my opinion, the best solution is to do every string task with iterators. I can't imagine a scenario where one really has to index strings: if you need indexing like ramp[5] in your example, the 5 is usually computed in another part of the code, and computing it usually means scanning all the preceding characters anyway. That's why the Standard Library uses iterators in its API.
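
To make the "you scan the preceding characters anyway" point concrete, here is a minimal sketch of iterator-based access using only the standard library. The helper names next_code_point and code_point_at are hypothetical (they come from neither the question nor any library), and the decoder assumes well-formed UTF-8 with no validation, so treat it as an illustration rather than production code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decode one code point starting at `it` and advance
// `it` past it. Assumes [it, end) holds well-formed UTF-8; no validation.
char32_t next_code_point(std::string::const_iterator& it,
                         std::string::const_iterator end) {
    assert(it != end);  // caller must not read past the end
    const unsigned char lead = static_cast<unsigned char>(*it++);
    int extra = 0;        // number of continuation bytes still to read
    char32_t cp = lead;   // 1-byte (ASCII) case
    if (lead >= 0xF0)      { extra = 3; cp = lead & 0x07; }
    else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
    else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }
    for (; extra > 0 && it != end; --extra) {
        // Each continuation byte contributes its low 6 bits.
        cp = (cp << 6) | (static_cast<unsigned char>(*it++) & 0x3F);
    }
    return cp;
}

// Hypothetical helper: code point at position `index`, counted in code
// points. Every preceding character is decoded first, so the cost is
// linear in `index`, unlike byte indexing with operator[].
char32_t code_point_at(const std::string& utf8, std::size_t index) {
    auto it = utf8.begin();
    for (std::size_t i = 0; i < index; ++i) {
        next_code_point(it, utf8.end());
    }
    return next_code_point(it, utf8.end());
}
```

For a std::string s holding "h\xC3\xA9llo" (the UTF-8 bytes of "héllo"), s[1] is the single byte 0xC3, while code_point_at(s, 1) yields U+00E9. An out-of-range index trips the assert instead of returning a value.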

A similar problem comes up if you want to get the size of a string. Should it be the character (or code point) count, or merely the number of bytes? Usually you need the size to allocate a buffer, so the byte count is what you want. Only very rarely do you need the Unicode character count.
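
The difference between the two sizes can be sketched as follows. The function code_point_count is a hypothetical name, and it assumes well-formed UTF-8; it counts code points, which is still not the same as user-perceived characters when combining marks are involved.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: count code points in a UTF-8 string by counting
// every byte that is not a continuation byte (bit pattern 10xxxxxx).
// Assumes well-formed UTF-8; no validation is performed.
std::size_t code_point_count(const std::string& utf8) {
    std::size_t count = 0;
    for (unsigned char byte : utf8) {
        if ((byte & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For a std::string s holding "h\xC3\xA9llo", s.size() is 6 (the number to use when allocating a buffer), while code_point_count(s) is 5.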

If you want to process UTF-8 encoded strings using iterators, then I would definitely recommend UTF8-CPP.
