Combining Devanagari characters

礼貌的吻别 2020-12-05 02:40

I have something like

a = "बिक्रम मेरो नाम हो"

I want to achieve something like

a[0] = बि
a[1] = क्र
a[2] = म

6 Answers

悲&欢浪女 (OP)

2020-12-05 03:26
There's a pure-Python library called uniseg that provides a number of utilities, including a grapheme cluster iterator that gets close to the behaviour you described (note that in the output below the conjunct क्र comes out as two clusters, क् and र):
```
>>> a = u"बिक्रम मेरो नाम हो"
>>> from uniseg.graphemecluster import grapheme_clusters
>>> for i in grapheme_clusters(a): print(i)
... 
बि
क्
र
म

मे
रो

ना
म

हो
```
It claims to implement the full Unicode text segmentation algorithm described in http://www.unicode.org/reports/tr29/tr29-21.html.
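If you specifically need the conjunct kept together as in the question, a rough alternative is to group characters yourself with the standard-library `unicodedata` module. The sketch below is not the UAX #29 algorithm and is not how uniseg works; it is a simplified heuristic that assumes each cluster is a base letter plus any following combining marks, and that a letter following a virama (U+094D) belongs to the same cluster. The function name `devanagari_clusters` is made up for illustration, and ZWJ/ZWNJ handling is left out.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (halant)


def devanagari_clusters(text):
    """Split text into rough syllable-like clusters (heuristic, not UAX #29)."""
    clusters = []
    for ch in text:
        category = unicodedata.category(ch)
        is_mark = category.startswith("M")    # vowel signs, virama, etc.
        is_letter = category.startswith("L")  # consonants and independent vowels
        # Assumption: a letter right after a virama continues the conjunct.
        joins_conjunct = (
            bool(clusters) and clusters[-1].endswith(VIRAMA) and is_letter
        )
        if clusters and (is_mark or joins_conjunct):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


a = "बिक्रम मेरो नाम हो"
for cluster in devanagari_clusters(a):
    print(cluster)
```

With the sample string, the first three clusters should be the ones asked for in the question. Spaces come out as their own clusters, so filter them if you only want the syllables.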