Combining Devanagari characters

礼貌的吻别 2020-12-05 02:40

I have something like

a = "बिक्रम मेरो नाम हो"

I want to achieve something like

a[0] = बि
a[1] = क्र
a[2] = म
         


        
6 answers
  •  悲&欢浪女
    2020-12-05 03:26

    There's a pure-Python library called uniseg that provides a number of utilities, including a grapheme cluster iterator that gives behaviour close to what you described. Note that in the output below the conjunct क्र comes out as two clusters, क् and र:

    >>> a = u"बिक्रम मेरो नाम हो"
    >>> from uniseg.graphemecluster import grapheme_clusters
    >>> for i in grapheme_clusters(a): print(i)
    ... 
    बि
    क्
    र
    म
    
    मे
    रो
    
    ना
    म
    
    हो
    

    The library claims to implement the full Unicode text segmentation algorithm (UAX #29) described in http://www.unicode.org/reports/tr29/tr29-21.html.
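
If the goal is to keep a conjunct such as क्र in one piece, as in the question, a rough standard-library sketch might look like the following. This is not the UAX #29 algorithm and not how uniseg works: the function name `devanagari_clusters` and the joining rule (attach every combining mark to the previous cluster, and attach a letter that directly follows a virama) are assumptions made for illustration. It ignores ZWJ/ZWNJ and has not been checked against other scripts.

```python
import unicodedata

VIRAMA = "\u094d"  # DEVANAGARI SIGN VIRAMA (halant)


def devanagari_clusters(text):
    """Split text into approximate Devanagari syllable clusters.

    Hypothetical helper, not part of uniseg or the standard library.
    """
    clusters = []
    for ch in text:
        category = unicodedata.category(ch)
        is_mark = category in ("Mn", "Mc")  # vowel signs, virama, nukta, ...
        joins_conjunct = (
            bool(clusters)
            and clusters[-1].endswith(VIRAMA)
            and category == "Lo"  # a consonant completing a conjunct
        )
        if clusters and (is_mark or joins_conjunct):
            clusters[-1] += ch  # extend the current cluster
        else:
            clusters.append(ch)  # start a new cluster
    return clusters


a = "\u092c\u093f\u0915\u094d\u0930\u092e \u092e\u0947\u0930\u094b \u0928\u093e\u092e \u0939\u094b"
parts = devanagari_clusters(a)
print(parts[0], parts[1], parts[2])
```

With the string from the question, `parts[0]`, `parts[1]` and `parts[2]` are बि, क्र and म. Spaces come back as clusters of their own, so filter them out if only syllables are wanted.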
