PHP has multibyte string functions for handling multibyte strings (e.g. CJK scripts). For example, I want to count how many letters are in a multibyte string by using
Use Unicode strings:
    # Encoding: UTF-8
    japanese = u"桜の花びらたち"
    print japanese
    print len(japanese)
Note the u in front of the string.
To convert a bytestring into Unicode, use decode: "桜の花びらたち".decode('utf-8')
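As a rough sketch of the same idea, assuming Python 3 (where `str` is already Unicode and the `u` prefix is optional), the difference between counting characters and counting bytes looks like this. The variable names are only for illustration:

```python
# -*- coding: utf-8 -*-
# Assumes Python 3: str holds Unicode code points, bytes holds encoded data.
text = "桜の花びらたち"
raw = text.encode("utf-8")          # the UTF-8 bytestring form of the same text

char_count = len(text)              # number of Unicode code points
byte_count = len(raw)               # number of bytes in the UTF-8 encoding
round_trip = raw.decode("utf-8")    # decode turns the bytes back into str

print(char_count)                   # 7
print(byte_count)                   # 21
print(round_trip == text)           # True
```

Each of these characters takes three bytes in UTF-8, which is why `len` on the bytestring gives 21 rather than 7.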