Such as:
str = \'sdf344asfasf天地方益3権sdfsdf\'
Add ()
to Chinese and Japanese Characters:
strAfterConvert = \'sdfasf
You can do the edit using the regex package, which supports checking the Unicode "Script" property of each character and is a drop-in replacement for the re
package:
import regex as re
pattern = re.compile(r'([\p{IsHan}\p{IsBopo}\p{IsHira}\p{IsKatakana}]+)', re.UNICODE)
input = u'sdf344asfasf天地方益3権sdfsdf'
output = pattern.sub(r'(\1)', input)
print output # Prints: sdf344asfasf(天地方益)3(権)sdfsdf
You should adjust the \p{Is...}
sequences with the character scripts/blocks that you consider to be "Chinese or Japanese".