Python: How can I replace full-width characters with half-width characters?

清酒与你 2020-12-16 00:51

If this were PHP, I would probably do something like this:

function no_more_half_widths($string){
  $foo = array('1','2','3','4','5','6','7','8


        
6 Answers
  •  青春惊慌失措
    2020-12-16 01:07

    In Python 3, you can use the following snippet. It builds a map from every fullwidth character to its corresponding ASCII character. Best of all, you don't have to type out the ASCII sequence by hand, which is quite error-prone.

     #! /usr/bin/env python3
     # -*- coding: utf-8 -*-     
    
     FULL2HALF = dict((i + 0xFEE0, i) for i in range(0x21, 0x7F))
     FULL2HALF[0x3000] = 0x20
    
     def halfen(s):
         '''
         Convert full-width characters to ASCII counterpart
         '''
         return str(s).translate(FULL2HALF)
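
    A quick usage check for the snippet above. This is only a minimal sketch: the table and `halfen` are repeated so the block runs on its own, and the sample strings are made up for illustration.

```python
# Same table as above: full-width code point -> ASCII code point.
FULL2HALF = {i + 0xFEE0: i for i in range(0x21, 0x7F)}
FULL2HALF[0x3000] = 0x20  # ideographic space -> ordinary space


def halfen(s):
    """Convert full-width characters to their ASCII counterparts."""
    return str(s).translate(FULL2HALF)


sample = "Ｈｅｌｌｏ，　Ｗｏｒｌｄ！　１２３"  # illustrative input
print(halfen(sample))          # Hello, World! 123

# Characters that are not in the table pass through unchanged.
print(halfen("日本語ＡＢＣ"))  # 日本語ABC
```

    `str.translate` looks up each character's code point in the dict and leaves unmapped characters alone, which is why the CJK text survives untouched.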
    

    With the same logic, you can also convert halfwidth characters to fullwidth characters. The following code shows the trick:

     #! /usr/bin/env python3
     # -*- coding: utf-8 -*-
    
     HALF2FULL = dict((i, i + 0xFEE0) for i in range(0x21, 0x7F))
     HALF2FULL[0x20] = 0x3000
    
     def fullen(s):
         '''
         Convert all ASCII characters to the full-width counterpart.
         '''
         return str(s).translate(HALF2FULL)
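
    Since the two tables are inverses of each other, converting printable ASCII to full-width and back should return the original text. The sketch below checks that; deriving `FULL2HALF` by inverting `HALF2FULL` is an illustrative choice, not part of the original snippets.

```python
HALF2FULL = {i: i + 0xFEE0 for i in range(0x21, 0x7F)}
HALF2FULL[0x20] = 0x3000  # ordinary space -> ideographic space

# Invert the table instead of writing the second one by hand.
FULL2HALF = {full: half for half, full in HALF2FULL.items()}


def fullen(s):
    """Convert ASCII characters to their full-width counterparts."""
    return str(s).translate(HALF2FULL)


def halfen(s):
    """Convert full-width characters to their ASCII counterparts."""
    return str(s).translate(FULL2HALF)


# Every printable ASCII character, from space (0x20) to tilde (0x7E).
ascii_text = "".join(chr(i) for i in range(0x20, 0x7F))

print(fullen("abc 123"))                         # ａｂｃ　１２３
print(halfen(fullen(ascii_text)) == ascii_text)  # True
```

    The reverse order is not guaranteed to round-trip: text that mixes ASCII and full-width forms comes back entirely full-width after `fullen(halfen(...))`.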
    

    Note: these two snippets only handle ASCII characters and do not convert any Japanese/Korean fullwidth characters.
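
    If the Japanese forms matter too, one possible alternative (a different technique from the translate tables above) is Unicode NFKC normalization from the standard `unicodedata` module. The function name `normalize_width` below is hypothetical, and this is a sketch rather than a drop-in replacement, because NFKC also rewrites other compatibility characters.

```python
import unicodedata


def normalize_width(s):
    """Fold width variants (and other compatibility forms) via NFKC."""
    return unicodedata.normalize("NFKC", s)


print(normalize_width("ＡＢＣ１２３"))  # ABC123
print(normalize_width("ﾃｽﾄ"))           # テスト (halfwidth katakana widened)

# Caveat: NFKC is broader than a width conversion.
print(normalize_width("①"))            # 1
```

    Note the direction: fullwidth ASCII becomes ordinary ASCII, while halfwidth katakana becomes regular katakana. When only the ASCII range should change, the explicit table keeps the conversion narrowly scoped.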

    For completeness, from Wikipedia:

    Range U+FF01–FF5E reproduces the characters of ASCII 21 to 7E as fullwidth forms, that is, a fixed width form used in CJK computing. This is useful for typesetting Latin characters in a CJK environment. U+FF00 does not correspond to a fullwidth ASCII 20 (space character), since that role is already fulfilled by U+3000 "ideographic space."

    Range U+FF65–FFDC encodes halfwidth forms of Katakana and Hangul characters.

    Range U+FFE0–FFEE includes fullwidth and halfwidth symbols.
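
    The ranges quoted above can be inspected from Python through the East Asian Width property exposed by `unicodedata.east_asian_width`. The sample characters below are picked purely for illustration.

```python
import unicodedata

# East Asian Width values: "F" = fullwidth, "H" = halfwidth, "Na" = narrow.
samples = {
    "A": "ASCII letter",
    "\uff21": "fullwidth A (U+FF21)",
    "\u3000": "ideographic space (U+3000)",
    "\uff71": "halfwidth katakana A (U+FF71)",
    "\uffe5": "fullwidth yen sign (U+FFE5)",
}
for ch, label in samples.items():
    print(f"{label}: {unicodedata.east_asian_width(ch)}")
# ASCII letter: Na
# fullwidth A (U+FF21): F
# ideographic space (U+3000): F
# halfwidth katakana A (U+FF71): H
# fullwidth yen sign (U+FFE5): F

# Every code point in U+FF01..U+FF5E is classified as fullwidth.
fullwidth_ascii = all(
    unicodedata.east_asian_width(chr(cp)) == "F"
    for cp in range(0xFF01, 0xFF5F)
)
print(fullwidth_ascii)  # True
```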

    For a Python 2 solution, see gist/jcayzac.
