Regular expression that finds and replaces non-ASCII characters with Python

無奈伤痛 2020-12-03 19:40

I need to change some characters that are not ASCII to '_'. For example:

Tannh‰user -> Tannh_user
If I use a regular expression with Python, how can I do this?
7 answers
  •  刺人心 (original poster)
    2020-12-03 20:13

    Using Python's support for character encodings:

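    # Python 3 version
    import codecs

    def underscorereplace_errors(exc):
        # Handle encoding errors only; re-raise anything else.
        if not isinstance(exc, UnicodeEncodeError):
            raise exc
        # exc.start:exc.end can span a run of several unencodable characters,
        # so emit one '_' per character, then resume encoding at exc.end.
        return ('_' * (exc.end - exc.start), exc.end)

    codecs.register_error('underscorereplace', underscorereplace_errors)

    # encode() returns bytes in Python 3, so decode back to str for printing.
    print('Tannh‰user'.encode('ascii', 'underscorereplace').decode('ascii'))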
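
Since the question asks about regular expressions specifically, here is a minimal sketch of a regex-based equivalent for comparison. It assumes Python 3, and the names `NON_ASCII` and `replace_non_ascii` are made up for illustration; they do not come from the thread.

```python
import re

# Matches any single character outside the 7-bit ASCII range (U+0000 to U+007F).
NON_ASCII = re.compile(r'[^\x00-\x7F]')

def replace_non_ascii(text, replacement='_'):
    """Hypothetical helper: replace every non-ASCII character in text."""
    return NON_ASCII.sub(replacement, text)

print(replace_non_ascii('Tannh‰user'))  # Tannh_user
```

This version works on str directly and needs no encode/decode round trip. The error-handler approach may be the better fit when the text is being encoded to ASCII bytes anyway.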
    
