Regular expression that finds and replaces non-ascii characters with Python

前端 未结 7 2222
無奈伤痛
無奈伤痛 2020-12-03 19:40

I need to change some characters that are not ASCII to \'_\'. For example,

Tannh‰user -> Tannh_user
  • If I use regular expression with Python, how
7条回答
  •  长情又很酷
    2020-12-03 20:19

    With the magical regex [ -~] one can solve it:

    import re
    re.sub(r"[^ -~]", "_", "Tannh‰user")
    # 'Tannh_user'
    

    Explanation:

    • The ascii characters are the symbols ranging from " " to "~" - hence [ -~] captures all ascii chars
    • By appending ^we can capture all non-ascii characters
    • The rest is now a formality

提交回复
热议问题