Deleting specific control characters (\n, \r, \t) from a string

鱼传尺愫 2021-02-20 11:36

I have a fairly large amount of text that contains control characters such as \n, \t and \r. I need to replace each of them with a single space (" "). What is the fastest way to do this?

6 Answers
  •  你的背包
    2021-02-20 12:07

    I think the fastest way is to use str.translate():

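    # Python 2 syntax (print statement, string.maketrans)
    import string
    s = "a\nb\rc\td"
    # Map each of \n, \t, \r to a space in a single pass
    print s.translate(string.maketrans("\n\t\r", "   "))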
    

    prints

    a b c d
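
    The snippet above is Python 2. As a minimal sketch of the same idea for Python 3, where string.maketrans no longer exists and str.maketrans takes its place, something like the following should work. The function name replace_control_chars is only an illustrative choice, not part of the original answer.

```python
# Python 3 sketch of the str.translate() approach.
# The table is built once so repeated calls do not pay for it again.
CONTROL_TO_SPACE = str.maketrans("\n\r\t", "   ")


def replace_control_chars(text: str) -> str:
    """Replace every newline, carriage return and tab with one space.

    The name of this helper is hypothetical; it is not from the answer.
    """
    return text.translate(CONTROL_TO_SPACE)


s = "a\nb\rc\td"
print(replace_control_chars(s))  # a b c d
```

    Each control character is replaced one for one, so runs such as "\r\n" become two spaces rather than one.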
    

    EDIT: As this once again turned into a discussion about performance, here are some numbers. For long strings, translate() is much faster than using regular expressions:

    # Python 2, run in IPython (%timeit is an IPython magic)
    import re
    import string

    s = "a\nb\rc\td " * 1250000
    
    regex = re.compile(r'[\n\r\t]')
    %timeit t = regex.sub(" ", s)
    # 1 loops, best of 3: 1.19 s per loop
    
    table = string.maketrans("\n\t\r", "   ")
    %timeit s.translate(table)
    # 10 loops, best of 3: 29.3 ms per loop
    

    That's a factor of about 40.
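
    To repeat the comparison outside IPython, here is a rough sketch using the standard timeit module under Python 3. The helper names and the smaller sample size are assumptions made for illustration, and the ratio you measure may differ from the Python 2 numbers above.

```python
import re
import timeit

# Both approaches are prepared once, outside the timed calls.
CONTROL_RE = re.compile(r"[\n\r\t]")
TABLE = str.maketrans("\n\r\t", "   ")


def with_regex(text):
    """Replace \\n, \\r and \\t with spaces using a compiled regex."""
    return CONTROL_RE.sub(" ", text)


def with_translate(text):
    """Replace \\n, \\r and \\t with spaces using str.translate()."""
    return text.translate(TABLE)


def benchmark(text, number=5):
    """Return the best observed seconds per call for each approach."""
    results = {}
    for name, func in (("regex", with_regex), ("translate", with_translate)):
        timings = timeit.repeat(lambda: func(text), number=number, repeat=3)
        results[name] = min(timings) / number
    return results


# Sample size is an assumption: a tenth of the one used in the answer.
sample = "a\nb\rc\td " * 125000
assert with_regex(sample) == with_translate(sample)

for name, seconds in benchmark(sample).items():
    print(f"{name}: {seconds * 1000:.2f} ms per call")
```

    The assertion checks that both approaches produce the same output before any timing is reported.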
