Python Pandas removing substring using another column

自闭症患者 2020-12-17 18:58

I've tried searching around and can't figure out an easy way to do this, so I'm hoping your expertise can help.

I have a pandas data frame with two columns

3 answers
  •  [愿得一人]
    2020-12-17 19:09

    I think you want the replace() method that strings have. In a quick IPython check it was more than an order of magnitude faster than using regular expressions:

    %timeit mystr.replace("ello", "")
    The slowest run took 7.64 times longer than the fastest. This could mean that an intermediate result is being cached 
    1000000 loops, best of 3: 250 ns per loop
    
    %timeit re.sub("ello","", "e")
    The slowest run took 21.03 times longer than the fastest. This could mean that an intermediate result is being cached 
    1000000 loops, best of 3: 4.7 µs per loop
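
To reproduce the comparison outside IPython, here is a minimal sketch using the standard timeit module. The sample string assigned to mystr is an assumption (the original value is not shown), and both calls are run against that same string so the two timings are directly comparable. Absolute numbers will differ by machine and Python version.

```python
import re
import timeit

mystr = "hello world"  # hypothetical sample input; the original value is not shown

# Both approaches must produce the same result for the comparison to be fair.
assert mystr.replace("ello", "") == re.sub("ello", "", mystr)

n = 100_000  # number of calls per measurement
t_replace = timeit.timeit(lambda: mystr.replace("ello", ""), number=n)
t_regex = timeit.timeit(lambda: re.sub("ello", "", mystr), number=n)

# Report the average time per call in nanoseconds.
print(f"str.replace: {t_replace / n * 1e9:.0f} ns per call")
print(f"re.sub:      {t_regex / n * 1e9:.0f} ns per call")
```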
    

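    If you need further speed improvements after that, you could try numpy's vectorize function, but time it first: the NumPy documentation describes it as a convenience wrapper that is essentially a for loop, not a performance tool. The speedup from using replace() instead of regular expressions should already be substantial.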
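
Applied to the two-column case in the title, the replace() suggestion might look like the sketch below. The question's actual data is cut off, so the column names full and part and the sample values are hypothetical, and the code assumes both columns contain only strings with no missing values.

```python
import pandas as pd

# Hypothetical data: the question's real columns are not shown.
df = pd.DataFrame({
    "full": ["hello world", "foo bar", "abcabc"],
    "part": ["hello ", "bar", "abc"],
})

# For each row, remove every occurrence of the "part" value from the
# "full" value using plain str.replace (no regular expressions).
df["cleaned"] = [
    full.replace(part, "")
    for full, part in zip(df["full"], df["part"])
]

print(df)
```

The list comprehension over zip avoids the per-row overhead of DataFrame.apply with axis=1, which is usually noticeable on large frames.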
