Is there a better way to use strip() on a list of strings? - python [duplicate]

纵然是瞬间, submitted on 2019-12-03 22:37:27

You probably shouldn't be using list as a variable name since it's a type. Regardless:

list = map(str.strip, list) 

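This applies the function str.strip to every element in list and stores the result back in list. In Python 2, map returns a new list; in Python 3 it returns a lazy iterator, so wrap the call in list(...) if you need an actual list.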
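A minimal sketch of how that line behaves on Python 3, where map no longer returns a list. The name words and its contents are made up for illustration and are not from the question:

```python
# Hypothetical sample data, not from the original question.
words = ["  alpha ", "beta\n", "\tgamma"]

# On Python 3, map() returns a lazy iterator rather than a list.
lazy = map(str.strip, words)
is_list = isinstance(lazy, list)  # False on Python 3

# Wrap it in list() to get a real list back.
stripped_list = list(lazy)
print(stripped_list)  # ['alpha', 'beta', 'gamma']
```

With no arguments, str.strip removes leading and trailing whitespace, including newlines and tabs.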

karthikr

You could use a list comprehension:

stripped_list = [j.strip() for j in initial_list]

Some intriguing discussions on performance happened here, so let me provide a benchmark:

http://ideone.com/ldId8

noslice_map              : 0.0814900398254
slice_map                : 0.084676027298
noslice_comprehension    : 0.0927240848541
slice_comprehension      : 0.124806165695
iter_manual              : 0.133514881134
iter_enumerate           : 0.142778873444
iter_range               : 0.160353899002

So:

  1. map(str.strip, my_list) is the fastest way, though only slightly faster than the comprehensions.
    • Use map (or itertools.imap, which exists only in Python 2) if there's a single function that you want to apply (like str.split)
    • Use comprehensions if there's a more complicated expression
  2. Manual iteration is the slowest way; a reasonable explanation is that more of the work is done by interpreted bytecode and less by the interpreter's optimized C code.
  3. Go ahead and assign the result like my_list[:] = map(...); the slice notation adds only a small overhead and is likely to spare you some bugs if there are multiple references to that list.
    • Know the difference between mutating a list and re-creating it.
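The difference between mutating and re-creating can be sketched as follows. The variable names and sample strings are hypothetical; alias and stale simply stand in for "another reference to the same list" somewhere else in a program:

```python
# Hypothetical sample data; "alias" is a second reference to the same list object.
shared = ["  a ", " b  "]
alias = shared

# Slice assignment replaces the contents of the existing list object,
# so every reference to it sees the stripped strings.
shared[:] = map(str.strip, shared)
print(alias)            # ['a', 'b']
print(alias is shared)  # True

# Plain assignment builds a new list and rebinds only this one name.
rebound = ["  c ", " d  "]
stale = rebound
rebound = list(map(str.strip, rebound))
print(rebound)  # ['c', 'd']
print(stale)    # ['  c ', ' d  ']  (the old list is untouched)
```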

I think you mean:

a_list = [s.strip() for s in a_list]

Using a generator expression may be a better approach, like this:

stripped_list = (s.strip() for s in a_list)

This offers the benefit of lazy evaluation: strip runs only when a given element is actually consumed.
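To make the laziness visible, here is a small sketch that records each call. The helper strip_logged and the calls list are illustrative names introduced here, not part of the answer:

```python
calls = []  # records every string that actually gets stripped

def strip_logged(s):
    """Hypothetical helper: strip s and remember that we were called."""
    calls.append(s)
    return s.strip()

a_list = ["  x ", " y ", " z  "]  # made-up sample data

stripped = (strip_logged(s) for s in a_list)
calls_before = len(calls)       # 0: nothing has been stripped yet

first = next(stripped)          # strips only the first element
calls_after_first = len(calls)  # 1

rest = list(stripped)           # consumes the remaining elements
leftover = list(stripped)       # []: a generator can be consumed only once
```

The trade-off is that a generator is single-pass and does not support indexing or len(), so a list comprehension is the simpler choice when the whole result is needed anyway.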

If you need references to the list to remain intact outside the current scope, you might want to use list slice syntax:

a_list[:] = [s.strip() for s in a_list]

For commenters interested in the speed of various approaches, it looks as if in CPython the generator-to-slice approach is the least efficient:

>>> from timeit import timeit as t
>>> t("""a[:]=(s.strip() for s in a)""", """a=[" %d " % s for s in range(10)]""")
4.35184121131897
>>> t("""a[:]=[s.strip() for s in a]""", """a=[" %d " % s for s in range(10)]""")
2.9129951000213623
>>> t("""a=[s.strip() for s in a]""", """a=[" %d " % s for s in range(10)]""")
2.47947096824646
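The numbers above depend on the interpreter version and machine, so it may be worth re-running the comparison yourself. The following is a rough sketch, assuming Python 3; the first three statements are the ones timed above, while the map variant, the labels, and the reduced iteration count are additions for illustration:

```python
from timeit import timeit

# Same setup as in the session above: ten short padded strings.
SETUP = 'a = [" %d " % i for i in range(10)]'

STATEMENTS = {
    "generator to slice": "a[:] = (s.strip() for s in a)",
    "comprehension to slice": "a[:] = [s.strip() for s in a]",
    "comprehension, rebind": "a = [s.strip() for s in a]",
    # Added for comparison; not part of the original timings.
    "map to list, rebind": "a = list(map(str.strip, a))",
}

# number is kept well below timeit's default so the script finishes quickly.
results = {
    label: timeit(stmt, SETUP, number=100_000)
    for label, stmt in STATEMENTS.items()
}

# Print the fastest variant first.
for label, seconds in sorted(results.items(), key=lambda item: item[1]):
    print(f"{label:<24}: {seconds:.4f} s")
```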