Python: Elegant and efficient ways to mask a list

Posted by a 夏天 on 2019-11-27 21:24:13

You are looking for itertools.compress.

From the docs, it is roughly equivalent to:

def compress(data, selectors):
    # compress('ABCDEF', [1,0,1,0,1,1]) --> A C E F
    # Python 3: the built-in zip is lazy (Python 2 used itertools.izip here)
    return (d for d, s in zip(data, selectors) if s)
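As a minimal sketch of how this applies to the question, the standard-library function can be called directly on a list and a mask. The names `data`, `mask`, and `masked` are illustrative only.

```python
from itertools import compress

data = list(range(10))
mask = [True, False, False, True, True, True, True, False, False, False]

# compress returns a lazy iterator, so wrap it in list() to materialize the result
masked = list(compress(data, mask))
print(masked)  # [0, 3, 4, 5, 6]
```

Any truthy or falsy values work as selectors, so a mask of 0s and 1s behaves the same way.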

Since jamylak already answered the question with a practical answer, here is my example of a list with built-in masking support (totally unnecessary, by the way):

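from itertools import compress

class MaskableList(list):
    def __getitem__(self, index):
        try:
            # Integers and slices are handled by the normal list lookup
            return super().__getitem__(index)
        except TypeError:
            # Anything else (e.g. a list) is treated as a mask of truthy/falsy flags
            return MaskableList(compress(self, index))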

Usage:

>>> myList = MaskableList(range(10))
>>> myList
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> mask = [0, 1, 1, 0]
>>> myList[mask]
[1, 2]

Note that compress stops when either the data or the mask runs out. If you wish to keep the portion of the list that extends past the length of the mask, you could try something like:

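from itertools import zip_longest  # named izip_longest in Python 2

# Positions past the end of the mask are filled with True, so those items are kept
[item for item, keep in zip_longest(myList, mask[:len(myList)], fillvalue=True) if keep]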
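To make the difference concrete, here is a small sketch comparing the two behaviours with a mask shorter than the data. It assumes Python 3, where the function is named `zip_longest`; the names `my_list`, `truncated`, and `kept_tail` are illustrative.

```python
from itertools import compress, zip_longest

my_list = list(range(10))
mask = [0, 1, 1, 0]

# compress stops as soon as the shorter input (here the mask) is exhausted
truncated = list(compress(my_list, mask))
print(truncated)  # [1, 2]

# zip_longest pads the missing mask entries with True, so the unmasked tail is kept
kept_tail = [
    item
    for item, keep in zip_longest(my_list, mask[:len(my_list)], fillvalue=True)
    if keep
]
print(kept_tail)  # [1, 2, 4, 5, 6, 7, 8, 9]
```

The slice `mask[:len(my_list)]` matters when the mask is longer than the data: without it, the padding value would be emitted in place of the missing items.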

If you are already using NumPy, you can do this easily with a NumPy array, without installing any other library:

>>> import numpy as np
>>> a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> msk = [True, False, False, True, True, True, True, False, False, False]
>>> a = np.array(a)  # convert list to numpy array
>>> result = a[msk]  # mask a
>>> result.tolist()
[0, 3, 4, 5, 6]
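One caveat with the NumPy approach, sketched below under the assumption that the mask might arrive as 0s and 1s rather than booleans: NumPy treats a list of integers as positions to pick, not as a mask, so it may need converting first. The names `int_mask`, `by_position`, and `by_mask` are illustrative.

```python
import numpy as np

a = np.arange(10)
int_mask = [1, 0, 0, 1, 1, 1, 1, 0, 0, 0]

# A list of integers is interpreted as indices (fancy indexing), not as a mask
by_position = a[int_mask].tolist()
print(by_position)  # [1, 0, 0, 1, 1, 1, 1, 0, 0, 0]

# Converting to a boolean array gives the masking behaviour
by_mask = a[np.asarray(int_mask, dtype=bool)].tolist()
print(by_mask)  # [0, 3, 4, 5, 6]
```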
Jim

I don't consider it elegant. It's compact, but it tends to be confusing, because the construct is very different from what most other languages offer.

As Guido van Rossum has said about language design, we spend more time reading code than writing it. The more obscure the construction of a line of code, the more confusing it becomes to others, who may lack familiarity with Python even though they are fully competent in any number of other languages.

Readability trumps short-form notation every day in the real world of servicing code. It's just like fixing your car: big drawings with lots of information make troubleshooting a lot easier.

For me, I would much rather troubleshoot someone's code that uses the long form

print([lst[i] for i in range(len(lst)) if msk[i]])  # Python 3; Python 2 used the print statement and xrange

than the numpy short notation mask. I don't need to have any special knowledge of a specific Python package to interpret it.
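In the same spirit, here is one possible plain-Python variant that avoids index arithmetic by pairing each item with its mask entry. This is only a sketch: the helper name `mask_list` is hypothetical, and unlike the indexed form it silently stops at the shorter of the two inputs instead of raising IndexError when the mask is too short.

```python
def mask_list(items, mask):
    """Return the items whose corresponding mask entry is truthy."""
    # zip pairs each item with its mask entry and stops at the shorter input
    return [item for item, keep in zip(items, mask) if keep]

lst = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
msk = [True, False, False, True, True, True, True, False, False, False]

result = mask_list(lst, msk)
print(result)  # [0, 3, 4, 5, 6]
```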
