How to use sets in Python to find list membership?

那年仲夏 submitted on 2019-12-23 00:54:09

Question


Given:

A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'], ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

I want to know whether ['Yes', X, 'No'] exists within A, where X can be anything (I don't care about its value).

I attempted:

valid = False
for n in A:
    if n[0] == 'Yes' and n[2] == 'No':
        valid = True

I know set() is useful in this type of situation, but how can it be applied here? Is it even possible, or is it better to stick with my original code?


Answer 1:


If you only want to check for existence, you can simply test ['Yes', 'No'] in A:

In [1]: A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]

In [2]: ['Yes', 'No'] in A
Out[2]: True

For the three-element case, try:

In [3]: A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'], ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

In [4]: any(i[0]=='Yes' and i[2] == 'No' for i in A)
Out[4]: True

Or you can define a small function:

In [5]: def want_to_know(l,item):
   ...:     for i in l:
   ...:         if i[0] == item[0] and i[2] == item[2]:
   ...:             return True
   ...:     return False

In [6]: want_to_know(A,['Yes', 'xxx', 'No'])
Out[6]: True
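
The function above hard-codes positions 0 and 2. As a rough sketch of how the same idea might be generalized to a wildcard in any position, the example below uses a sentinel object. The names ANY, matches and contains_pattern are hypothetical and do not come from the original answer.

```python
A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'],
     ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

ANY = object()  # hypothetical sentinel meaning "any value is acceptable here"

def matches(row, pattern):
    """True if row has the pattern's length and agrees on every non-wildcard slot."""
    return len(row) == len(pattern) and all(
        p is ANY or p == value for value, p in zip(row, pattern)
    )

def contains_pattern(rows, pattern):
    """True as soon as one row matches; any() stops at the first hit."""
    return any(matches(row, pattern) for row in rows)

print(contains_pattern(A, ['Yes', ANY, 'No']))  # True
print(contains_pattern(A, ['No', ANY, 'No']))   # False
```

This is still a linear scan, so it behaves like the loop in the question; it only makes the "don't care" position explicit.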

any(i[0]=='Yes' and i[2] == 'No' for i in A*10000) actually seems to be about 100 times faster than the set conversion itself (334 us versus 33.4 ms in the timings below).

In [8]: %timeit any({(x[0],x[-1]) == ('Yes','No') for x in A*10000})
100 loops, best of 3: 14 ms per loop

In [9]: %timeit {tuple([x[0],x[-1]]) for x in A*10000}
10 loops, best of 3: 33.4 ms per loop

In [10]: %timeit any(i[0]=='Yes' and i[2] == 'No' for i in A*10000)
1000 loops, best of 3: 334 us per loop
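
One thing worth keeping in mind when reading these numbers: the first row of A already matches, so any() can stop after a single element. The sketch below, which uses the standard timeit module outside IPython, times both a first-row match and a search with no match at all. The helper name first_last_match and the value 'Maybe' are made up for illustration, and the absolute timings will differ from machine to machine.

```python
import timeit

A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'],
     ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]
data = A * 10000  # built once, outside the timed statement

def first_last_match(rows, first, last):
    """Linear scan that stops at the first row whose ends match."""
    return any(r[0] == first and r[-1] == last for r in rows)

# Best case: the very first row matches, so the scan ends immediately.
best = timeit.timeit(lambda: first_last_match(data, 'Yes', 'No'), number=20)
# Worst case: nothing matches, so every row is examined.
worst = timeit.timeit(lambda: first_last_match(data, 'Maybe', 'No'), number=20)

print(f"match in first row: {best:.6f} s for 20 runs")
print(f"no match at all:    {worst:.6f} s for 20 runs")
```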



Answer 2:


Convert your list to a set of tuples first; a membership test then takes O(1) time on average instead of O(n):

In [27]: A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]

In [28]: s = set(map(tuple, A))

In [29]: s
Out[29]: set([('Yes', 'No'), ('No', 'Idontknow'), ('Yes', 'Idontknow'), ('No', 'Yes')])

In [30]: ('Yes', 'No') in s
Out[30]: True

timeit comparisons:

%timeit ['Yes', 'No'] in A
1000000 loops, best of 3: 504 ns per loop  

%timeit ('Yes', 'No') in s
1000000 loops, best of 3: 442 ns per loop       #winner

%timeit ['No', 'Idontknow'] in A
1000000 loops, best of 3: 861 ns per loop

%timeit ('No', 'Idontknow') in s
1000000 loops, best of 3: 461 ns per loop       #winner

Edit:

If you're only interested in the first and last elements:

In [69]: A = [['Yes', 'No'], ['Yes', 'Idontknow','hmmm'], ['No', 'Yes'], ['No', 'Idontknow']]

In [70]: s={tuple([x[0],x[-1]]) for x in A} # -1 or 2, change as per your requirement
                                         #or set(tuple([x[0],x[-1]]) for x in A)


In [71]: s
Out[71]: set([('Yes', 'No'), ('Yes', 'hmmm'), ('No', 'Idontknow'), ('No', 'Yes')])

In [73]: ('Yes', 'hmmm') in s
Out[73]: True

timeit comparison with any():

In [77]: %timeit ('Yes', 'hmmm') in s
1000000 loops, best of 3: 428 ns per loop      #winner

In [78]: %timeit any(x[0]=="Yes" and x[-1]=="hmmm" for x in A)
100000 loops, best of 3: 2.87 us per loop
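
Applied to the three-element rows from the question, the same idea could look like the sketch below: build the set of (first, last) pairs once, then reuse it for as many lookups as needed. The variable name keys is an assumption for illustration.

```python
A = [['Yes', 'lala', 'No'], ['Yes', 'lala', 'Idontknow'],
     ['No', 'lala', 'Yes'], ['No', 'lala', 'Idontknow']]

# One O(n) pass: keep only the positions we care about, as hashable tuples.
keys = {(row[0], row[-1]) for row in A}

# Each later lookup is a hash probe rather than a scan of A.
print(('Yes', 'No') in keys)         # True
print(('Idontknow', 'Yes') in keys)  # False
```

Building the set costs a full pass over A, so this pays off mainly when the same data is queried many times.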



Answer 3:


A set cannot contain lists because lists are not hashable, but you can convert each list into a tuple:

A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]
valid = ('Yes', 'No') in {tuple(item) for item in A}

As @IgnacioVazquez-Abrams mentioned, the conversion from lists to tuples is itself O(n), so if you care about performance and only need a single lookup, you should choose another method.
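
To make the hashability point concrete, here is a small sketch of what happens when lists are put into a set directly, followed by the tuple conversion. The variable names error_message and as_tuples are illustrative only.

```python
A = [['Yes', 'No'], ['Yes', 'Idontknow'], ['No', 'Yes'], ['No', 'Idontknow']]

# Lists are mutable and therefore unhashable, so set(A) raises TypeError.
error_message = None
try:
    set(A)
except TypeError as exc:
    error_message = str(exc)
print(error_message)  # unhashable type: 'list'

# Tuples of strings are hashable, so this conversion works.
as_tuples = {tuple(item) for item in A}
print(('Yes', 'No') in as_tuples)  # True
```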




Answer 4:


Here is how to do it with the built-in set():

>>> A = set([('Yes', 'No'), ('Yes', 'Idontknow'), ('No', 'Yes'), ('No', 'Idontknow')])
>>> ('Yes','No') in A
True

The elements of a set must be hashable, so I have used tuples rather than lists as the set elements.



Source: https://stackoverflow.com/questions/14332621/how-to-use-sets-in-python-to-find-list-membership
