问题
How can I test if a list contains another list (ie. it\'s a contiguous subsequence). Say there was a function called contains:
contains([1,2], [-1, 0, 1, 2]) # Returns [2, 3] (contains returns [start, end])
contains([1,3], [-1, 0, 1, 2]) # Returns False
contains([1, 2], [[1, 2], 3]) # Returns False
contains([[1, 2]], [[1, 2], 3]) # Returns [0, 0]
Edit:
contains([2, 1], [-1, 0, 1, 2]) # Returns False
contains([-1, 1, 2], [-1, 0, 1, 2]) # Returns False
contains([0, 1, 2], [-1, 0, 1, 2]) # Returns [1, 3]
回答1:
Here is my version:
def contains(small, big):
for i in xrange(len(big)-len(small)+1):
for j in xrange(len(small)):
if big[i+j] != small[j]:
break
else:
return i, i+len(small)
return False
It returns a tuple of (start, end+1) since I think that is more pythonic, as Andrew Jaffe points out in his comment. It does not slice any sublists so should be reasonably efficient.
One point of interest for newbies is that it uses the else clause on the for statement - this is not something I use very often but can be invaluable in situations like this.
This is identical to finding substrings in a string, so for large lists it may be more efficient to implement something like the Boyer-Moore algorithm.
回答2:
If all items are unique, you can use sets.
>>> items = set([-1, 0, 1, 2])
>>> set([1, 2]).issubset(items)
True
>>> set([1, 3]).issubset(items)
False
回答3:
There's an all()
and any()
function to do this.
To check if list1 contains ALL elements in list2
result = all(elem in list1 for elem in list2)
To check if list1 contains ANY elements in list2
result = any(elem in list1 for elem in list2)
the variable result would be boolean (TRUE/FALSE).
回答4:
May I humbly suggest the Rabin-Karp algorithm if the big
list is really big. The link even contains almost-usable code in almost-Python.
回答5:
After OP's edit:
def contains(small, big):
for i in xrange(1 + len(big) - len(small)):
if small == big[i:i+len(small)]:
return i, i + len(small) - 1
return False
回答6:
This works and is fairly fast since it does the linear searching using the builtin list.index()
method and ==
operator:
def contains(sub, pri):
M, N = len(pri), len(sub)
i, LAST = 0, M-N+1
while True:
try:
found = pri.index(sub[0], i, LAST) # find first elem in sub
except ValueError:
return False
if pri[found:found+N] == sub:
return [found, found+N-1]
else:
i = found+1
回答7:
Here's a straightforward algorithm that uses list methods:
#!/usr/bin/env python
def list_find(what, where):
"""Find `what` list in the `where` list.
Return index in `where` where `what` starts
or -1 if no such index.
>>> f = list_find
>>> f([2, 1], [-1, 0, 1, 2])
-1
>>> f([-1, 1, 2], [-1, 0, 1, 2])
-1
>>> f([0, 1, 2], [-1, 0, 1, 2])
1
>>> f([1,2], [-1, 0, 1, 2])
2
>>> f([1,3], [-1, 0, 1, 2])
-1
>>> f([1, 2], [[1, 2], 3])
-1
>>> f([[1, 2]], [[1, 2], 3])
0
"""
if not what: # empty list is always found
return 0
try:
index = 0
while True:
index = where.index(what[0], index)
if where[index:index+len(what)] == what:
return index # found
index += 1 # try next position
except ValueError:
return -1 # not found
def contains(what, where):
"""Return [start, end+1] if found else empty list."""
i = list_find(what, where)
return [i, i + len(what)] if i >= 0 else [] #NOTE: bool([]) == False
if __name__=="__main__":
import doctest; doctest.testmod()
回答8:
If we refine the problem talking about testing if a list contains another list with as a sequence, the answer could be the next one-liner:
def contains(subseq, inseq):
return any(inseq[pos:pos + len(subseq)] == subseq for pos in range(0, len(inseq) - len(subseq) + 1))
Here unit tests I used to tune up this one-liner:
https://gist.github.com/anonymous/6910a85b4978daee137f
回答9:
Smallest code:
def contains(a,b):
str(a)[1:-1].find(str(b)[1:-1])>=0
回答10:
Here is my answer. This function will help you to find out whether B is a sub-list of A. Time complexity is O(n).
`def does_A_contain_B(A, B): #remember now A is the larger list
b_size = len(B)
for a_index in range(0, len(A)):
if A[a_index : a_index+b_size]==B:
return True
else:
return False`
回答11:
I tried to make this as efficient as possible.
It uses a generator; those unfamiliar with these beasts are advised to check out their documentation and that of yield expressions.
Basically it creates a generator of values from the subsequence that can be reset by sending it a true value. If the generator is reset, it starts yielding again from the beginning of sub
.
Then it just compares successive values of sequence
with the generator yields, resetting the generator if they don't match.
When the generator runs out of values, i.e. reaches the end of sub
without being reset, that means that we've found our match.
Since it works for any sequence, you can even use it on strings, in which case it behaves similarly to str.find
, except that it returns False
instead of -1
.
As a further note: I think that the second value of the returned tuple should, in keeping with Python standards, normally be one higher. i.e. "string"[0:2] == "st"
. But the spec says otherwise, so that's how this works.
It depends on if this is meant to be a general-purpose routine or if it's implementing some specific goal; in the latter case it might be better to implement a general-purpose routine and then wrap it in a function which twiddles the return value to suit the spec.
def reiterator(sub):
"""Yield elements of a sequence, resetting if sent ``True``."""
it = iter(sub)
while True:
if (yield it.next()):
it = iter(sub)
def find_in_sequence(sub, sequence):
"""Find a subsequence in a sequence.
>>> find_in_sequence([2, 1], [-1, 0, 1, 2])
False
>>> find_in_sequence([-1, 1, 2], [-1, 0, 1, 2])
False
>>> find_in_sequence([0, 1, 2], [-1, 0, 1, 2])
(1, 3)
>>> find_in_sequence("subsequence",
... "This sequence contains a subsequence.")
(25, 35)
>>> find_in_sequence("subsequence", "This one doesn't.")
False
"""
start = None
sub_items = reiterator(sub)
sub_item = sub_items.next()
for index, item in enumerate(sequence):
if item == sub_item:
if start is None: start = index
else:
start = None
try:
sub_item = sub_items.send(start is None)
except StopIteration:
# If the subsequence is depleted, we win!
return (start, index)
return False
回答12:
I think this one is fast...
def issublist(subList, myList, start=0):
if not subList: return 0
lenList, lensubList = len(myList), len(subList)
try:
while lenList - start >= lensubList:
start = myList.index(subList[0], start)
for i in xrange(lensubList):
if myList[start+i] != subList[i]:
break
else:
return start, start + lensubList - 1
start += 1
return False
except:
return False
回答13:
The problem of most of the answers, that they are good for unique items in list. If items are not unique and you still want to know whether there is an intersection, you should count items:
from collections import Counter as count
def listContains(l1, l2):
list1 = count(l1)
list2 = count(l2)
return list1&list2 == list1
print( listContains([1,1,2,5], [1,2,3,5,1,2,1]) ) # Returns True
print( listContains([1,1,2,8], [1,2,3,5,1,2,1]) ) # Returns False
You can also return the intersection by using ''.join(list1&list2)
来源:https://stackoverflow.com/questions/3847386/testing-if-a-list-contains-another-list-with-python