I have a set of strings, e.g.
my_prefix_what_ever
my_prefix_what_so_ever
my_prefix_doesnt_matter
I simply want to find the longest common p
Never rewrite what is provided to you: os.path.commonprefix does exactly this:
Return the longest path prefix (taken character-by-character) that is a prefix of all paths in list. If list is empty, return the empty string (''). Note that this may return invalid paths because it works a character at a time.
For comparison to the other answers, here's the code:
# Return the longest prefix of all list elements.
def commonprefix(m):
    "Given a list of pathnames, returns the longest common leading component"
    if not m: return ''
    s1 = min(m)
    s2 = max(m)
    for i, c in enumerate(s1):
        if c != s2[i]:
            return s1[:i]
    return s1
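As a quick check, here is a minimal usage sketch that applies os.path.commonprefix to the strings from the question. The variable names are only illustrative. The second call uses two made-up paths to show what the documentation means by "may return invalid paths".

```python
import os.path

strings = ['my_prefix_what_ever',
           'my_prefix_what_so_ever',
           'my_prefix_doesnt_matter']

# Works on arbitrary strings, not only on paths.
prefix = os.path.commonprefix(strings)
print(prefix)  # my_prefix_

# The comparison is character by character, so the result can end
# in the middle of a path component.
partial = os.path.commonprefix(['/usr/lib', '/usr/local'])
print(partial)  # /usr/l
```

Comparing only min(m) and max(m) is enough because the strings are ordered lexicographically: any prefix shared by the smallest and the largest string is also shared by every string that sorts between them.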
The following is a working, but probably quite inefficient solution.
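a = ["my_prefix_what_ever", "my_prefix_what_so_ever", "my_prefix_doesnt_matter"]
# Transpose: each tuple holds the i-th character of every string.
b = zip(*a)
# Keep the columns whose characters are all equal. Note that this does not stop
# at the first mismatch, so a matching column after a mismatch would also be kept.
c = [x[0] for x in b if x==(x[0],)*len(x)]
result = "".join(c)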
For small sets of strings, the above is no problem at all. But for larger sets, I personally would code another, manual solution that checks each character one after another and stops when there are differences.
Algorithmically, this yields the same procedure, however, one might be able to avoid constructing the list c.
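The manual approach described above could look roughly like the following sketch. The function name longest_common_prefix is made up for illustration and is not from any library. It walks the first string character by character and returns as soon as any other string is too short or differs, so no intermediate list is built.

```python
def longest_common_prefix(strings):
    """Return the longest prefix shared by every string in `strings`."""
    if not strings:
        return ''
    first = strings[0]
    for i, ch in enumerate(first):
        for other in strings[1:]:
            # Stop at the first position where a string ends or differs.
            if i >= len(other) or other[i] != ch:
                return first[:i]
    # Every character of the first string matched in all the others.
    return first


a = ["my_prefix_what_ever", "my_prefix_what_so_ever", "my_prefix_doesnt_matter"]
print(longest_common_prefix(a))  # my_prefix_
```

Because it returns at the first mismatch, a later position where all characters happen to agree again is never considered.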
Ned Batchelder is probably right. But for the fun of it, here's a more efficient version of phimuemue's answer using itertools.
import itertools
strings = ['my_prefix_what_ever',
           'my_prefix_what_so_ever',
           'my_prefix_doesnt_matter']
def all_same(x):
    return all(x[0] == y for y in x)
# itertools.izip is Python 2 only; on Python 3 the built-in zip is already lazy.
char_tuples = itertools.izip(*strings)
# Stop at the first column whose characters differ.
prefix_tuples = itertools.takewhile(all_same, char_tuples)
''.join(x[0] for x in prefix_tuples)
As an affront to readability, here's a one-line version :)
>>> from itertools import takewhile, izip
>>> ''.join(c[0] for c in takewhile(lambda x: all(x[0] == y for y in x), izip(*strings)))
'my_prefix_'
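The itertools snippets above rely on itertools.izip, which exists only on Python 2. On Python 3 the built-in zip already returns a lazy iterator, so the same idea can be sketched as follows. This is an adaptation, not part of the original answer, and the names columns and prefix are only illustrative.

```python
from itertools import takewhile

strings = ['my_prefix_what_ever',
           'my_prefix_what_so_ever',
           'my_prefix_doesnt_matter']

def all_same(x):
    # True when every character in the column equals the first one.
    return all(x[0] == y for y in x)

# zip(*strings) yields one tuple per character position and stops
# at the end of the shortest string.
columns = zip(*strings)

# takewhile stops consuming columns at the first mismatch.
prefix = ''.join(x[0] for x in takewhile(all_same, columns))
print(prefix)  # my_prefix_
```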