I have two lists:
eg. a = [1,8,3,9,4,9,3,8,1,2,3] and b = [1,8,1,3,9,4,9,3,8,1,2,3]
Both contain ints. There is no meaning behind the ints (eg. 1 is not \'cl
I've implemented something for a similar task a long time ago. Now, I have only a blog entry for that. It was simple: you had to compute the pdf of both sequences then it would find the common area covered by the graphical representation of pdf.
Sorry for the broken images on link, the external server that I've used back then is dead now.
Right now, for your problem the code translates to
def overlap(pdf1, pdf2):
s = 0
for k in pdf1:
if pdf2.has_key(k):
s += min(pdf1[k], pdf2[k])
return s
def pdf(l):
d = {}
s = 0.0
for i in l:
s += i
if d.has_key(i):
d[i] += 1
else:
d[i] = 1
for k in d:
d[k] /= s
return d
def solve():
a = [1, 8, 3, 9, 4, 9, 3, 8, 1, 2, 3]
b = [1, 8, 1, 3, 9, 4, 9, 3, 8, 1, 2, 3]
pdf_a = pdf(a)
pdf_b = pdf(b)
print pdf_a
print pdf_b
print overlap(pdf_a, pdf_b)
print overlap(pdf_b, pdf_a)
if __name__ == '__main__':
solve()
Unfortunately, it gives an unexpected answer, only 0.212292609351