I am analyzing a collection of HTML webpages in Python. To do that, I'm dividing each webpage into several blocks, and then I need to calculate the similarity between those blocks