We often see 'related items'. For instance, blogs have related posts, books have related books, etc. My question is: how do we compute that relevance? If it's
There are many ways to calculate similarity of two items, but for a straightforward method, take a look at the Jaccard Coefficient.
http://en.wikipedia.org/wiki/Jaccard_index
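Which is: J(a,b) = |intersection(a,b)| / |union(a,b)|, i.e. the number of elements the two sets share divided by the number of distinct elements across both.
So let's say you want to compute the coefficient of two items: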
Item A, which has the tags "books, school, pencil, textbook, reading"
Item B, which has the tags "books, reading, autobiography"
intersection(A,B) = books, reading
union(A,B) = books, school, pencil, textbook, reading, autobiography
so J(A,B) = 2/6 ≈ 0.333
So the most related item to A would be the item which results in the highest Jaccard Coefficient when paired with A.
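A minimal Python sketch of the calculation above, in case it helps. The function names `jaccard` and `most_related` and the third item "C" are made up for illustration; treating two empty tag sets as a score of 0 is also just a convention chosen here.

```python
def jaccard(a, b):
    """Jaccard coefficient of two tag collections: |A & B| / |A | B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty tag sets count as unrelated
    return len(a & b) / len(union)


def most_related(target, items):
    """Return (name, score) pairs for every other item, highest score first."""
    scores = [
        (name, jaccard(items[target], tags))
        for name, tags in items.items()
        if name != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


items = {
    "A": ["books", "school", "pencil", "textbook", "reading"],
    "B": ["books", "reading", "autobiography"],
    "C": ["cooking", "recipes"],  # hypothetical item sharing no tags with A
}

ranking = most_related("A", items)
print(ranking)
```

Comparing every pair like this is fine for a small catalogue; for a large one you would probably want to precompute the scores or index items by tag rather than scan everything on each request.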
It can also be based on purchase behaviour, as in "people who bought this book also bought".
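A rough sketch of the "also bought" idea, assuming you have each order as a list of item names. The order data and the function name `co_purchase_counts` are invented for the example; it simply counts how often two items show up in the same order.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Map each item to a Counter of the items bought in the same orders."""
    counts = {}
    for order in orders:
        # set() so an item listed twice in one order is only counted once
        for x, y in combinations(sorted(set(order)), 2):
            counts.setdefault(x, Counter())[y] += 1
            counts.setdefault(y, Counter())[x] += 1
    return counts


# Hypothetical order history
orders = [
    ["book1", "book2"],
    ["book1", "book2", "book3"],
    ["book1", "book2"],
    ["book2", "book4"],
]

counts = co_purchase_counts(orders)
also_bought = counts["book1"].most_common()
print(also_bought)
```

Raw counts favour items that sell a lot in general, so in practice you may want to normalise them, for example by applying the Jaccard coefficient to the sets of buyers of each item.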
Whichever method you use, you will need some sort of connection between your items, and those connections will mostly be made by human beings.