How to cluster users based on tags
Question: I'd like to cluster users based on the categories or tags of the shows they watch. What is the easiest or best algorithm for this? Assuming I have around 20,000 tags and several million watch events to use as signals, is there an algorithm I can implement using, say, Pig/Hadoop/Mortar, or perhaps on Neo4j? In terms of data, I have users, the programs they've watched, and the tags each program has (usually around 10 tags per program). At the end I would like to get k clusters (maybe a