I have many articles in a database (each with a title and text). I'm looking for an algorithm to find the X most similar articles, something like Stack Overflow's "Related Questions".
SO compares only the titles, not the body text of the questions, so it only ever works on rather short strings.
You can use their algorithm (no idea what it looks like) on the article titles and keywords.
If you have more CPU time to burn, run it on the abstracts of your articles as well.
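
Since I don't know SO's actual algorithm, here is a minimal sketch of one common stand-in: TF-IDF weighting plus cosine similarity over short strings such as titles. The function names (`tokenize`, `tfidf_vectors`, `cosine`, `most_similar`) and the sample titles are made up for illustration, and the smoothed IDF formula is just one reasonable choice. It uses only the standard library and compares every pair, so treat it as a starting point for a modest number of articles.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    token_lists = [tokenize(d) for d in docs]
    n = len(token_lists)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        # Smoothed IDF (an assumption): rare terms weigh more than common ones.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index, docs, x):
    """Return the indices of the x documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], v), i)
        for i, v in enumerate(vectors)
        if i != index
    ]
    # Highest score first; ties broken by lower index for stable output.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [i for _, i in scores[:x]]


# Hypothetical titles, only to show the call.
titles = [
    "How to find similar articles in a database",
    "Algorithm to find related articles",
    "Best pizza recipes in Naples",
    "Find similar text documents with Python",
]
print(most_similar(0, titles, 2))
```

To include keywords or abstracts, concatenate them onto the title string before passing the list in. The abstracts make the vectors larger, which is where the extra CPU time goes.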