I have many articles in a database (with title,text), I\'m looking for an algorithm to find the X most similar articles, something like Stack Overflow\'s \"Related Questions
One common algorithm used is the Self-Organizing Map. It is a type of neural network that will automatically categorize your articles. Then you can simply find the location that a current article is in the map and all articles near it are related. The important part of the algorithm is how you would vector quantize your input. There are several ways to do with with text. You can hash your document/title, you can count words and use that as an n dimensional vector, etc. Hope that helps, although I may have opened up a Pandora's box for you of an endless journey in AI.