I have many articles in a database (with title,text), I\'m looking for an algorithm to find the X most similar articles, something like Stack Overflow\'s \"Related Questions
The simplest and fastest way to compare similarity among abstracts is probably by utilizing the set concept. First convert abstract texts into set of words. Then check how much each set overlaps. Python's set feature comes very hand performing this task. You would be surprised to see how well this method compares to those "similar/related papers" options out there provided by GScholar, ADS, WOS or Scopus.