I need to compare documents stored in a DB and come up with a similarity score between 0 and 1.
The method I need to use has to be very simple. Implementing a vanil
For our Information Retrieval Course, we use some code that is written by our professor in Java. Sorry, no python port. "It is being released for educational and research purposes only under the GNU General Public License."
You can check out the documentation http://userweb.cs.utexas.edu/~mooney/ir-course/doc/
But more specifically check out: http://userweb.cs.utexas.edu/users/mooney/ir-course/doc/ir/vsr/HashMapVector.html
You can download it http://userweb.cs.utexas.edu/users/mooney/ir-course/