I have a database of arbitrary-length strings that holds more than one million items, and it may grow well beyond that.
I need to compare a user-provided string against the whol
This paper seems to describe exactly what you want.
Lucene (http://lucene.apache.org/) also implements Levenshtein edit distance.
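For reference, here is a minimal Python sketch of Levenshtein edit distance together with a naive linear scan over a list of strings. It is not Lucene's implementation and not the method from the paper mentioned above. The names `levenshtein`, `closest_matches` and the `max_distance` parameter are made up for illustration. A plain scan like this is likely too slow for a million or more entries, so treat it only as a baseline showing what an index would have to speed up.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def closest_matches(query, candidates, max_distance=2):
    """Hypothetical helper: brute-force scan returning (distance, string) pairs."""
    results = []
    for s in candidates:
        # The length difference is a lower bound on the edit distance,
        # so this check skips hopeless candidates cheaply.
        if abs(len(s) - len(query)) > max_distance:
            continue
        d = levenshtein(query, s)
        if d <= max_distance:
            results.append((d, s))
    return sorted(results)


if __name__ == "__main__":
    words = ["lucene", "lucent", "apache", "lucen"]
    print(closest_matches("lucene", words, max_distance=1))
```

The function keeps only two rows of the dynamic-programming table, so memory use is proportional to the shorter string rather than to the product of both lengths.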