I hope I am wording this clearly enough to get across what I am looking for.
I need to compare two pieces of text. If the two strings are alike, I would like to get scores that are very close; if the strings are very different, I need scores that are very different.
It really depends on what you mean by "same" or "different". For example, if someone replaces "United States of America" with "USA" in your string, is that mostly the same string (because USA is just an abbreviation for something longer), or is it very different (because a lot of characters changed)?
You essentially need to either devise a function that computes "sameness" or use a pre-existing definition of it. For example, the aforementioned Levenshtein distance measures the difference as the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other.
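To make that concrete, here is a minimal Python sketch of Levenshtein distance, plus one possible way to turn it into a 0-to-1 score. The function names `levenshtein` and `similarity` are my own, and dividing by the longer string's length is just one assumed normalization, not a standard you have to follow.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Hypothetical normalization: 1.0 means identical, 0.0 means nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein(a, b) / longest


print(levenshtein("kitten", "sitting"))               # 3
print(levenshtein("United States of America", "USA"))  # 21
print(similarity("United States of America", "USA"))   # 0.125
```

Note how the abbreviation example scores as very different (0.125) under this definition, even though a human would call the two strings equivalent. If that is not what you want, a purely character-based measure is the wrong definition of "sameness" for your case.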