Alternative to Levenshtein and Trigram
Question

Say I have the following two strings in my database:

(1) 'Levi Watkins Learning Center - Alabama State University'
(2) 'ETH Library'

My software receives free-text inputs from a data source and should match each input to one of the predefined strings in the database (the ones above). For example, if the software gets the string 'Alabama University', it should recognize that this is more similar to (1) than to (2).

At first, I thought of using a well-known string metric like