string-metric

Alternative to Levenshtein and Trigram

半城伤御伤魂 submitted on 2020-01-01 04:17:07
Question: Say I have the following two strings in my database: (1) 'Levi Watkins Learning Center - Alabama State University' (2) 'ETH Library'. My software receives free-text inputs from a data source and should match them to the predefined strings in the database (the ones above). For example, if the software gets the string 'Alabama University', it should recognize that this is more similar to (1) than to (2). At first, I thought of using a well-known string metric like

Mapping arbitrary strings to RGB values

好久不见. submitted on 2019-11-30 19:16:41
I have a huge set of arbitrary natural language strings. For my tool to analyze them, I need to convert each string to a unique color value (RGB or other). I need color contrast to depend on string similarity (the more one string differs from another, the more their respective colors should differ). Ideally, I would always get the same color value for the same string. Any advice on how to approach this problem? Update on distance between strings: I probably need "similarity" defined as a Levenshtein-like distance. No natural language parsing is required. That is: "I am going to the

Mapping arbitrary strings to RGB values

拟墨画扇 submitted on 2019-11-30 03:25:24
Question: I have a huge set of arbitrary natural language strings. For my tool to analyze them, I need to convert each string to a unique color value (RGB or other). I need color contrast to depend on string similarity (the more one string differs from another, the more their respective colors should differ). Ideally, I would always get the same color value for the same string. Any advice on how to approach this problem? Update on distance between strings: I probably need "similarity" defined

How to compare almost similar Strings in Java? (String distance measure) [closed]

本秂侑毒 submitted on 2019-11-27 17:31:24
I would like to compare two strings and get a score for how much they look alike. For example, "The sentence is almost similar" and "The sentence is similar". I'm not familiar with existing methods in Java, but for PHP I know the levenshtein function. Are there better methods in Java? Answer by Joey: The Levenshtein distance is a measure of how similar strings are. Or, more precisely, of how many alterations have to be made to make them the same. The algorithm is available in pseudo-code on Wikipedia. Converting that to Java shouldn't be much of a problem, but it's not built-in into the base class
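
As an illustration of the conversion the answer describes, here is a minimal sketch of the Levenshtein distance in plain Java. The class name `LevenshteinSketch`, the method names, and the normalization of the distance into a 0..1 similarity score are assumptions made for this example; they do not come from the original answer or from any particular library.

```java
// Hypothetical helper class for illustration; not part of the Java standard library.
public final class LevenshteinSketch {
    private LevenshteinSketch() {}

    /**
     * Returns the minimum number of single-character insertions, deletions,
     * and substitutions needed to turn a into b.
     */
    public static int distance(String a, String b) {
        // Two rolling rows of the dynamic-programming table.
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Turning the empty prefix of a into the first j chars of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            // Turning the first i chars of a into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev holds the row just computed.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Assumed normalization: 1.0 for identical strings, 0.0 when every
     * position differs. One of several possible ways to derive a score.
     */
    public static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) distance(a, b) / maxLen;
    }

    public static void main(String[] args) {
        String a = "The sentence is almost similar";
        String b = "The sentence is similar";
        // Removing "almost " (7 characters) turns a into b, so the distance is 7.
        System.out.println(distance(a, b));
        System.out.println(similarity(a, b));
    }
}
```

The sketch compares `char` values, so characters outside the Basic Multilingual Plane count as two units each; keeping only two rows of the table keeps memory proportional to the length of the second string.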

How to compare almost similar Strings in Java? (String distance measure) [closed]

点点圈 submitted on 2019-11-26 22:33:49
Question: I would like to compare two strings and get a score for how much they look alike. For example, "The sentence is almost similar" and "The sentence is similar". I'm not familiar with existing methods in Java, but for PHP I know the levenshtein function. Are there better methods in Java? Answer 1: The Levenshtein