I have the following working Java code that searches for a word in a list of words, and it works as expected:
public class Levenshtein {
privat
Since you asked, I'll show what the UMBC semantic network can do with this kind of thing. I'm not sure it's what you really want:
import static java.lang.String.format;
import static java.util.function.Function.identity;
import static java.util.stream.Collectors.toMap;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.Arrays;
import java.util.regex.Pattern;

public class SemanticSimilarity {

    // Similarity service endpoint; the two %s placeholders take the words to compare.
    private static final String GET_URL_FORMAT
            = "http://swoogle.umbc.edu/SimService/GetSimilarity?"
            + "operation=api&phrase1=%s&phrase2=%s";

    // Only plain word characters are accepted, so words go into the URL unescaped.
    private static final Pattern VALID_WORD_PATTERN = Pattern.compile("\\w+");

    private static final String[] DICT = {
        "cat",
        "building",
        "girl",
        "ranch",
        "drawing",
        "wool",
        "gear",
        "question",
        "information",
        "tank"
    };

    // Performs an HTTP GET and returns the first line of the response body.
    public static String httpGetLine(String urlToRead) throws IOException {
        URL url = new URL(urlToRead);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(conn.getInputStream()))) {
            return reader.readLine();
        }
    }

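    // Returns the service's similarity score for a and b, or -1.0 if the lookup fails.
    public static double getSimilarity(String a, String b) {
        if (!VALID_WORD_PATTERN.matcher(a).matches()
                || !VALID_WORD_PATTERN.matcher(b).matches()) {
            throw new IllegalArgumentException("Bad word");
        }
        try {
            String line = httpGetLine(format(GET_URL_FORMAT, a, b));
            // readLine() returns null for an empty response; parseDouble(null)
            // would throw a NullPointerException, so treat it as a failed lookup.
            return line == null ? -1.0 : Double.parseDouble(line.trim());
        } catch (IOException | NumberFormatException ex) {
            return -1.0;
        }
    }
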
    // Scores every dictionary word against target and prints them, best match first.
    public static void test(String target) {
        System.out.println("Target: " + target);
        Arrays.stream(DICT)
                .collect(toMap(identity(), word -> getSimilarity(target, word)))
                .entrySet().stream()
                .sorted((a, b) -> Double.compare(b.getValue(), a.getValue()))
                .forEach(System.out::println);
        System.out.println();
    }

    public static void main(String[] args) throws Exception {
        test("sheep");
        test("vehicle");
        test("house");
        test("data");
        test("girlfriend");
    }
}
The results are kind of fascinating:
Target: sheep
ranch=0.38563728
cat=0.37816614
wool=0.36558008
question=0.047607
girl=0.0388761
information=0.027191084
drawing=0.0039623436
tank=0.0
building=0.0
gear=0.0

Target: vehicle
tank=0.65860236
gear=0.2673374
building=0.20197356
cat=0.06057514
information=0.041832563
ranch=0.017701812
question=0.017145569
girl=0.010708235
wool=0.0
drawing=0.0

Target: house
building=1.0
ranch=0.104496084
tank=0.103863
wool=0.059761923
girl=0.056549154
drawing=0.04310725
cat=0.0418914
gear=0.026439993
information=0.020329408
question=0.0012588014

Target: data
information=0.9924584
question=0.03476312
gear=0.029112043
wool=0.019744944
tank=0.014537057
drawing=0.013742204
ranch=0.0
cat=0.0
girl=0.0
building=0.0

Target: girlfriend
girl=0.70060706
ranch=0.11062875
cat=0.09766617
gear=0.04835723
information=0.02449007
wool=0.0
question=0.0
drawing=0.0
tank=0.0
building=0.0
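
If you want a single answer rather than a ranked list, one option is to keep only the top-scoring word and reject it when the score is too low. Here is a minimal sketch of that idea. The class name BestMatch, the bestMatch method and the minScore threshold are my own illustrative names, not part of the service, and the scores are stubbed with the "vehicle" numbers from above so it runs without a network connection. In real use you would pass SemanticSimilarity::getSimilarity as the scorer.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.ToDoubleBiFunction;

public class BestMatch {

    /**
     * Returns the dictionary word with the highest score against target,
     * or empty if no word reaches minScore. Ties go to the earlier word.
     * (Hypothetical helper; minScore is an assumed cut-off, tune it yourself.)
     */
    public static Optional<String> bestMatch(String target, List<String> dict,
            ToDoubleBiFunction<String, String> scorer, double minScore) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String word : dict) {
            double score = scorer.applyAsDouble(target, word);
            // Failed lookups (-1.0) and unrelated words (0.0) fall below any
            // positive threshold, so they are never returned.
            if (score >= minScore && (best == null || score > bestScore)) {
                best = word;
                bestScore = score;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        // Stubbed scores copied from the "vehicle" run above.
        Map<String, Double> vehicleScores = new HashMap<>();
        vehicleScores.put("tank", 0.65860236);
        vehicleScores.put("gear", 0.2673374);
        vehicleScores.put("building", 0.20197356);
        vehicleScores.put("wool", 0.0);

        List<String> dict = Arrays.asList("gear", "building", "tank", "wool");
        ToDoubleBiFunction<String, String> stub = (target, word) -> vehicleScores.get(word);

        System.out.println(bestMatch("vehicle", dict, stub, 0.5)); // Optional[tank]
        System.out.println(bestMatch("vehicle", dict, stub, 0.9)); // Optional.empty
    }
}
```

Judging by the runs above, a threshold somewhere around 0.3 to 0.5 would keep the obviously related pairs and drop the noise, but that is only a guess from five targets.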