Is Levenshtein distance symmetric?

落爺英雄遲暮 submitted on 2019-12-05 02:46:58

Looking at the basic algorithm, it is definitely symmetric as long as every operation has the same cost: the number of additions, deletions and substitutions needed to get from a word A to a word B is the same as the number needed to get from B to A.
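To make that concrete, here is a minimal sketch of the textbook dynamic-programming formulation with a cost of 1 for every operation. The class and method names are only illustrative, and this is not the code of any particular library.

```java
// Sketch: unit-cost Levenshtein distance using a full DP table.
// UnitCostLevenshtein and distance are hypothetical names for illustration.
final class UnitCostLevenshtein {

    static int distance(CharSequence a, CharSequence b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];

        // Turning a prefix of a into the empty string takes i deletions.
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        // Turning the empty string into a prefix of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int same = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = d[i - 1][j - 1] + same;
                int deletion = d[i - 1][j] + 1;
                int insertion = d[i][j - 1] + 1;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // Both directions give the same value when all costs are equal.
        System.out.println(distance("Zombie", "Zombies"));
        System.out.println(distance("Zombies", "Zombie"));
    }
}
```

Swapping the arguments only swaps the roles of insertion and deletion, and since both cost 1 here the result does not change.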

If the operations have different costs, the result can depend on the direction. For example, if addition has a cost of 2 and deletion a cost of 1, getting from Zombie to Zombies gives a distance of 2, while the other way round gives 1, so it is not symmetric.
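The Zombie/Zombies example can be reproduced with a weighted variant of the same recurrence. This is only a sketch: the class name, the parameter names and the substitution cost of 1 are my assumptions, since the example above only fixes the addition and deletion costs.

```java
// Sketch: Levenshtein distance with separate costs per operation.
// WeightedLevenshtein and its parameter names are hypothetical.
final class WeightedLevenshtein {

    static int distance(CharSequence a, CharSequence b,
                        int insertCost, int deleteCost, int substituteCost) {
        int[][] d = new int[a.length() + 1][b.length() + 1];

        // Removing the first i characters of a costs i deletions.
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i * deleteCost;
        }
        // Building the first j characters of b costs j insertions.
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j * insertCost;
        }

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int change = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : substituteCost;
                int substitution = d[i - 1][j - 1] + change;
                int deletion = d[i - 1][j] + deleteCost;
                int insertion = d[i][j - 1] + insertCost;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // Addition costs 2, deletion costs 1, substitution costs 1 (assumed).
        System.out.println(distance("Zombie", "Zombies", 2, 1, 1)); // one addition
        System.out.println(distance("Zombies", "Zombie", 2, 1, 1)); // one deletion
    }
}
```

With equal insertion and deletion costs the two calls agree again, which is the symmetric case.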

The classical Levenshtein algorithm is symmetric: what is an insertion going from x1 to x2 is a deletion going from x2 to x1, and a substitution stays a substitution in both directions.

Unfortunately, the algorithm is O(length(x1) * length(x2)). After a brief look at Google's library, it seems to use some heuristics to keep the runtime from getting too large. I think that is where your discrepancy comes from.

Yes, the Levenshtein distance is a distance in the proper sense; that is, dist(a,b)==dist(b,a) is part of the definition of a distance. If a function does not have this property, it is not a distance function. This suggests a problem with that implementation.
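One way to check whether a given implementation really has this property is to run it in both directions over some sample strings. A rough sketch follows; the helper name findAsymmetricPair and the sample words are made up for illustration, and you would pass in whatever distance function you want to test.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntBiFunction;

// Sketch: brute-force symmetry check for any string distance function.
// SymmetryCheck and findAsymmetricPair are hypothetical names.
final class SymmetryCheck {

    // Returns the first pair {a, b} with dist(a, b) != dist(b, a),
    // or null if every sampled pair agrees in both directions.
    static String[] findAsymmetricPair(ToIntBiFunction<String, String> dist,
                                       List<String> samples) {
        for (int i = 0; i < samples.size(); i++) {
            for (int j = i + 1; j < samples.size(); j++) {
                String a = samples.get(i);
                String b = samples.get(j);
                if (dist.applyAsInt(a, b) != dist.applyAsInt(b, a)) {
                    return new String[] {a, b};
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("Zombie", "Zombies", "kitten");

        // A deliberately asymmetric toy function: counts only the characters
        // that would have to be added, and ignores deletions.
        ToIntBiFunction<String, String> addOnly =
                (a, b) -> Math.max(0, b.length() - a.length());

        String[] pair = findAsymmetricPair(addOnly, samples);
        System.out.println(pair == null ? "no asymmetry found" : pair[0] + " / " + pair[1]);
    }
}
```

Passing this check on a handful of samples does not prove symmetry, but a single failing pair is enough to show that the function is not a distance in the strict sense.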

Please see the following code, which I implemented myself:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Assumption: StringUtils is the Apache Commons Lang 3 class, whose
// getLevenshteinDistance accepts CharSequence arguments.
import org.apache.commons.lang3.StringUtils;

public class ReadTextFile {

    // Reads the file line by line, takes the second space-separated token of
    // each line, and prints the adjacent pair with the smallest Levenshtein distance.
    static void readFile(String filepath) {
        String previous = null;
        String line1 = "";
        String line2 = "";
        int minLevenshteinDistance = -1; // -1 means no pair has been compared yet

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(filepath))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split(" ");
                if (parts.length < 2) {
                    continue; // skip lines that have no second token
                }
                String current = parts[1];

                if (previous != null) {
                    int levenshteinDistance = StringUtils.getLevenshteinDistance(previous, current);

                    if (minLevenshteinDistance == -1 || levenshteinDistance < minLevenshteinDistance) {
                        minLevenshteinDistance = levenshteinDistance;
                        line1 = previous;
                        line2 = current;
                    }
                }
                previous = current;
            }

            System.out.println("line1 " + line1);
            System.out.println("line2 " + line2);
            System.out.println("minlevenshteinDistance " + minLevenshteinDistance);

        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }

}
