Levenshtein distance symmetric?

Posted by 一笑奈何 on 2019-12-22 04:18:41

Question


I was told that Levenshtein distance is symmetric. However, when I used Google's diffMatchPatch tool, which computes Levenshtein distance among other things, the results suggest otherwise, i.e. Levenshtein(x1,x2) is not equal to Levenshtein(x2,x1). Is Levenshtein distance not symmetric, or is there a problem with that particular implementation? Thanks.


Answer 1:


Looking at the basic algorithm, it is definitely symmetric as long as every operation has the same cost: the number of additions, deletions and substitutions needed to get from a word A to a word B is the same as the number needed to get from word B to word A.
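To make that concrete, here is a minimal sketch of the textbook dynamic-programming algorithm with every operation costing 1. The class and method names are my own for illustration and are not taken from any library mentioned in this thread.

```java
class LevenshteinSymmetry {

    // Textbook Levenshtein distance; insertion, deletion and substitution all cost 1.
    // Keeps only two rows of the DP table at a time.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building the first j chars of b from "" takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // reducing the first i chars of a to "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] tmp = prev; // swap the rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting")); // prints 3
        System.out.println(levenshtein("sitting", "kitten")); // prints 3
    }
}
```

Swapping the arguments just swaps the roles of insertion and deletion, and since both cost 1 the result is the same either way.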

If the operations have different costs, the distance can differ, though. For example, if addition has a cost of 2 and deletion a cost of 1, getting from Zombie to Zombies results in a distance of 2, while the other way round it would be 1, which is not symmetric.
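The Zombie/Zombies case can be reproduced with a small weighted variant. This is only a sketch: the class name, the method name and the cost parameters are hypothetical, and the substitution cost of 1 is my assumption since it is not stated above.

```java
class WeightedLevenshtein {

    // Edit distance from a to b with a separate cost per operation type.
    static int distance(String a, String b, int insertCost, int deleteCost, int substituteCost) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 1; i <= a.length(); i++) {
            d[i][0] = d[i - 1][0] + deleteCost; // delete every char of a
        }
        for (int j = 1; j <= b.length(); j++) {
            d[0][j] = d[0][j - 1] + insertCost; // insert every char of b
        }
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                boolean same = a.charAt(i - 1) == b.charAt(j - 1);
                int substitution = d[i - 1][j - 1] + (same ? 0 : substituteCost);
                int deletion = d[i - 1][j] + deleteCost;
                int insertion = d[i][j - 1] + insertCost;
                d[i][j] = Math.min(substitution, Math.min(deletion, insertion));
            }
        }
        return d[a.length()][b.length()];
    }

    public static void main(String[] args) {
        // insertion costs 2, deletion costs 1, substitution costs 1 (assumed)
        System.out.println(distance("Zombie", "Zombies", 2, 1, 1)); // prints 2
        System.out.println(distance("Zombies", "Zombie", 2, 1, 1)); // prints 1
    }
}
```

With equal insertion and deletion costs the two calls return the same value again.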




Answer 2:


The classical Levenshtein algorithm is symmetric: what is an insertion going from x1 to x2 is a deletion going from x2 to x1.

Unfortunately, the algorithm is O(length(x1) * length(x2)). After a brief look at Google's library, it seems to use some heuristics to keep the runtime from getting too large. I think that is where your discrepancy comes from.




Answer 3:


Yes, the Levenshtein distance is a distance in the proper sense: dist(a,b)==dist(b,a) is part of the definition of a distance. If a function does not have this property, it is not a distance function. This suggests a problem with that implementation.




Answer 4:


Please see the following code, which I implemented myself:

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Assumed to be Apache Commons Lang 3; the original post does not say which StringUtils is used.
import org.apache.commons.lang3.StringUtils;

public class ReadTextFile {

    // Reads the file line by line, takes the second space-separated token of each line,
    // and prints the pair of consecutive tokens with the smallest Levenshtein distance.
    static void readFile(String filepath) {
        String previous = null;
        String line1 = "";
        String line2 = "";
        int minLevenshteinDistance = -1; // -1 means no pair has been compared yet

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader br = new BufferedReader(new FileReader(filepath))) {
            String line;
            while ((line = br.readLine()) != null) {
                String[] parts = line.split(" ");
                if (parts.length < 2) {
                    continue; // skip lines that have no second token
                }
                String current = parts[1];

                if (previous != null) {
                    int levenshteinDistance = StringUtils.getLevenshteinDistance(previous, current);
                    if (minLevenshteinDistance == -1 || levenshteinDistance < minLevenshteinDistance) {
                        minLevenshteinDistance = levenshteinDistance;
                        line1 = previous;
                        line2 = current;
                    }
                }
                previous = current;
            }

            System.out.println("line1 " + line1);
            System.out.println("line2 " + line2);
            System.out.println("minlevenshteinDistance " + minLevenshteinDistance);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}



Source: https://stackoverflow.com/questions/9722022/levenshtein-distance-symmetric
