Fuzzy String Comparison

情歌与酒 2020-11-29 17:01

I am trying to write a program that reads in a file and compares each sentence against the original sentence. The sentence which is a perfect match to

4 answers
  •  陌清茗 (original poster)
    2020-11-29 17:32

    There is a package called fuzzywuzzy. Install via pip:

    pip install fuzzywuzzy
    

    Simple usage:

    >>> from fuzzywuzzy import fuzz
    >>> fuzz.ratio("this is a test", "this is a test!")
        96
    

    The package is built on top of difflib. Why not just use that, you ask? Apart from being a bit simpler, it offers several matching methods (such as token-order insensitivity and partial string matching) that make it more powerful in practice. The process.extract functions are especially useful: they find the best-matching strings, and their ratios, from a set of choices. From their readme:

    Partial Ratio

    >>> fuzz.partial_ratio("this is a test", "this is a test!")
        100
    

    Token Sort Ratio

    >>> fuzz.ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
        90
    >>> fuzz.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
        100
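
To make the idea of token-order insensitivity concrete, here is a minimal sketch built directly on difflib. It is an approximation for illustration, not fuzzywuzzy's actual implementation; the function name token_sort_ratio_sketch is hypothetical, and the normalization (lowercasing and splitting on whitespace) is an assumption.

```python
from difflib import SequenceMatcher


def token_sort_ratio_sketch(a: str, b: str) -> int:
    """Approximate an order-insensitive score: sort the tokens, then compare."""

    def normalize(s: str) -> str:
        # Lowercase, split on whitespace, and sort the tokens alphabetically
        # so that word order no longer affects the comparison.
        return " ".join(sorted(s.lower().split()))

    # SequenceMatcher.ratio() returns a float in [0, 1]; scale it to 0-100.
    matcher = SequenceMatcher(None, normalize(a), normalize(b))
    return round(100 * matcher.ratio())


print(token_sort_ratio_sketch("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))  # 100
```

Both inputs normalize to the same sorted token string, so the score is 100 even though the word order differs.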
    

    Token Set Ratio

    >>> fuzz.token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
        84
    >>> fuzz.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
        100
    

    Process

    >>> from fuzzywuzzy import process
    >>> choices = ["Atlanta Falcons", "New York Jets", "New York Giants", "Dallas Cowboys"]
    >>> process.extract("new york jets", choices, limit=2)
        [('New York Jets', 100), ('New York Giants', 78)]
    >>> process.extractOne("cowboys", choices)
        ('Dallas Cowboys', 90)
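
For the program described in the question (read a file, compare each sentence against an original), a rough sketch using only the standard library's difflib might look like the following. The function names similarity, rank_sentences, and read_sentences are hypothetical, and the sketch assumes the file holds one sentence per line. With fuzzywuzzy installed, process.extract could take the place of rank_sentences.

```python
from difflib import SequenceMatcher
from pathlib import Path


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score between two strings."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def rank_sentences(original: str, candidates: list[str]) -> list[tuple[str, int]]:
    """Score every candidate against the original, best match first."""
    scored = [(sentence, similarity(original, sentence)) for sentence in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def read_sentences(path: str) -> list[str]:
    """Read a file, assuming one sentence per line; blank lines are skipped."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


if __name__ == "__main__":
    original = "this is a test"
    candidates = ["something else entirely", "this is a test!", "this is a test"]
    for sentence, score in rank_sentences(original, candidates):
        print(score, sentence)
```

A perfect match scores 100 and sorts to the top; to read candidates from disk, pass the result of read_sentences to rank_sentences instead of the inline list.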
    
