KMP prefix table

说谎 2020-12-02 07:58

I am reading about the KMP algorithm for string matching.
It requires preprocessing the pattern by building a prefix table.
For example, for the string ababaca

7 answers
  •  刺人心 (original poster)
    2020-12-02 08:27

    Each number corresponds to one prefix of the pattern ("a", "ab", "aba", ...). For each prefix, it is the length of the longest suffix of that prefix that is also a prefix of it. The whole string is not counted as its own suffix or prefix here; in English these restricted ones are called proper prefixes and proper suffixes (in Russian the whole string is called a self-suffix and self-prefix).

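    Take the string "ababaca". KMP computes the prefix function for every non-empty prefix of it. Let s be the string and p[i] the value of the prefix function for the prefix s[0..i], that is, the first i+1 characters. The matching prefix and suffix may overlap, as in "ababa", where "aba" is both a prefix and a suffix.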

    +---+----------+-------+------------------------+
    | i | s[0..i]  | p[i]  | Matching Prefix/Suffix |
    +---+----------+-------+------------------------+
    | 0 | a        |     0 |                        |
    | 1 | ab       |     0 |                        |
    | 2 | aba      |     1 | a                      |
    | 3 | abab     |     2 | ab                     |
    | 4 | ababa    |     3 | aba                    |
    | 5 | ababac   |     0 |                        |
    | 6 | ababaca  |     1 | a                      |
    +---+----------+-------+------------------------+
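
    As a sanity check on the table, the definition can also be evaluated directly by brute force. This is only a sketch for small inputs and is not how KMP builds the table; the function name naivePrefixFunction is made up for this illustration.

    ```cpp
    #include <cstddef>
    #include <string>
    #include <vector>

    // Brute-force prefix function, written straight from the definition.
    // naivePrefixFunction is a hypothetical name used only in this sketch.
    // For each prefix s[0..i], try every proper length k, largest first, and
    // keep the first k for which the first k characters equal the last k.
    // Roughly cubic time, so this is meant for checking small examples only.
    std::vector<int> naivePrefixFunction(const std::string& s) {
        std::vector<int> p(s.size(), 0);
        for (std::size_t i = 0; i < s.size(); i++) {
            // Proper lengths only: k <= i, i.e. shorter than the prefix itself.
            for (std::size_t k = i; k > 0; k--) {
                // Compare s[0..k-1] with s[i-k+1..i]; the ranges may overlap.
                if (s.compare(0, k, s, i - k + 1, k) == 0) {
                    p[i] = static_cast<int>(k);
                    break;
                }
            }
        }
        return p;
    }
    ```

    Calling naivePrefixFunction("ababaca") gives 0 0 1 2 3 0 1, the same values as the p[i] column.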
    

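    Simple C++ code that computes the prefix function of a string s:

    #include <string>
    #include <vector>
    using namespace std;

    vector<int> prefixFunction(const string& s) {
        vector<int> p(s.size());   // p[0] stays 0: "a" has no proper prefix
        int j = 0;                 // length of the prefix matched so far
        for (int i = 1; i < (int)s.size(); i++) {
            // On a mismatch, fall back to the next shorter matching prefix
            while (j > 0 && s[j] != s[i])
                j = p[j-1];

            // Extend the match by one character if s[i] fits
            if (s[j] == s[i])
                j++;
            p[i] = j;
        }
        return p;
    }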
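
    To show what the table is for, here is a rough sketch of the matching phase that uses it. The helper kmpSearch and its choice to return all start positions are assumptions made for this example, not something from the question; the prefix function is repeated so the snippet compiles on its own.

    ```cpp
    #include <string>
    #include <vector>

    // Prefix function: p[i] is the length of the longest proper prefix of
    // s[0..i] that is also a suffix of s[0..i].
    std::vector<int> prefixFunction(const std::string& s) {
        std::vector<int> p(s.size(), 0);
        int j = 0;
        for (int i = 1; i < (int)s.size(); i++) {
            while (j > 0 && s[j] != s[i]) j = p[j - 1];
            if (s[j] == s[i]) j++;
            p[i] = j;
        }
        return p;
    }

    // Hypothetical helper: returns the start index of every occurrence of
    // pattern in text, overlapping occurrences included.
    std::vector<int> kmpSearch(const std::string& text, const std::string& pattern) {
        std::vector<int> matches;
        if (pattern.empty()) return matches;  // assumption: empty pattern matches nothing
        const std::vector<int> p = prefixFunction(pattern);
        const int m = (int)pattern.size();
        int j = 0;  // number of pattern characters currently matched
        for (int i = 0; i < (int)text.size(); i++) {
            // On a mismatch, reuse the table instead of moving back in text.
            while (j > 0 && pattern[j] != text[i]) j = p[j - 1];
            if (pattern[j] == text[i]) j++;
            if (j == m) {
                matches.push_back(i - m + 1);
                j = p[j - 1];  // continue so overlapping matches are found
            }
        }
        return matches;
    }
    ```

    For example, kmpSearch("abababaca", "ababaca") returns a single position, 2. When the 'c' of the pattern fails against the 'b' at text index 5, j drops from 5 to p[4] = 3 and the scan continues without re-reading earlier text.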
    
