Boyer Moore Algorithm Understanding and Example?

后端 未结 2 765
深忆病人
深忆病人 2020-12-22 17:00

I am facing issues in understanding Boyer Moore String Search algorithm.

I am following the following document. Link

I am not able to work out my way as to e

2条回答
  •  余生分开走
    2020-12-22 17:41

    The insight behind Boyer-Moore is that if you start searching for a pattern in a string starting with the last character in the pattern, you can jump your search forward multiple characters when you hit a mismatch.

    Let's say our pattern p is the sequence of characters p1, p2, ..., pn and we are searching a string s, currently with p aligned so that pn is at index i in s.

    E.g.:

    s = WHICH FINALLY HALTS.  AT THAT POINT...
    p = AT THAT
    i =       ^
    

    The B-M paper makes the following observations:

    (1) if we try matching a character that is not in p then we can jump forward n characters:

    'F' is not in p, hence we advance n characters:

    s = WHICH FINALLY HALTS.  AT THAT POINT...
    p =        AT THAT
    i =              ^
    

    (2) if we try matching a character whose last position is k from the end of p then we can jump forward k characters:

    ' 's last position in p is 4 from the end, hence we advance 4 characters:

    s = WHICH FINALLY HALTS.  AT THAT POINT...
    p =            AT THAT
    i =                  ^
    

    Now we scan backwards from i until we either succeed or we hit a mismatch. (3a) if the mismatch occurs k characters from the start of p and the mismatched character is not in p, then we can advance (at least) k characters.

    'L' is not in p and the mismatch occurred against p6, hence we can advance (at least) 6 characters:

    s = WHICH FINALLY HALTS.  AT THAT POINT...
    p =                  AT THAT
    i =                        ^
    

    However, we can actually do better than this. (3b) since we know that at the old i we'd already matched some characters (1 in this case). If the matched characters don't match the start of p, then we can actually jump forward a little more (this extra distance is called 'delta2' in the paper):

    s = WHICH FINALLY HALTS.  AT THAT POINT...
    p =                   AT THAT
    i =                         ^
    

    At this point, observation (2) applies again, giving

    s = WHICH FINALLY HALTS.  AT THAT POINT...
    p =                       AT THAT
    i =                             ^
    

    and bingo! We're done.

提交回复
热议问题