Boyer-Moore-Horspool Algorithm for All Matches (Find Byte array inside Byte array)

Submitted by 北城以北 on 2019-12-18 12:33:04

Question


Here is my implementation of the BMH algorithm (it works like a charm):

public static Int64 IndexOf(this Byte[] value, Byte[] pattern)
{
    if (value == null)
        throw new ArgumentNullException("value");

    if (pattern == null)
        throw new ArgumentNullException("pattern");

    Int64 valueLength = value.LongLength;
    Int64 patternLength = pattern.LongLength;

    if ((valueLength == 0) || (patternLength == 0) || (patternLength > valueLength))
        return -1;

    Int64[] badCharacters = new Int64[256];

    for (Int64 i = 0; i < 256; ++i)
        badCharacters[i] = patternLength;

    Int64 lastPatternByte = patternLength - 1;

    for (Int64 i = 0; i < lastPatternByte; ++i)
        badCharacters[pattern[i]] = lastPatternByte - i;

    // Beginning

    Int64 index = 0;

    while (index <= (valueLength - patternLength))
    {
        for (Int64 i = lastPatternByte; value[(index + i)] == pattern[i]; --i)
        {
            if (i == 0)
                return index;
        }

        index += badCharacters[value[(index + lastPatternByte)]];
    }

    return -1;
}

I tried to modify it to return all matches instead of only the first index, but I'm getting an IndexOutOfRangeException everywhere D:

Obviously I'm missing something important, or I didn't properly understand how it works. What am I doing wrong?

public static List<Int64> IndexesOf(this Byte[] value, Byte[] pattern)
{
    if (value == null)
        throw new ArgumentNullException("value");

    if (pattern == null)
        throw new ArgumentNullException("pattern");

    Int64 valueLength = value.LongLength;
    Int64 patternLength = pattern.LongLength;

    if ((valueLength == 0) || (patternLength == 0) || (patternLength > valueLength))
        return (new List<Int64>());

    Int64[] badCharacters = new Int64[256];

    for (Int64 i = 0; i < 256; ++i)
        badCharacters[i] = patternLength;

    Int64 lastPatternByte = patternLength - 1;

    for (Int64 i = 0; i < lastPatternByte; ++i)
        badCharacters[pattern[i]] = lastPatternByte - i;

    // Beginning

    Int64 index = 0;
    List<Int64> indexes = new List<Int64>();

    while (index <= (valueLength - patternLength))
    {
        for (Int64 i = lastPatternByte; value[(index + i)] == pattern[i]; --i)
        {
            if (i == 0)
                indexes.Add(index);
        }

        index += badCharacters[value[(index + lastPatternByte)]];
    }

    return indexes;
}

Answer 1:


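After a full match at i == 0, the for loop still executes --i, so i becomes -1 and the loop condition then evaluates value[index - 1] == pattern[-1], which throws IndexOutOfRangeException. IndexOf never hits this because it returns at that point. Change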

if (i == 0)
    indexes.Add(index);

to

if (i == 0)
{
    indexes.Add(index);
    break;
}
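
For reference, here is a sketch of the complete method with the break applied. The wrapper class name ByteArraySearch is an assumption, added only so the extension method compiles on its own; the rest follows the question's code. The shift after a match is the ordinary bad-character shift, which is always at least 1, so overlapping matches are reported as well.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container class; extension methods must live in a static class.
public static class ByteArraySearch
{
    // Returns the start index of every occurrence of pattern in value.
    public static List<Int64> IndexesOf(this Byte[] value, Byte[] pattern)
    {
        if (value == null)
            throw new ArgumentNullException("value");

        if (pattern == null)
            throw new ArgumentNullException("pattern");

        Int64 valueLength = value.LongLength;
        Int64 patternLength = pattern.LongLength;
        List<Int64> indexes = new List<Int64>();

        if ((valueLength == 0) || (patternLength == 0) || (patternLength > valueLength))
            return indexes;

        // Default shift for bytes that do not occur in the pattern.
        Int64[] badCharacters = new Int64[256];

        for (Int64 i = 0; i < 256; ++i)
            badCharacters[i] = patternLength;

        Int64 lastPatternByte = patternLength - 1;

        // Bytes in the pattern (last position excluded) get a shorter shift.
        for (Int64 i = 0; i < lastPatternByte; ++i)
            badCharacters[pattern[i]] = lastPatternByte - i;

        Int64 index = 0;

        while (index <= (valueLength - patternLength))
        {
            // Compare right to left; stop at the first mismatch.
            for (Int64 i = lastPatternByte; value[(index + i)] == pattern[i]; --i)
            {
                if (i == 0)
                {
                    indexes.Add(index);
                    break; // leave the loop before --i turns i into -1
                }
            }

            index += badCharacters[value[(index + lastPatternByte)]];
        }

        return indexes;
    }
}
```

For example, searching { 1, 2, 1, 2, 1 } for { 1, 2, 1 } with this sketch yields the indexes 0 and 2.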


Source: https://stackoverflow.com/questions/16252518/boyer-moore-horspool-algorithm-for-all-matches-find-byte-array-inside-byte-arra
