Optimized version of strstr (search has constant length)

孤城傲影 2020-12-31 08:42

My C program had a lot of strstr function calls. The standard library strstr is already fast, but in my case the search string always has a length of 5 characters. I replaced i

5 answers
  •  独厮守ぢ
    2020-12-31 08:56

    strstr's interface imposes some constraints that can be beaten. It takes null-terminated strings, and any competitor that first does a "strlen" of its target will lose. It takes no "state" argument, so set-up costs can't be amortized across many calls with (say) the same target or pattern. It is expected to work on a wide range of inputs, including very short targets/patterns, and pathological data (consider searching for "ABABAC" in a string of "ABABABABAB...C"). libc is also now platform-dependent. In the x86-64 world, SSE2 is seven years old, and libc's strlen and strchr using SSE2 are 6-8 times faster than naive algorithms. On Intel platforms that support SSE4.2, strstr uses the PCMPESTRI instruction. But you can beat that, too.
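
As a rough illustration of working within those constraints for the 5-character case from the question, here is a minimal sketch. It never calls strlen on the target and leans on libc's strchr to jump between candidate positions. The name find5 is hypothetical, and the code assumes the pattern is exactly 5 non-null characters. Whether it actually beats your libc's strstr is something only a benchmark on your data can tell.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): find the first
 * occurrence of a 5-character pattern in a null-terminated string.
 * Assumes pat points to exactly 5 non-null characters. */
const char *find5(const char *hay, const char *pat)
{
    assert(pat[0] != '\0');  /* strchr on '\0' would match the terminator */

    /* Let strchr skip ahead to each position where the first
     * character of the pattern occurs. */
    for (const char *p = strchr(hay, pat[0]); p != NULL;
         p = strchr(p + 1, pat[0])) {
        /* strncmp stops at the terminator, so this never reads
         * past the end of hay. */
        if (strncmp(p + 1, pat + 1, 4) == 0)
            return p;
    }
    return NULL;
}
```

A call such as find5(line, "ERROR") returns a pointer to the first match or NULL, the same convention as strstr.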

    Boyer-Moore (and Turbo B-M, and Backward Oracle Matching, et al.) have set-up times that pretty much knock them out of the running, not even counting the null-terminated-string problem. Horspool is a restricted B-M that works well in practice, but doesn't handle the edge cases well. The best I've found in that field is BNDM ("Backward Nondeterministic Directed-Acyclic-Word-Graph Matching"), whose implementation is smaller than its name :-)
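
For readers who have not seen Horspool, a minimal sketch follows. The struct and function names are hypothetical, and the search takes an explicit text length, which sidesteps rather than solves the null-terminator problem mentioned above. The set-up cost is the 256-entry skip table, which is why it is split into a separate prepare step that can be reused across calls.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical state object: skip table plus the pattern itself. */
typedef struct {
    size_t      skip[UCHAR_MAX + 1];
    const char *pat;
    size_t      len;
} horspool_pattern;

/* Set-up: done once per pattern, reused for every search. */
void horspool_prepare(horspool_pattern *hp, const char *pat, size_t m)
{
    assert(m >= 1);
    /* Characters absent from the pattern allow a full-length shift. */
    for (size_t c = 0; c <= UCHAR_MAX; c++)
        hp->skip[c] = m;
    /* For every character except the last, record its distance
     * from the end of the pattern (rightmost occurrence wins). */
    for (size_t i = 0; i + 1 < m; i++)
        hp->skip[(unsigned char)pat[i]] = m - 1 - i;
    hp->pat = pat;
    hp->len = m;
}

/* Search text[0..n-1]; returns the first match or NULL. */
const char *horspool_search(const horspool_pattern *hp,
                            const char *text, size_t n)
{
    const size_t m = hp->len;
    size_t pos = 0;

    while (n >= m && pos <= n - m) {
        if (memcmp(text + pos, hp->pat, m) == 0)
            return text + pos;
        /* Shift by the skip value of the window's last character. */
        pos += hp->skip[(unsigned char)text[pos + m - 1]];
    }
    return NULL;
}
```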

    Here are a couple of code snippets that might be of interest. Intelligent SSE2 beats naive SSE4.2, and handles the null-termination problem. A BNDM implementation shows one way of keeping set-up costs down. If you're familiar with Horspool, you'll notice the similarity, except that BNDM uses bitmasks instead of skip-offsets. I'm about to post how to solve the null-terminator problem (efficiently) for suffix algorithms like Horspool and BNDM.
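
To make the bitmask idea concrete, here is a minimal BNDM sketch. It is not the snippet the answer refers to, only an illustration under stated assumptions: the names bndm_pattern, bndm_prepare and bndm_search are hypothetical, patterns are limited to 32 characters so the state fits in one uint32_t, and the text length is passed explicitly instead of relying on the terminator. Keeping the masks in a struct lets the set-up be paid once and reused across many searches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical state object: one bitmask per byte value. */
typedef struct {
    uint32_t mask[256];
    size_t   len;
} bndm_pattern;

/* Set-up: bit (m-1-i) of mask[c] is set when pat[i] == c. */
void bndm_prepare(bndm_pattern *bp, const char *pat, size_t m)
{
    assert(m >= 1 && m <= 32);  /* state must fit in 32 bits */
    memset(bp->mask, 0, sizeof bp->mask);
    for (size_t i = 0; i < m; i++)
        bp->mask[(unsigned char)pat[i]] |= (uint32_t)1 << (m - 1 - i);
    bp->len = m;
}

/* Search text[0..n-1]; returns the first match or NULL. */
const char *bndm_search(const bndm_pattern *bp, const char *text, size_t n)
{
    const size_t   m    = bp->len;
    const uint32_t high = (uint32_t)1 << (m - 1);
    const uint32_t all  = high | (high - 1);  /* the m low bits */
    size_t pos = 0;

    if (n < m)
        return NULL;

    while (pos <= n - m) {
        size_t   j = m, last = m;
        uint32_t d = all;

        /* Scan the window right to left while some pattern
         * substring still matches what has been read. */
        while (d != 0) {
            d &= bp->mask[(unsigned char)text[pos + j - 1]];
            j--;
            if (d & high) {
                if (j == 0)
                    return text + pos;  /* whole window matched */
                last = j;  /* a pattern prefix ends the window */
            }
            d = (d << 1) & all;
        }
        pos += last;  /* shift to the last recorded prefix */
    }
    return NULL;
}
```

The structure mirrors Horspool: a per-byte table built once, then a loop that shifts the window. The difference is that the table holds bitmasks and the shift comes from the longest pattern prefix seen while scanning the window.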

    A common attribute of all good solutions is splitting into different algorithms for different argument lengths. An example is Sanmayce's "Railgun" function.
