Question
I would like to find the first occurrence of an ANSI string in a binary file, using C++.
I know the string class has a handy find function, but I don't know how to use it if the file is big, say 5-10 MB.
Do I need to copy the whole file into a string in memory first? If yes, how can I be sure that none of the binary characters get corrupted while copying?
Or is there a more efficient way to do it, without the need for copying it into a string?
Answer 1:
Do I need to copy the whole file into a string in memory first?
No.
Or is there a more efficient way to do it, without the need for copying it into a string?
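Of course; open the file with a std::ifstream (be sure to open it in binary mode rather than text mode), wrap the stream in a pair of multi_pass iterators (from Boost.Spirit), then search for the string with std::search. The multi_pass adapter is needed because std::search requires forward iterators, while plain stream iterators are single-pass input iterators.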
Answer 2:
First of all, don't worry about corrupted characters. (But don't forget to open the file in binary mode either!) Now, suppose your search string is n characters long. Then you can search the whole file a block at a time, as long as you make sure to keep the last n-1 characters of each block to prepend to the next block. That way you won't lose matches that occur across block boundaries. So you can use that handy find function without having to read the whole file into memory at once.
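The block-at-a-time idea above can be sketched roughly as follows. This is only a minimal illustration, not a tuned implementation: the function name find_in_file, the default block size of 64 KiB, and the convention of returning -1 for "not found or file could not be opened" are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: returns the byte offset of the first occurrence of
// `needle` in the file at `path`, or -1 if it is absent or the file cannot
// be opened. Only about one block plus n-1 bytes are held in memory.
long long find_in_file(const std::string& path, const std::string& needle,
                       std::size_t block_size = 1 << 16) {
    assert(block_size > 0);
    if (needle.empty()) return 0;  // assumption: empty needle matches at 0

    std::ifstream in(path, std::ios::binary);  // binary mode: bytes are read unchanged
    if (!in) return -1;

    const std::size_t overlap = needle.size() - 1;  // the n-1 bytes to carry over
    std::string block(block_size, '\0');            // scratch space for each read
    std::string buffer;                             // carried tail + current block
    long long buffer_offset = 0;                    // file offset of buffer[0]

    while (true) {
        in.read(&block[0], static_cast<std::streamsize>(block.size()));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0) break;  // end of file, nothing more to search
        buffer.append(block, 0, got);

        // std::string::find works on lengths, so embedded '\0' bytes are fine.
        const std::size_t pos = buffer.find(needle);
        if (pos != std::string::npos)
            return buffer_offset + static_cast<long long>(pos);

        // Keep only the last n-1 bytes so a match that straddles the
        // block boundary is still found on the next iteration.
        if (buffer.size() > overlap) {
            const std::size_t drop = buffer.size() - overlap;
            buffer.erase(0, drop);
            buffer_offset += static_cast<long long>(drop);
        }
    }
    return -1;
}
```

A call such as find_in_file("data.bin", "MAGIC") would then return the offset of the first match; the file name and search string here are placeholders.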
Answer 3:
If you can mmap the file into memory, you can avoid the copy.
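As a rough sketch of the mmap approach, assuming a POSIX system (on Windows the equivalent would go through the file-mapping API instead): the function name find_in_mapped_file and the -1 return convention are made up for this example, and an empty file is simply treated as "not found".

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): maps the file read-only and searches the
// mapped bytes directly. Returns the offset of the first match, or -1 if the
// needle is absent, the file is empty, or any system call fails.
long long find_in_mapped_file(const char* path, const std::string& needle) {
    assert(path != nullptr);

    const int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);

    void* map = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (map == MAP_FAILED) return -1;

    // Search the mapped range in place; no copy into a std::string is made.
    const char* first = static_cast<const char*>(map);
    const char* last = first + size;
    const char* hit = std::search(first, last, needle.begin(), needle.end());
    const long long result =
        (hit == last) ? -1 : static_cast<long long>(hit - first);

    munmap(map, size);
    return result;
}
```

The operating system pages the file in on demand, so for a 5-10 MB file this avoids both the explicit read loop and the extra buffer.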
Source: https://stackoverflow.com/questions/6447819/how-to-look-for-an-ansi-string-in-a-binary-file