Question
I have a binary file I've loaded using an NSData object. Is there a way to locate a sequence of characters, 'abcd' for example, within that binary data and return the offset without converting the entire file to a string? Seems like it should be a simple answer, but I'm not sure how to do it. Any ideas?
I'm doing this on iOS 3 so I don't have -rangeOfData:options:range: available.
I'm going to award this one to Sixteen Otto for suggesting strstr. I went and found the source code for the C function strstr and rewrote it to work on a fixed length Byte array--which incidentally is different from a char array as it is not null terminated. Here is the code I ended up with:
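    - (Byte *)offsetOfBytes:(const Byte *)bytes inBuffer:(const Byte *)buffer ofLength:(int)len
    {
        // bytes:  the pattern to find; it must end with a zero byte
        // buffer: the data to search; len bytes long, may contain zero bytes
        if (!*bytes)
            return (Byte *)buffer;          // an empty pattern matches at the start

        const Byte *end = buffer + len;
        for (int i = 0; i < len; ++i)
        {
            const Byte *s1 = buffer + i;    // candidate position in buffer
            const Byte *s2 = bytes;         // current position in the pattern

            // advance while the bytes match, never reading past the buffer
            while (s1 < end && *s2 && *s1 == *s2)
                s1++, s2++;

            if (!*s2)                       // reached the end of the pattern
                return (Byte *)(buffer + i);
        }
        return NULL;
    }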
This returns a pointer to the first occurrence of bytes, the thing I'm looking for, in buffer, the byte array that should contain bytes.
I call it like this:
    // data is the NSData object
    const Byte *bytes = [data bytes];
    Byte* index = [self offsetOfBytes:tag inBuffer:bytes ofLength:[data length]];
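Because the method above stops at a zero byte, it cannot search for a pattern that itself contains zeros, and it returns a pointer rather than the offset the question asks for. A minimal sketch of a length-aware variant in plain C (which compiles unchanged in an Objective-C file) is shown here. The name find_bytes and its signature are hypothetical, not part of the original post:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): returns the offset of the
 * first occurrence of needle in haystack, or -1 if it does not occur.
 * Both lengths are explicit, so zero bytes are treated like any other byte. */
long find_bytes(const unsigned char *haystack, size_t hay_len,
                const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return 0;                 /* empty pattern matches at the start */
    if (needle_len > hay_len)
        return -1;                /* pattern cannot fit in the buffer */

    /* The last valid start position is hay_len - needle_len. */
    for (size_t i = 0; i <= hay_len - needle_len; i++) {
        if (memcmp(haystack + i, needle, needle_len) == 0)
            return (long)i;
    }
    return -1;
}
```

From Objective-C, the arguments would presumably come from [data bytes] and [data length], with the pattern passed as a byte array plus its length.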
Answer 1:
Convert your substring to an NSData object, and search for those bytes in the larger NSData using rangeOfData:options:range:. Make sure that the string encodings match!
On iPhone, where that isn't available, you may have to do this yourself. The C function strstr() will give you a pointer to the first occurrence of a pattern within the buffer (as long as neither contains null bytes!), but not the index. Here's a function that should do the job (but no promises, since I haven't tried actually running it...):
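    - (NSUInteger)indexOfData:(NSData *)needle inData:(NSData *)haystack
    {
        // use byte pointers; a void pointer cannot be indexed
        const unsigned char *needleBytes = [needle bytes];
        const unsigned char *haystackBytes = [haystack bytes];
        NSUInteger needleLength = [needle length];
        NSUInteger haystackLength = [haystack length];

        // treat an empty or too-long needle as not found; checking here also
        // keeps the unsigned subtraction below from wrapping around
        if (needleLength == 0 || needleLength > haystackLength)
        {
            return NSNotFound;
        }

        // walk the length of the buffer, looking for a byte that matches the start
        // of the pattern; we can skip (|needle|-1) bytes at the end, since we can't
        // have a match that's shorter than needle itself
        for (NSUInteger i = 0; i <= haystackLength - needleLength; i++)
        {
            // walk needle's bytes while they still match the bytes of haystack
            // starting at i; if we walk off the end of needle, we found a match
            NSUInteger j = 0;
            while (j < needleLength && needleBytes[j] == haystackBytes[i + j])
            {
                j++;
            }
            if (j == needleLength)
            {
                return i;
            }
        }
        return NSNotFound;
    }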
This runs in something like O(nm), where n is the buffer length, and m is the size of the substring. It's written to work with NSData for two reasons: 1) that's what you seem to have in hand, and 2) those objects already encapsulate both the actual bytes, and the length of the buffer.
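The caveat about strstr() and nulls can be illustrated with a small sketch in plain C. The helper strstr_finds and the two sample buffers are made up for this illustration. strstr() treats its first argument as a C string, so it never looks past the first zero byte:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether strstr() sees pattern in buf.
 * strstr() treats buf as a C string, so it stops at the first zero byte. */
int strstr_finds(const char *buf, const char *pattern)
{
    return strstr(buf, pattern) != NULL;
}

/* "abcd" is present in both buffers, but in the second one it sits
 * after an embedded zero byte. */
static const char text_like[]   = "xyabcd";
static const char binary_like[] = "xy\0abcd";
```

Here strstr_finds(text_like, "abcd") yields 1, while strstr_finds(binary_like, "abcd") yields 0 even though the bytes are there, which is why a search over binary data needs to carry an explicit length.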
Answer 2:
If you're using Snow Leopard, a convenient way is the new -rangeOfData:options:range: method in NSData that returns the range of the first occurrence of a piece of data. Otherwise, you can access the NSData's contents yourself using its -bytes method to perform your own search.
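One way such a hand-rolled search over the raw bytes might look is sketched here in plain C. It uses the standard memchr() to jump to each candidate first byte and memcmp() to check the rest. The name find_bytes_memchr is hypothetical, and the worst case is still proportional to the product of the two lengths:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: like a bounded strstr(), but uses memchr() to jump to
 * each candidate first byte before comparing the rest with memcmp().
 * Returns a pointer to the match, or NULL if there is none. */
const unsigned char *find_bytes_memchr(const unsigned char *haystack, size_t hay_len,
                                       const unsigned char *needle, size_t needle_len)
{
    if (needle_len == 0)
        return haystack;          /* empty pattern matches at the start */
    if (needle_len > hay_len)
        return NULL;              /* pattern cannot fit in the buffer */

    const unsigned char *p = haystack;
    /* One past the last position where the pattern can still fit. */
    const unsigned char *limit = haystack + (hay_len - needle_len) + 1;

    while (p < limit) {
        /* Find the next byte equal to needle[0] among the valid start positions. */
        p = memchr(p, needle[0], (size_t)(limit - p));
        if (p == NULL)
            return NULL;
        if (memcmp(p, needle, needle_len) == 0)
            return p;
        p++;
    }
    return NULL;
}
```

Subtracting the start of the buffer from the returned pointer gives the offset of the match.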
Answer 3:
I had the same problem. I solved it the other way round from the other suggestions: by converting the data to a string.
First, I convert the data (assume your NSData is stored in the variable rawFile) with:
    NSString *ascii = [[NSString alloc] initWithData:rawFile encoding:NSASCIIStringEncoding];
Now you can easily search for strings like 'abcd', or whatever you want, by passing the ascii string to an NSScanner. This may not be very efficient, but it works until -rangeOfData:options:range: becomes available on iPhone as well.
Source: https://stackoverflow.com/questions/1834787/find-character-string-in-binary-data