Elegant Algorithm for Parsing Data Stream Into Record

大兔子大兔子 submitted on 2019-12-04 19:04:12

That seems pretty much how I'd do it. The only thing I might do differently is write an NSData category that does the linear search of DATA: for me, just to save the overhead of converting it to a string. It wouldn't be that hard to do, either. Something like:

#import <Foundation/Foundation.h>

@interface NSData (Search)

- (NSRange) rangeOfData:(NSData *)aData;

@end

@implementation NSData (Search)

- (NSRange) rangeOfData:(NSData *)aData {
  // Use byte pointers: a void * cannot be indexed or dereferenced.
  const unsigned char * bytes = [self bytes];
  NSUInteger length = [self length];

  const unsigned char * searchBytes = [aData bytes];
  NSUInteger searchLength = [aData length];

  NSRange notFound = {NSNotFound, 0};
  if (searchLength == 0 || searchLength > length) { return notFound; }

  // Try every possible starting offset, so that a failed partial match
  // (e.g. "DDATA:") never causes a valid start to be skipped.
  for (NSUInteger index = 0; index <= length - searchLength; index++) {
    if (memcmp(bytes + index, searchBytes, searchLength) == 0) {
      //all bytes match starting at index
      return NSMakeRange(index, searchLength);
    }
  }
  return notFound;
}

@end

Then you can just use:

NSData * searchData = [@"DATA:" dataUsingEncoding:NSUTF8StringEncoding];
while(receivingData) {
  if ([receivedData rangeOfData:searchData].location != NSNotFound) {
    //WOOO!
  }
}

(warning: typed in a browser)

This is a classic finite state machine problem. A lot of data protocols that are stream based can be described with a finite state machine.

Basically you have a set of states and the transitions between them. Boost has a finite state machine library, but it could be overkill. You can implement it as a switch.

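while (stream.hasData) {
  char nextInput = stream.get();
  switch (currentState) {
    case D:  // a 'D' has been seen; an 'A' must come next
      if (nextInput == 'A')
        currentState = A;
      else
        currentState = START;  //die: START is an assumed initial state (nothing matched yet)
      break;  // without this, control falls through into case A
    case A:
      //Same for A
      break;
  }
}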
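To make the switch idea concrete, here is a minimal sketch in plain C, assuming the marker is the literal DATA:. The state names (ST_START through ST_FOUND) and the functions step and find_marker_end are hypothetical names chosen for illustration; they do not come from any library.

```c
#include <stddef.h>

/* Hypothetical states: each one records how much of "DATA:" has been seen. */
typedef enum { ST_START, ST_D, ST_DA, ST_DAT, ST_DATA, ST_FOUND } State;

/* On a mismatch, the offending byte may itself begin a new marker. */
static State restart_on(char c) { return c == 'D' ? ST_D : ST_START; }

/* Advance the machine by exactly one input byte. */
static State step(State s, char c) {
    switch (s) {
    case ST_START: return restart_on(c);
    case ST_D:     return c == 'A' ? ST_DA    : restart_on(c);
    case ST_DA:    return c == 'T' ? ST_DAT   : restart_on(c);
    case ST_DAT:   return c == 'A' ? ST_DATA  : restart_on(c);
    case ST_DATA:  return c == ':' ? ST_FOUND : restart_on(c);
    case ST_FOUND: return ST_FOUND;  /* stay accepted until the caller resets */
    }
    return ST_START;
}

/* Feed one chunk through the machine. Returns the offset just past the
 * marker within this chunk, or -1 if the marker has not completed yet.
 * The state lives in *s, so a marker split across chunks is still found. */
static long find_marker_end(State *s, const char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        *s = step(*s, buf[i]);
        if (*s == ST_FOUND) return (long)(i + 1);
    }
    return -1;
}
```

Because the state is kept between calls, feeding "xxDA" and then "TA:yy" reports the marker in the second chunk, which is the main advantage over re-scanning the whole buffer every time new bytes arrive.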

Requested elaboration:
Basically, look at the diagram below: it's a finite state machine. At any given time the machine is in exactly one state. Every time a character is fed into the state machine, a transition is taken and the current state moves (possibly back into the same state). So all you have to do is model your networked data as a finite state machine and then implement that machine. There are libraries that lay it out for you, so that all you have to do is implement exactly what happens on each transition. For you, that probably means interpreting or saving the byte of data. The interpretation depends on which transition is taken, and the transition depends on the current state and the current input. Here is an example FSM.

[Image: example FSM diagram] http://www.freeimagehosting.net/uploads/b1706f2a8d.png
Note that if the characters DATA: are entered, the state moves to the last circle. Any other sequence keeps the state in one of the first 5 states (the top row). You can also have splits, so the FSM can make decisions: if you get a sequence like DATA2:, you can branch off into a DATA2: part of the machine and interpret the input differently there.
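The branching described above could look roughly like the following C sketch. It assumes two markers, DATA: and DATA2:, that share the prefix DATA; the state names, branch_step, and classify are hypothetical and only meant to show where the split happens.

```c
#include <stddef.h>

typedef enum {
    S_START, S_D, S_DA, S_DAT, S_DATA,  /* shared prefix "DATA" */
    S_DATA2,                            /* branch taken after "DATA2" */
    S_GOT_DATA,                         /* accepted "DATA:" */
    S_GOT_DATA2                         /* accepted "DATA2:" */
} BranchState;

/* On a mismatch, the offending byte may itself begin a new marker. */
static BranchState restart(char c) { return c == 'D' ? S_D : S_START; }

/* One transition: the next state depends on the current state and the input. */
static BranchState branch_step(BranchState s, char c) {
    switch (s) {
    case S_START: return restart(c);
    case S_D:     return c == 'A' ? S_DA   : restart(c);
    case S_DA:    return c == 'T' ? S_DAT  : restart(c);
    case S_DAT:   return c == 'A' ? S_DATA : restart(c);
    case S_DATA:  /* the split: ':' and '2' lead to different branches */
        if (c == ':') return S_GOT_DATA;
        if (c == '2') return S_DATA2;
        return restart(c);
    case S_DATA2: return c == ':' ? S_GOT_DATA2 : restart(c);
    case S_GOT_DATA:
    case S_GOT_DATA2:
        return restart(c);  /* after accepting, look for the next marker */
    }
    return S_START;
}

/* Returns 1 if "DATA:" completes first, 2 if "DATA2:" does, 0 if neither. */
static int classify(const char *buf, size_t len) {
    BranchState s = S_START;
    for (size_t i = 0; i < len; i++) {
        s = branch_step(s, buf[i]);
        if (s == S_GOT_DATA)  return 1;
        if (s == S_GOT_DATA2) return 2;
    }
    return 0;
}
```

In a real parser, the two accepting states would typically hand off to different sub-machines rather than just return a code; the sketch only shows the decision point.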
