Problem
I have a data stream that sends me data with an ever-increasing index (n++). Some of that data may arrive out of order, be lost, or otherwise need to be retransmitted.
Example
Assume I have a security log file that is monitored by my app. It is possible for a bad guy to suppress or prevent transmission of a few entries. I want to be alerted to this fact.
Also assume that this data may be sent to the log recorder out of order.
This logic seems to be everywhere, and I don't want to reinvent the wheel or end up with something less than efficient.
Question
How should I implement (or what reference implementation exists for) tracking data that is received out of order and may have entries missing from the sequence?
(Also your assistance in tagging this question is appreciated)
Answer 1:
Okay, well, I did this using a linked list. There has to be prior work for this somewhere. Either way, this is optimized for an input series that is more or less increasing in nature.
Let me know if you see any bugs or enhancements I can make.
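using System.Collections.Generic;

// One contiguous run of received values: LowerInt..UpperInt, inclusive.
public class ContiguousDataValue
{
    public int UpperInt { get; set; }
    public int LowerInt { get; set; }

    public override string ToString()
    {
        return "Upper" + UpperInt + " Lower" + LowerInt;
    }
}

public class ContiguousData
{
    // Ranges are kept sorted ascending and never touch each other, so every
    // space between two neighbouring nodes is a run of missing values.
    LinkedList<ContiguousDataValue> ranges = new LinkedList<ContiguousDataValue>();

    public void AddValue(int val)
    {
        // Scan from the tail: the input is expected to be mostly increasing.
        for (LinkedListNode<ContiguousDataValue> range = ranges.Last; range != null; range = range.Previous)
        {
            if (val > range.Value.UpperInt)
            {
                // val lies between this range and the next one (if any).
                LinkedListNode<ContiguousDataValue> next = range.Next;
                if (val == range.Value.UpperInt + 1)
                {
                    // Extend this range upward; merge if it now touches the next range.
                    range.Value.UpperInt = val;
                    if (next != null && next.Value.LowerInt == val + 1)
                    {
                        range.Value.UpperInt = next.Value.UpperInt;
                        ranges.Remove(next);
                    }
                }
                else if (next != null && next.Value.LowerInt == val + 1)
                {
                    // Extend the following range downward.
                    next.Value.LowerInt = val;
                }
                else
                {
                    ranges.AddAfter(range, new ContiguousDataValue() { UpperInt = val, LowerInt = val });
                }
                return;
            }
            if (val >= range.Value.LowerInt)
                return; // duplicate: val is already inside this range
            // val is below this range, so keep walking towards the head.
        }
        // The list is empty, or val is below every known range.
        LinkedListNode<ContiguousDataValue> first = ranges.First;
        if (first != null && first.Value.LowerInt == val + 1)
            first.Value.LowerInt = val;
        else
            ranges.AddFirst(new ContiguousDataValue() { UpperInt = val, LowerInt = val });
    }
}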
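To turn a list of received ranges into the alert the question asks for, the missing entries can be read off as the spaces between neighbouring ranges. The following is a minimal sketch, not part of the answer's code: `GapFinder` and `FindGaps` are hypothetical names, and it assumes the ranges are supplied sorted in ascending order and do not overlap.

```csharp
using System;
using System.Collections.Generic;

public static class GapFinder
{
    // Hypothetical helper: yields each run of missing values that lies
    // between two neighbouring ranges.
    // Assumption: 'ranges' is sorted ascending and the ranges do not overlap.
    public static IEnumerable<(int Lower, int Upper)> FindGaps(
        IEnumerable<(int Lower, int Upper)> ranges)
    {
        bool first = true;
        int previousUpper = 0;
        foreach (var range in ranges)
        {
            // A gap exists when this range starts more than one past the previous end.
            if (!first && range.Lower > previousUpper + 1)
                yield return (previousUpper + 1, range.Lower - 1);
            previousUpper = range.Upper;
            first = false;
        }
    }
}
```

For example, the ranges 1..4, 6..6 and 9..12 yield the gaps 5..5 and 7..8. Values below the first range or above the last one are not reported, because the ranges alone cannot say where the sequence was supposed to start or end.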
Answer 2:
You mentioned the canonical implementation in your original question: TCP. Sending your data over TCP has a few welcome consequences:
- Whenever (if ever) data arrives out of sequence, you can safely assume that either your sending or your receiving process is misbehaving.
- Whenever data is missing from the sequence, you can assume the same.
- The same is true for acknowledgement, so your sending process has a perfect last-known-good at all times.
I strongly advise simply using TCP as your transport and, if that is not directly feasible, encapsulating TCP segments in your other network stream.
In short: Make the canonical implementation your implementation.
Answer 3:
First of all, if you have a potential race condition, you should fix it.
TCP overcomes the problem of out-of-order data by waiting. If packet 6 arrives after packet 4, TCP will wait until packet 5 arrives. If packet 5 doesn't arrive within a certain period of time, TCP will ask for a retransmission of packet 5, which results in packet 5 being resent.
(Note: I know TCP counts bytes and not packets; that is irrelevant here.)
If you can ask your 'dumb embedded device' to retransmit, you can use the same technique. I'm betting you can't do that, so you'll need to resort to another mechanism. It can be similar to what TCP does. You just need to decide how long you're going to wait until you decide an entry is missing.
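The wait-then-give-up idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: `GapWatcher`, `Receive` and `Expire` are hypothetical names, indexes are assumed to increase by exactly one per entry, and the caller passes in the current time so the timeout logic stays deterministic.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: buffers entries that arrive ahead of a gap and declares
// the skipped indexes missing once the gap has stayed open for 'timeout'.
public class GapWatcher
{
    private readonly TimeSpan timeout;
    private readonly SortedSet<int> early = new SortedSet<int>(); // arrived ahead of a gap
    private int nextExpected;
    private DateTime? gapSince; // when the gap currently blocking progress was first noticed

    public GapWatcher(int firstIndex, TimeSpan timeout)
    {
        nextExpected = firstIndex;
        this.timeout = timeout;
    }

    // Record an arrival at time 'now'.
    public void Receive(int index, DateTime now)
    {
        if (index < nextExpected || !early.Add(index))
            return; // duplicate, or an index already consumed or given up on
        Advance(now);
    }

    // Call periodically. Returns the indexes declared missing by this call.
    public List<int> Expire(DateTime now)
    {
        var missing = new List<int>();
        while (early.Count > 0 && gapSince.HasValue && now - gapSince.Value >= timeout)
        {
            // Give up on everything below the earliest buffered entry.
            int resume = early.Min;
            for (int i = nextExpected; i < resume; i++)
                missing.Add(i);
            nextExpected = resume;
            Advance(now);
        }
        return missing;
    }

    // Consume buffered entries that are now in order, then clear or (re)start the gap timer.
    private void Advance(DateTime now)
    {
        bool moved = false;
        while (early.Remove(nextExpected))
        {
            nextExpected++;
            moved = true;
        }
        if (early.Count == 0)
            gapSince = null;      // nothing is waiting behind a gap
        else if (gapSince == null || moved)
            gapSince = now;       // a new gap is now blocking progress
    }
}
```

With a five-second timeout, receiving 1, 2 and then 4 opens a gap at 3; if 3 shows up within five seconds nothing is reported, otherwise the next `Expire` call returns it. Note that a gap is only noticed once a later entry has arrived, so entries lost at the very end of the stream would need some other signal, such as a periodic heartbeat from the sender.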
Source: https://stackoverflow.com/questions/10565487/technique-to-track-and-detect-missing-data-in-a-series-e-g-security-log-data