I would take the:
- worst-case latency as being T8 - T1, also known as the elapsed time
- processing time as T6 - T3, also known as the response time, since you can start processing from the first byte and still be processing up to the last byte.
If you can't start processing the message on the server until you get the last byte, you have to use the last byte for the latency as well; otherwise it's inconsistent.
I would assume the server is more highly tuned for performance than the client, i.e. it might start processing from the first packet, but the client might need the whole message to do anything useful (this depends on the client).
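As a minimal sketch of how you might compute those two figures, assuming T1 is when the client starts sending, T3 is when the server receives the first byte, T6 is when the server finishes sending the reply, and T8 is when the client receives the last byte (the timestamp meanings and example values here are just illustrative, captured with `System.nanoTime()`):

```java
public class LatencyMetrics {
    public static void main(String[] args) {
        long t1 = System.nanoTime();        // client starts sending the request
        long t3 = t1 + 200_000;             // server receives the first byte (made-up value)
        long t6 = t3 + 1_500_000;           // server finishes sending the reply (made-up value)
        long t8 = t6 + 250_000;             // client receives the last byte (made-up value)

        long worstCaseLatencyNs = t8 - t1;  // elapsed time as seen by the client
        long processingTimeNs   = t6 - t3;  // response time as seen by the server

        System.out.printf("worst-case latency: %,d ns%n", worstCaseLatencyNs);
        System.out.printf("processing time:    %,d ns%n", processingTimeNs);
    }
}
```

In a real measurement each timestamp would be recorded at the point it occurs (on the client or server respectively), and you would only subtract timestamps taken on the same machine unless the clocks are synchronised.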