Writing a Wireshark dissector to count number of TCP flows

对着背影说爱祢 submitted on 2019-12-13 02:13:50

Question


I have a very large tcpdump file that I split into 1-minute intervals. Using tshark in a loop, I can extract TCP statistics for each 1-minute file and save the results as a CSV file for further analysis in Excel. Now I want to count the number of TCP flows in each of the 1-minute files and save those counts in a CSV file. A TCP flow here means a group of packets going from a specific source to a specific destination. Each flow has statistics such as source IP, destination IP, #packets from A->B, #bytes from A->B, #packets from B->A, #bytes from B->A, total packets, total bytes, etc., but I only want the number of flows per file. From what I've read so far, it seems I need to create a dissector to do that. Can anyone give me pointers or code on how to get started? Thanks.


Answer 1:


TShark has a command that dumps all of the necessary information: tshark -qz conv,tcp -r FILE. It prints one line per flow, preceded by header lines and followed by a footer, so to count the flows, count the lines and subtract the header and footer lines.
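A minimal sketch of that counting step in Python. It assumes that every conversation row in the report contains the "<->" separator between its two endpoints and that no header or footer line does, which is worth checking against the output of your tshark version. The function names and CSV column names are hypothetical; only the tshark options come from the command above.

```python
import csv
import subprocess


def count_flows(report: str) -> int:
    """Count conversation rows in the text printed by `tshark -qz conv,tcp`.

    Assumption: each conversation row contains "<->" between the two
    endpoints, and no header or footer line does.
    """
    return sum(1 for line in report.splitlines() if "<->" in line)


def conversation_report(pcap_path: str) -> str:
    """Run the tshark command from the answer and return its standard output."""
    result = subprocess.run(
        ["tshark", "-qz", "conv,tcp", "-r", pcap_path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def write_counts(pcap_paths, out) -> None:
    """Write one CSV row (file name, flow count) per capture file."""
    writer = csv.writer(out)
    writer.writerow(["file", "tcp_flows"])  # hypothetical column names
    for path in pcap_paths:
        writer.writerow([path, count_flows(conversation_report(path))])
```

Calling write_counts with the list of 1-minute capture files and an open output file would produce the per-file CSV the question asks for. Matching on "<->" avoids hard-coding how many header lines to subtract.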




Answer 2:


What you need is not a dissector but a tap. See the Wireshark README.tapping document and, for an example in C (sadly, not at all a simple one), the TShark iousers tap.

It's also possible to write taps in Lua; see, for example, the Lua/Taps page in the Wireshark Wiki and the Lua Support in Wireshark section of the Wireshark User's Manual.

The C structure passed to TCP taps for each packet is:

/* the tcp header structure, passed to tap listeners */
typedef struct tcpheader {
        guint32 th_seq;
        guint32 th_ack;
        gboolean th_have_seglen;        /* TRUE if th_seglen is valid */
        guint32 th_seglen;
        guint32 th_win;   /* make it 32 bits so we can handle some scaling */
        guint16 th_sport;
        guint16 th_dport;
        guint8  th_hlen;
        guint16 th_flags;
        guint32 th_stream; /* this stream index field is included to help differentiate when address/port pairs are reused */
        address ip_src;
        address ip_dst;

        /* This is the absolute maximum we could find in TCP options (RFC2018, section 3) */
        #define MAX_TCP_SACK_RANGES 4
        guint8  num_sack_ranges;
        guint32 sack_left_edge[MAX_TCP_SACK_RANGES];
        guint32 sack_right_edge[MAX_TCP_SACK_RANGES];
} tcp_info_t;

So, for C-language taps, the "data" argument to the tap listener's "packet" routine points to a structure of that sort.

For Lua taps, the "tapinfo" table passed as the third argument to the tap listener's "packet" routine is described as "a table of info based on the Listener's type, or nil". For a TCP tap, the entries in the table include all the fields in that structure except for sack_left_edge and sack_right_edge; the keys in the table are the structure member names.

The th_stream field identifies the connection; each time the TCP dissector finds a new connection, it assigns a new value. As the comment indicates, "this stream index field is included to help differentiate when address/port pairs are reused", so that if a given connection is closed, and a later connection uses the same endpoints, the two connections have different th_stream values even though they have the same endpoints.
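To illustrate why the stream index matters, here is a small Python model, not Wireshark code. The Packet tuple is a hypothetical stand-in that borrows a few member names from the structure above, and the addresses and ports are made up.

```python
from typing import NamedTuple


class Packet(NamedTuple):
    """Illustrative stand-in for the few tap fields used here."""
    th_stream: int
    ip_src: str
    th_sport: int
    ip_dst: str
    th_dport: int


def endpoint_key(pkt: Packet) -> frozenset:
    """Direction-independent key built only from the two endpoints."""
    return frozenset({(pkt.ip_src, pkt.th_sport), (pkt.ip_dst, pkt.th_dport)})


packets = [
    Packet(0, "10.0.0.1", 40000, "10.0.0.2", 80),  # first connection, A -> B
    Packet(0, "10.0.0.2", 80, "10.0.0.1", 40000),  # first connection, B -> A
    Packet(1, "10.0.0.1", 40000, "10.0.0.2", 80),  # later connection, same endpoints
]

flows_by_endpoints = len({endpoint_key(p) for p in packets})
flows_by_stream = len({p.th_stream for p in packets})
print(flows_by_endpoints, flows_by_stream)  # prints "1 2"
```

Keying on the endpoints alone merges the two connections into one flow, while keying on th_stream keeps them apart.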

So you'd have a table using the th_stream value as a key. The table would store the endpoints (addresses and ports) and counts of packets and bytes in each direction. For each packet passed to the listener's "packet" routine, you'd look up the th_stream value in the table and, if you don't find it, you'd create a new entry, starting the counts off at zero, and use that new entry; otherwise, you'd use the entry you found. You'd then figure out whether the packet was going from A to B or B to A, and increase the appropriate packet count and byte count.
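The table logic described above might be sketched as follows in Python. This is a model of the bookkeeping, not a Wireshark tap. The names FlowStats and record_packet are hypothetical, endpoint A is assumed to be the sender of the first packet seen on a stream, and nbytes is whatever byte count the caller chooses to track.

```python
from dataclasses import dataclass


@dataclass
class FlowStats:
    """Per-connection counters; A is the sender of the first packet seen."""
    a: tuple  # (address, port) of endpoint A
    b: tuple  # (address, port) of endpoint B
    pkts_a_to_b: int = 0
    bytes_a_to_b: int = 0
    pkts_b_to_a: int = 0
    bytes_b_to_a: int = 0


def record_packet(table: dict, stream: int, src: tuple, dst: tuple, nbytes: int) -> None:
    """Find the stream's entry, creating it with zero counts if it is missing,
    then increase the counters for the packet's direction."""
    flow = table.get(stream)
    if flow is None:
        flow = table[stream] = FlowStats(a=src, b=dst)
    if src == flow.a:
        flow.pkts_a_to_b += 1
        flow.bytes_a_to_b += nbytes
    else:
        flow.pkts_b_to_a += 1
        flow.bytes_b_to_a += nbytes


table = {}
a, b = ("10.0.0.1", 40000), ("10.0.0.2", 80)  # made-up endpoints
record_packet(table, 0, a, b, 60)
record_packet(table, 0, b, a, 1500)
record_packet(table, 0, a, b, 52)
record_packet(table, 1, b, a, 60)
print(len(table))  # number of flows seen so far
```

The number of flows is simply the number of entries in the table, so the count the question asks for needs no extra bookkeeping.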

You'd also keep track of the time stamp. For the first packet, you'd store the time stamp for that packet. For each packet, you'd look at the time stamp and, if it's one minute or more later than the stored time stamp, you'd:

  • dump out the statistics from the table of connections;
  • empty out the table of connections;
  • store the new packet's time stamp, replacing the previous stored time stamp.
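The three steps above can be sketched in Python as a small class. The class and method names are hypothetical, a set of stream indexes stands in for the full table of connections, and the finish method is an addition not described above, included as an assumption so that the last, partially filled interval is also reported.

```python
class MinuteWindow:
    """Collect stream indexes and record a flow count each time a window elapses."""

    def __init__(self, window_seconds: float = 60.0):
        self.window_seconds = window_seconds
        self.start = None     # stored time stamp of the current window
        self.streams = set()  # stands in for the table of connections
        self.reports = []     # (window start, flow count) pairs dumped so far

    def on_packet(self, timestamp: float, stream: int) -> None:
        if self.start is None:
            self.start = timestamp  # first packet: store its time stamp
        elif timestamp - self.start >= self.window_seconds:
            self.reports.append((self.start, len(self.streams)))  # dump statistics
            self.streams.clear()    # empty the table of connections
            self.start = timestamp  # store the new packet's time stamp
        self.streams.add(stream)

    def finish(self) -> None:
        """Dump the last, partially filled window at the end of the capture."""
        if self.start is not None:
            self.reports.append((self.start, len(self.streams)))
            self.streams.clear()
            self.start = None


window = MinuteWindow()
for ts, stream in [(0.0, 0), (10.0, 1), (59.9, 0), (60.0, 2), (61.0, 0), (125.0, 3)]:
    window.on_packet(ts, stream)
window.finish()
print(window.reports)
```

Because the table is emptied at each boundary, a connection that is still active after the boundary is counted again in the next interval, as stream 0 is in this example.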


Source: https://stackoverflow.com/questions/23877516/writing-a-wireshark-dissector-to-count-number-of-tcp-flows
