Question
Wireshark is a powerful tool for network traffic analysis. But in my experience, it can only export the processed data (that is, the dissected output that tells you which part is what, e.g. "data":123456 and so on) to a .pcap file. I would like to output the 'data' segment of every TCP packet in real time (or near real time) to another application, such as my Python script, for further use (maybe via TCP forwarding? a pipe?).
I don't know exactly how to get this done. Would anyone be willing to help me with this? Thank you~
PS: I did not include any screenshots because I have nothing to show yet, not even code...
Answer 1:
tl;dr: Pipe tshark output in any format (-T) into your Python program and parse it there.
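As a minimal sketch of that idea for the 'data' segment asked about in the question: assuming tshark is run with -l (flush output after each packet) and -T fields -e tcp.payload, and that the installed tshark version exposes the tcp.payload field, each output line carries one packet's TCP payload as hex text. The file name your_program.py and the handle() callback are placeholders, not part of tshark. The hex may or may not be colon-separated depending on the tshark version, so the sketch accepts both.

```python
import sys


def decode_payload(line):
    """Convert one line of `-T fields -e tcp.payload` output into bytes.

    Accepts plain hex ("48656c") and colon-separated hex ("48:65:6c").
    Returns b"" for packets without a TCP payload (for example bare ACKs).
    """
    hex_text = line.strip().replace(":", "")
    if not hex_text:
        return b""
    return bytes.fromhex(hex_text)


def handle(payload):
    """Placeholder for your own processing; here it only prints a summary."""
    print(len(payload), payload[:16])


def main(stream):
    # Iterating over the stream yields each line as soon as tshark emits it.
    for line in stream:
        payload = decode_payload(line)
        if payload:
            handle(payload)


if __name__ == "__main__":
    main(sys.stdin)
```

It could then be fed like this:
$ tshark -l -i interface -T fields -e tcp.payload | python your_program.py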
I am currently working on a project called pdml2flow which might be of help to you as well. The project relies on the pdml output (XML) from tshark, which is piped into pdml2flow:
$ tshark -i interface -Tpdml | pdml2flow +json
I chose pdml because it was the most complete and stable when I started. But these days many output formats such as json or postscript are also possible. From tshark(1):
-T ek|fields|json|jsonraw|pdml|ps|psml|tabs|text
Set the format of the output when viewing decoded packet data. The options are one of:
ek: Newline delimited JSON format for bulk import into Elasticsearch. It can be used with -j or -J including the JSON filter or with -x to include raw hex-encoded packet data. If -P is specified it will print the packet summary only, with both -P and -V it will print the packet summary and packet details. If neither -P or -V are used it will print the packet details only. Example of usage to import data into Elasticsearch:
$ tshark -T ek -j "http tcp ip" -P -V -x -r file.pcap > file.json
$ curl -H "Content-Type: application/x-ndjson" -XPOST http://elasticsearch:9200/_bulk --data-binary "@file.json"
Elastic requires a mapping file to be loaded as template for packets-* index in order to convert wireshark types to elastic types. This file can be auto-generated with the command tshark -G elastic-mapping. Since the mapping file can be huge, protocols can be selected by using the option --elastic-mapping-filter:
$ tshark -G elastic-mapping --elastic-mapping-filter ip,udp,dns
fields: The values of fields specified with the -e option, in a form specified by the -E option. For example, tshark -T fields -E separator=, -E quote=d would generate comma-separated values (CSV) output suitable for importing into your favorite spreadsheet program.
json: JSON file format. It can be used with -j or -J including the JSON filter or with -x option to include raw hex-encoded packet data. Example of usage:
$ tshark -T json -r file.pcap
$ tshark -T json -j "http tcp ip" -x -r file.pcap
jsonraw: JSON file format including only raw hex-encoded packet data. It can be used with -j or -J including the JSON filter option. Example of usage:
$ tshark -T jsonraw -r file.pcap
$ tshark -T jsonraw -j "http tcp ip" -x -r file.pcap
pdml: Packet Details Markup Language, an XML-based format for the details of a decoded packet. This information is equivalent to the packet details printed with the -V option. Using the --color option will add color attributes to pdml output. These attributes are nonstandard.
ps: PostScript for a human-readable one-line summary of each of the packets, or a multi-line view of the details of each of the packets, depending on whether the -V option was specified.
psml: Packet Summary Markup Language, an XML-based format for the summary information of a decoded packet. This information is equivalent to the information shown in the one-line summary printed by default. Using the --color option will add color attributes to pdml output. These attributes are nonstandard.
tabs: Similar to the default text report except the human-readable one-line summary of each packet will include an ASCII horizontal tab (0x09) character as a delimiter between each column.
text: Text of a human-readable one-line summary of each of the packets, or a multi-line view of the details of each of the packets, depending on whether the -V option was specified. This is the default.
This means nothing stops you from writing your own parser for any of those output formats:
$ tshark -i interface -Tjson | python your_program.py
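What such a parser might look like is sketched below, with one substitution: it reads the newline-delimited ek format rather than -Tjson, because -Tjson writes the whole capture as a single JSON array, which is awkward to parse while packets are still arriving. The sketch assumes that each packet arrives as one JSON object with a "layers" key and that other lines (such as Elasticsearch bulk index lines) can be skipped; the exact keys inside "layers" depend on the tshark version, so it only lists them. The function name iter_packets is illustrative.

```python
import json
import sys


def iter_packets(lines):
    """Yield one dict per packet from newline-delimited `tshark -T ek` output."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        # Assumption: packet records carry a "layers" object; bulk index
        # lines and anything else are ignored.
        if isinstance(record, dict) and "layers" in record:
            yield record


if __name__ == "__main__":
    for packet in iter_packets(sys.stdin):
        # Print the protocol layers found in each packet.
        print(sorted(packet["layers"]))
```

A possible invocation, again with -l so that tshark flushes after every packet:
$ tshark -l -i interface -T ek | python your_program.py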
For convenience, pdml2flow already parses pdml into a nested Python dict and passes it to your code, which you implement as a plugin. In such a plugin you have full access to each frame and flow and are free to do whatever you wish.
Example plugins:
- Detect and extract base64 strings
- Write frames to Elasticsearch
The following screencast demonstrates how to create and run a new plugin in seconds:
pdml2flow implements all the building blocks to get you started quickly with processing frames in Python. I hope this helps, and I appreciate any feedback. Thank you.
Answer 2:
Consider using named pipes as a buffer for interprocess communication.
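A rough sketch of the reading side, assuming a POSIX system (os.mkfifo is not available on Windows) and a hypothetical FIFO path such as /tmp/tshark_fifo; the function name read_fifo_lines is illustrative as well. The writer can be any process that redirects its output into the FIFO.

```python
import os
import stat


def read_fifo_lines(path):
    """Create the named pipe if needed, then yield each line written to it."""
    if not os.path.exists(path):
        os.mkfifo(path)
    elif not stat.S_ISFIFO(os.stat(path).st_mode):
        raise ValueError(f"{path} exists and is not a named pipe")
    # open() blocks until some writer opens the other end of the pipe;
    # the loop ends when the writer closes it.
    with open(path, "r") as fifo:
        for line in fifo:
            yield line.rstrip("\n")
```

With a loop such as for line in read_fifo_lines("/tmp/tshark_fifo"): print(line) running in one terminal, tshark could write into the pipe from another:
$ tshark -l -i interface -T fields -e tcp.payload > /tmp/tshark_fifo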
Source: https://stackoverflow.com/questions/55789013/how-to-forward-wireshark-processed-data-to-python-in-what-kind-of-method