What direction should I go in (libraries, documents)?
UPDATE
Can someone illustrate how to use winpcap to do the job?
UPDATE 2
How do I verify whether a packet is an HTTP one?
If by "hijack" you mean sniffing the packets, this is how to do it with WinPcap:
Find the device you want to use - See WinPcap tutorial.
Open a device using pcap_open:

    // Open the device
    char errorBuffer[PCAP_ERRBUF_SIZE];
    pcap_t *pcapDescriptor = pcap_open(source,         // name of the device
                                       snapshotLength, // portion of the packet to capture
                                                       // 65536 guarantees that the whole packet will be captured on all the link layers
                                       attributes,     // 0 for no flags, 1 for promiscuous
                                       readTimeout,    // read timeout
                                       NULL,           // authentication on the remote machine
                                       errorBuffer);   // error buffer
Use a function that reads packets from the descriptor, like pcap_loop:

    int result = pcap_loop(pcapDescriptor, count, functionPointer, NULL);
This loops until an error occurs or the loop is broken by calling pcap_breakloop. It calls functionPointer for each packet.
In the function it points to, implement the packet parsing. Its signature should match pcap_handler:

    typedef void (*pcap_handler)(u_char *, const struct pcap_pkthdr *, const u_char *);
Now all that is left is to parse the packets. Each packet's buffer is passed in the const u_char* argument, and its captured length is in the caplen field of the pcap_pkthdr structure.
Assuming you have HTTP GET over TCP over IPv4 over Ethernet packets, you can:
- Skip 14 bytes of the Ethernet header.
- Skip 20 bytes of the IPv4 header. This assumes there are no IPv4 options; if options are possible, read the low 4 bits of the first byte of the IPv4 header (the header length field) and multiply by 4 to get the header size in bytes.
- Skip 20 bytes of the TCP header. This assumes there are no TCP options; if options are possible, read bits 96-99 of the TCP header (the high 4 bits of byte 12, the data offset field) and multiply by 4 to get the header size in bytes.
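The three skips above can be sketched as a single helper that works on the raw captured bytes. This is only an illustration under the same assumption (Ethernet, then IPv4, then TCP); the name tcp_payload_offset is hypothetical and not part of WinPcap, and the function does not verify the protocols, only the lengths.

```c
#include <stddef.h>

#define ETHERNET_HEADER_LEN 14

/* Returns the offset of the TCP payload inside an Ethernet/IPv4/TCP frame,
 * or -1 if the captured bytes are too short or a header length is invalid.
 * Assumes the frame really is Ethernet + IPv4 + TCP (not checked here). */
long tcp_payload_offset(const unsigned char *frame, size_t caplen)
{
    size_t ip_start = ETHERNET_HEADER_LEN;
    size_t ip_len, tcp_start, tcp_len;

    if (caplen < ip_start + 20)
        return -1;

    /* Header length: low 4 bits of the first IPv4 byte, in 32-bit words. */
    ip_len = (size_t)(frame[ip_start] & 0x0f) * 4;
    if (ip_len < 20)
        return -1;

    tcp_start = ip_start + ip_len;
    if (caplen < tcp_start + 20)
        return -1;

    /* Data offset: high 4 bits of TCP byte 12, in 32-bit words. */
    tcp_len = (size_t)(frame[tcp_start + 12] >> 4) * 4;
    if (tcp_len < 20 || caplen < tcp_start + tcp_len)
        return -1;

    return (long)(tcp_start + tcp_len);
}
```

With no IPv4 or TCP options this returns 54 (14 + 20 + 20), which matches the fixed skips in the list.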
The rest of the packet should be the HTTP text. The text between the first and second space should be the URI. If it's too long you might need to do some TCP reconstruction, but most URIs are small enough to fit in one packet.
UPDATE: In code this would look like this (I wrote it without testing it):

    int tcp_len, url_length;
    u_char *url, *end_url, *final_url, *tcp_payload;
    /* header and pkt_data are the handler's arguments, as named in the tutorial */
    const u_char *packet_end = pkt_data + header->caplen;
    ...
    /* code in http://www.winpcap.org/docs/docs_40_2/html/group__wpcap__tut6.html */

    /* retrieve the position of the tcp header */
    ip_len = (ih->ver_ihl & 0xf) * 4;

    /* retrieve the position of the tcp payload */
    tcp_len = (((u_char*)ih)[ip_len + 12] >> 4) * 4;
    tcp_payload = (u_char*)ih + ip_len + tcp_len;

    /* start of url - skip "GET " */
    url = tcp_payload + 4;
    if (url >= packet_end)
        return; /* no room for a URL */

    /* length of url - look for the next space; the buffer is not
       null-terminated, so bound the search with memchr instead of strchr */
    end_url = memchr(url, ' ', packet_end - url);
    if (end_url == NULL)
        return; /* the URL continues in another packet */
    url_length = end_url - url;

    /* copy the url to a null-terminated C string */
    final_url = (u_char*)malloc(url_length + 1);
    memcpy(final_url, url, url_length);
    final_url[url_length] = '\0';
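The same extraction can also be written as a standalone helper that first checks that the payload begins with "GET " and never reads past the captured bytes. This is a sketch, not WinPcap code: the name extract_get_uri is hypothetical, and the "GET " prefix test is only a rough heuristic for "this looks like an HTTP request" (other methods such as POST are deliberately not handled).

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated, null-terminated copy of the URI from an HTTP GET
 * request line, or NULL if the payload does not start with "GET " or the
 * space that ends the URI is not within the captured bytes.
 * The caller must free() the result. */
char *extract_get_uri(const unsigned char *payload, size_t payload_len)
{
    const unsigned char *uri, *end;
    size_t uri_len;
    char *copy;

    if (payload_len < 4 || memcmp(payload, "GET ", 4) != 0)
        return NULL;

    uri = payload + 4;

    /* Packet data is not null-terminated, so use a length-bounded search. */
    end = memchr(uri, ' ', payload_len - 4);
    if (end == NULL)
        return NULL;

    uri_len = (size_t)(end - uri);
    copy = malloc(uri_len + 1);
    if (copy == NULL)
        return NULL;

    memcpy(copy, uri, uri_len);
    copy[uri_len] = '\0';
    return copy;
}
```

A NULL result for a payload that does start with "GET " most likely means the request line was split across packets, which is the case where TCP reconstruction would be needed.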
You can also capture only HTTP traffic by creating and setting a BPF filter. See the WinPcap tutorial. You should probably use the filter "tcp and dst port 80", which would only give you the requests your computer sends to the server.
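To make clear what that filter selects, here is a rough hand-written equivalent for Ethernet/IPv4 frames. It is an illustration only: the function name is hypothetical, it ignores IPv6 and fragmentation (which the real filter handles), and in practice letting WinPcap apply the filter is the better choice.

```c
#include <stddef.h>

/* Returns 1 if the frame is Ethernet + IPv4 + TCP with destination port 80,
 * and 0 otherwise. A simplified approximation of "tcp and dst port 80". */
int is_tcp_dst_port_80(const unsigned char *frame, size_t caplen)
{
    const size_t ip_start = 14;          /* Ethernet header length */
    size_t ip_len, tcp_start;
    unsigned int ether_type, dst_port;

    if (caplen < ip_start + 20)
        return 0;

    /* EtherType 0x0800 means the payload is IPv4. */
    ether_type = ((unsigned int)frame[12] << 8) | frame[13];
    if (ether_type != 0x0800 || (frame[ip_start] >> 4) != 4)
        return 0;

    ip_len = (size_t)(frame[ip_start] & 0x0f) * 4;
    if (ip_len < 20)
        return 0;

    /* IPv4 protocol field (byte 9) is 6 for TCP. */
    if (frame[ip_start + 9] != 6)
        return 0;

    tcp_start = ip_start + ip_len;
    if (caplen < tcp_start + 4)
        return 0;

    /* Destination port: bytes 2-3 of the TCP header, big-endian. */
    dst_port = ((unsigned int)frame[tcp_start + 2] << 8) | frame[tcp_start + 3];
    return dst_port == 80;
}
```

Note that a port match does not prove the payload is HTTP; it only narrows the capture to traffic that is conventionally HTTP.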
If you don't mind using C#, you can try using Pcap.Net, which would do all that for you much more easily, including the parsing of Ethernet, IPv4 and TCP parts of the packet.
It may sound like overkill but the Web proxy/cache server Squid does exactly that. A few years ago my company used it and I had to tweak the code locally to provide some special warnings when certain URLs were accessed so I know it can do what you want. You just need to find the code you want and pull it out for your project. I used version 2.X and I see they're up to 3.X now but I suspect that aspect of the code hasn't changed much internally.
You didn't say whether Windows is a requirement or a preference, but according to the site (http://www.squid-cache.org/) Squid runs on Windows as well as on other platforms.
You may want to look at the source code of tcpdump to see how it works. tcpdump is a Linux command-line utility that monitors and prints network activity. You need root access to the machine to use it, though.
Source: https://stackoverflow.com/questions/2703238/how-to-hijack-all-local-http-request-and-extract-the-url-using-c