I recently found myself doing some packet analysis in Python using pypcap and dpkt. These are very powerful tools, but they are woefully underdocumented, so I thought I’d put up a simple example of packet capture with them.
First off, they require a bit of setup. You’ll need to install libpcap. On Ubuntu it’s as simple as
$ sudo apt-get install libpcap-dev
After that’s done you can simply use pip or easy_install to get pypcap and dpkt:
$ sudo pip install pypcap
$ sudo pip install dpkt
Next, create a pcap object and set up any filters you need for the capture. The pcap constructor takes a few parameters:
- name: The name of the interface to listen on, such as ‘eth0’ or ‘en0’
- timeout_ms: The number of milliseconds to wait for traffic before timing out
- immediate: Defaults to False. Setting this to True removes buffering, so packets are returned to you as they arrive instead of in batches.
import pcap
pc = pcap.pcap(name="eth0", timeout_ms=10000, immediate=True)
You then write a callback function to handle each packet:
import struct
import dpkt

def __packet_handler(ts, pkt):
    eth = dpkt.ethernet.Ethernet(pkt)
    if eth.type != dpkt.ethernet.ETH_TYPE_IP:
        # skip non-IP packets
        return
    ip = eth.data
    # Grab the source IP off of the packet.
    # It's returned as a 4-byte string, e.g. '\x01\x02\x03\x04'
    # "!L" unpacks it in network (big-endian) byte order; unpack returns a
    # tuple, so take element 0. The result is an integer, not dotted decimal.
    src_ip = struct.unpack("!L", ip.src)[0]
    print(src_ip)
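If you would rather see the address in dotted-decimal form, the standard library can do the conversion. The following is a small sketch using only struct and socket; the helper names ip_to_int and ip_to_dotted are made up for illustration and are not part of pypcap or dpkt.

```python
import socket
import struct


def ip_to_int(packed):
    """Convert a 4-byte packed IPv4 address to an integer (network byte order)."""
    return struct.unpack("!L", packed)[0]


def ip_to_dotted(packed):
    """Convert a 4-byte packed IPv4 address to dotted-decimal text."""
    return socket.inet_ntoa(packed)


packed = b"\x01\x02\x03\x04"   # same 4-byte form as the ip.src example
print(ip_to_int(packed))       # 16909060
print(ip_to_dotted(packed))    # 1.2.3.4
```

The "!" prefix matters here: addresses are stored in network byte order, so unpacking with "<L" would reverse the bytes on the way to an integer.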
In the packet handler above, if you’re looking for TCP or UDP information, it is contained in ip.data; you can check its type to determine which one you have. As written, the function just prints each source IP address as an integer.
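As a rough sketch of telling TCP from UDP, the helper below looks at the IP protocol number rather than doing an isinstance check on ip.data (against dpkt.tcp.TCP or dpkt.udp.UDP), which is a swapped-in but equivalent way of making the same decision. It assumes the decoded packet exposes the protocol number as ip.p and the ports as ip.data.sport and ip.data.dport, as dpkt's IP, TCP and UDP classes do. The function name describe_transport is hypothetical, and the demo uses SimpleNamespace stand-ins instead of real captured packets so that it runs without dpkt installed.

```python
import socket
from types import SimpleNamespace


def describe_transport(ip):
    """Return a short label for the transport layer of a decoded IP packet.

    `ip` is assumed to look like a dpkt IP object: `ip.p` is the IP
    protocol number and `ip.data` is the decoded TCP or UDP segment.
    """
    names = {socket.IPPROTO_TCP: "TCP", socket.IPPROTO_UDP: "UDP"}
    name = names.get(ip.p)
    if name is None:
        return "other (protocol %d)" % ip.p
    segment = ip.data
    # The payload may still be raw bytes (e.g. a fragment), so check first.
    if not hasattr(segment, "sport"):
        return name + " (payload not decoded)"
    return "%s %d -> %d" % (name, segment.sport, segment.dport)


# Stand-in object for illustration only; in the handler you would pass
# the real `ip` taken from eth.data.
fake_ip = SimpleNamespace(p=socket.IPPROTO_TCP,
                          data=SimpleNamespace(sport=12345, dport=80))
print(describe_transport(fake_ip))  # TCP 12345 -> 80
```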
To start collecting packets, call the loop function. loop takes two parameters:
- The number of packets to collect; 0 or -1 means capture continuously
- The callback function to fire for each packet
It’s important to note that if no traffic matching your filter arrives within the timeout you set in the pcap constructor, loop returns with no results. If you want to truly capture indefinitely, I recommend setting a high timeout like 60,000 ms and putting your loop call inside a while loop with any necessary break clauses. Note that there is a small chance of missing traffic each time you hit the timeout and call loop() again, because of a short initialization time; that is a good reason for the long timeout.
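The while-loop approach might look something like the sketch below. The names capture_until, should_stop and deadline are hypothetical helpers, not part of pypcap; the only thing assumed about the pcap object is the loop(count, callback) call described above. Because the pcap object is passed in as a parameter, you can try the function with a stand-in object before pointing it at a real interface.

```python
import time


def capture_until(pc, handler, should_stop):
    """Keep calling pc.loop() until should_stop() returns True.

    `pc` is assumed to be a pypcap pcap object (anything with a
    loop(count, callback) method works), `handler` is the per-packet
    callback, and `should_stop` is a zero-argument function holding
    your break condition. Returns how many times loop() was called.
    """
    calls = 0
    while not should_stop():
        # loop() comes back when the timeout passes without matching traffic.
        pc.loop(0, handler)
        calls += 1
    return calls


def deadline(seconds):
    """Build a should_stop() that becomes True once `seconds` have elapsed."""
    end = time.monotonic() + seconds
    return lambda: time.monotonic() >= end


# Usage with a real capture (needs pypcap and capture privileges):
#   pc = pcap.pcap(name="eth0", timeout_ms=60000, immediate=True)
#   capture_until(pc, __packet_handler, deadline(3600))
```

The break condition is only checked between loop() calls, so with a 60,000 ms timeout and no traffic it can take up to a minute for the capture to notice it should stop.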