Parsing Pcap Files With Dpkt (python)

I'm trying to parse a previously-captured trace for HTTP headers using the dpkt module:

    import dpkt
    import sys
    f=file(sys.argv[1],'rb')
    pcap=dpkt.pcap.Reader(f)
    for ts, buf in

Solution 1:

I ran into the same problem while working with HTTP requests and dpkt.

The problem is that dpkt's HTTP header parser uses flawed logic: it raises this exception whenever the HTTP data does not end with \r\n\r\n. (And, as you say, plenty of perfectly good packets have no \r\n\r\n at the end.)

Here is the bug report for your problem.
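
One way to work around the missing \r\n\r\n is to collect TCP payload bytes yourself until the header terminator shows up, and only then hand the data to a parser. The sketch below is an assumption about how that could look, not dpkt's own fix: `HeaderBuffer` and `feed` are hypothetical names, and it assumes the segments of one flow arrive in order with no retransmissions.

```python
HEADER_END = b"\r\n\r\n"  # blank line that terminates an HTTP header block


class HeaderBuffer(object):
    """Accumulate payload bytes for ONE flow until the HTTP headers are complete.

    Hypothetical helper: assumes in-order segments and no retransmissions.
    """

    def __init__(self):
        self._data = bytearray()

    def feed(self, payload):
        """Add one segment's payload.

        Returns the complete header block (including the terminator) as bytes,
        or None if the terminator has not been seen yet.
        """
        self._data.extend(payload)
        end = self._data.find(HEADER_END)
        if end == -1:
            return None  # headers still incomplete, keep buffering
        cut = end + len(HEADER_END)
        head = bytes(self._data[:cut])
        del self._data[:cut]  # keep any body bytes for later
        return head


flow = HeaderBuffer()
first = flow.feed(b"GET / HTTP/1.1\r\nHost: exa")
second = flow.feed(b"mple.com\r\n\r\nBODY")
```

Here `first` is None because the headers are cut off mid-segment, and `second` holds the full header block, which could then be passed to dpkt.http.Request.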

Solution 2:

In your Python code, before the assignment ip=eth.data, check whether the Ethernet type is IP. If it is not, skip that packet. Then check whether the IP protocol is TCP.

        To check:
               1. IP packet or not
               2. TCP protocol or not

Change this part of your program

    ............
    eth=dpkt.ethernet.Ethernet(buf)
    ip=eth.data
    tcp=ip.data
    ........

to

    ............
    eth=dpkt.ethernet.Ethernet(buf)
    if eth.type!=dpkt.ethernet.ETH_TYPE_IP: # IPv4: EtherType 2048 (0x0800)
        continue
    ip=eth.data
    if ip.p!=dpkt.ip.IP_PROTO_TCP: # TCP: IP protocol number 6
        continue
    tcp=ip.data
    .......

and see whether the error still occurs.

with regard, Irengbam Tilokchan Singh
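
To make the two numbers in that check less magic: 2048 is the IPv4 EtherType (0x0800) and 6 is the IP protocol number for TCP. As a rough illustration of which bytes those checks look at, here is a sketch that reads the same two fields straight from the raw frame with `struct` instead of dpkt. The function name `is_ipv4_tcp` is made up for this example, and it assumes plain Ethernet II frames without VLAN tags.

```python
import struct

ETH_HEADER_LEN = 14     # dst MAC (6) + src MAC (6) + EtherType (2)
MIN_IPV4_HEADER = 20    # IPv4 header without options
ETH_TYPE_IPV4 = 0x0800  # 2048 in decimal
IP_PROTO_TCP = 6


def is_ipv4_tcp(frame):
    """Return True if a raw Ethernet II frame carries IPv4 with a TCP payload.

    Assumption for this sketch: no VLAN tag between the MACs and the EtherType.
    """
    if len(frame) < ETH_HEADER_LEN + MIN_IPV4_HEADER:
        return False  # too short to hold both headers
    (eth_type,) = struct.unpack_from("!H", frame, 12)
    if eth_type != ETH_TYPE_IPV4:
        return False  # e.g. ARP or IPv6
    # The protocol field is the 10th byte (offset 9) of the IPv4 header.
    (protocol,) = struct.unpack_from("!B", frame, ETH_HEADER_LEN + 9)
    return protocol == IP_PROTO_TCP


# Minimal synthetic frames: zeroed MACs, EtherType, then a 20-byte IPv4 header.
tcp_frame = b"\x00" * 12 + b"\x08\x00" + b"\x45" + b"\x00" * 8 + b"\x06" + b"\x00" * 10
udp_frame = b"\x00" * 12 + b"\x08\x00" + b"\x45" + b"\x00" * 8 + b"\x11" + b"\x00" * 10
arp_frame = b"\x00" * 12 + b"\x08\x06" + b"\x00" * 20
```

Only `tcp_frame` passes the check; the ARP frame fails on the EtherType and the UDP frame fails on the protocol number, which is the same filtering the two `continue` statements perform.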

Solution 3:

I've added an example to dpkt that parses and displays HTTP Headers. The docs can be found here: http://dpkt.readthedocs.io/en/latest/print_http_requests.html and the example code can be found in dpkt/examples/print_http_requests.py

# For each packet in the pcap process the contents
for timestamp, buf in pcap:

    # Unpack the Ethernet frame (mac src/dst, ethertype)
    eth = dpkt.ethernet.Ethernet(buf)

    # Make sure the Ethernet data contains an IP packet
    if not isinstance(eth.data, dpkt.ip.IP):
        print('Non IP Packet type not supported %s\n' % eth.data.__class__.__name__)
        continue

    # Now grab the data within the Ethernet frame (the IP packet)
    ip = eth.data

    # Check for TCP in the transport layer
    if isinstance(ip.data, dpkt.tcp.TCP):

        # Set the TCP data
        tcp = ip.data

        # Now see if we can parse the contents as a HTTP request
        try:
            request = dpkt.http.Request(tcp.data)
        except (dpkt.dpkt.NeedData, dpkt.dpkt.UnpackError):
            continue

        # Pull out fragment information (flags and offset all packed into off field, so use bitmasks)
        do_not_fragment = bool(ip.off & dpkt.ip.IP_DF)
        more_fragments = bool(ip.off & dpkt.ip.IP_MF)
        fragment_offset = ip.off & dpkt.ip.IP_OFFMASK

        # Print out the info
        print('Timestamp: ', str(datetime.datetime.utcfromtimestamp(timestamp)))
        print('Ethernet Frame: ', mac_addr(eth.src), mac_addr(eth.dst), eth.type)
        print('IP: %s -> %s   (len=%d ttl=%d DF=%d MF=%d offset=%d)' %
              (inet_to_str(ip.src), inet_to_str(ip.dst), ip.len, ip.ttl, do_not_fragment, more_fragments, fragment_offset))
        print('HTTP request: %s\n' % repr(request))
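
The excerpt calls `mac_addr` and `inet_to_str` without showing them. A minimal sketch of what such helpers could look like, using only the standard library, is below; the bodies are an assumption based on the output format shown in this answer, not a copy of the dpkt example file.

```python
import socket


def mac_addr(address):
    """Format a 6-byte MAC address as colon-separated hex, e.g. 00:00:01:00:00:00."""
    return ":".join("%02x" % octet for octet in bytearray(address))


def inet_to_str(inet):
    """Format a packed IP address (4 bytes for IPv4, 16 for IPv6) as text."""
    try:
        return socket.inet_ntop(socket.AF_INET, inet)
    except ValueError:
        # Wrong length for IPv4, so try IPv6 instead.
        return socket.inet_ntop(socket.AF_INET6, inet)


example_mac = mac_addr(b"\xfe\xff\x20\x00\x01\x00")
example_ip = inet_to_str(bytes(bytearray([145, 254, 160, 237])))
```

Both helpers take the raw bytes that dpkt exposes as eth.src / eth.dst and ip.src / ip.dst.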

Example Output

Timestamp:  2004-05-13 10:17:08.222534
Ethernet Frame:  00:00:01:00:00:00 fe:ff:20:00:01:00 2048
IP: 145.254.160.237 -> 65.208.228.223   (len=519 ttl=128 DF=1 MF=0 offset=0)
HTTP request: Request(body='', uri='/download.html', headers={'accept-language': 'en-us,en;q=0.5', 'accept-encoding': 'gzip,deflate', 'connection': 'keep-alive', 'keep-alive': '300', 'accept': 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1', 'user-agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113', 'accept-charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7', 'host': 'www.ethereal.com', 'referer': 'http://www.ethereal.com/development.html'}, version='1.1', data='', method='GET')

Timestamp:  2004-05-13 10:17:10.295515
Ethernet Frame:  00:00:01:00:00:00 fe:ff:20:00:01:00 2048
IP: 145.254.160.237 -> 216.239.59.99   (len=761 ttl=128 DF=1 MF=0 offset=0)
HTTP request: Request(body='', uri='/pagead/ads?client=ca-pub-2309191948673629&random=1084443430285&lmt=1082467020&format=468x60_as&output=html&url=http%3A%2F%2Fwww.ethereal.com%2Fdownload.html&color_bg=FFFFFF&color_text=333333&color_link=000000&color_url=666633&color_border=666633', headers={'accept-language': 'en-us,en;q=0.5', 'accept-encoding': 'gzip,deflate', 'connection': 'keep-alive', 'keep-alive': '300', 'accept': 'text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,image/jpeg,image/gif;q=0.2,*/*;q=0.1', 'user-agent': 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113', 'accept-charset': 'ISO-8859-1,utf-8;q=0.7,*;q=0.7', 'host': 'pagead2.googlesyndication.com', 'referer': 'http://www.ethereal.com/download.html'}, version='1.1', data='', method='GET')
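
The DF=1 MF=0 offset=0 values in the output come from the bitmask step in the code. To show what that step does on its own, here is a small standalone sketch. The three constants carry the standard IPv4 values (the same ones the dpkt.ip names stand for), and `decode_fragment_field` is a hypothetical helper name.

```python
IP_DF = 0x4000       # "don't fragment" flag
IP_MF = 0x2000       # "more fragments" flag
IP_OFFMASK = 0x1FFF  # low 13 bits: fragment offset in 8-byte units


def decode_fragment_field(off):
    """Split the 16-bit IPv4 flags/fragment-offset field into its parts."""
    offset_units = off & IP_OFFMASK
    return {
        "do_not_fragment": bool(off & IP_DF),
        "more_fragments": bool(off & IP_MF),
        "fragment_offset": offset_units,
        # The header stores the offset in units of 8 bytes.
        "fragment_offset_bytes": offset_units * 8,
    }


unfragmented = decode_fragment_field(0x4000)      # DF set, as in the output above
first_fragment = decode_fragment_field(0x2000)    # MF set, offset 0
later_fragment = decode_fragment_field(0x2000 | 185)
```

For `later_fragment` the raw offset is 185, which corresponds to byte 1480 of the original datagram.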
