Understanding Tcpdump filter & bit-masking
It's not the BPF filter that gets the HTTP headers but the "-A" switch on your tcpdump command.
Your tcpdump command looks for TCP traffic to a certain destination or from a certain source on eth0, where the final BPF filter involves a calculation that must give a non-zero result. With the "-A" option, it prints each packet in ASCII, minus its link-level header.
I've explained the calculation below, but I believe there are some issues in the actual filter, possibly from copying and pasting. When you use these filters in tcpdump, you're using bit-masking, which is typically used when examining fields that do not fall on byte boundaries.
ip[2:2]
refers to the two bytes (i.e. 3rd & 4th bytes) in the IP header, beginning at byte 2 (remember it starts at offset 0). This value is the total length of the IP packet, which can be a maximum of 65535 bytes.
For the bitmask here, for clarity, I've prepended a '0' so mask 0xf becomes 0x0f. The leading '0' on the mask can be dropped, as per the comment from Guy Harris.
ip[0]&0x0f
refers to the second half (the low nibble) of byte 0 (i.e. the 1st byte) in the IP header, which gives you the IP header length in 32-bit words; as such, it is typically multiplied by 4 for such a calculation.
tcp[12]&0xf0
refers to the first half (the high nibble) of byte 12 (i.e. the 13th byte) in the TCP header, which is the data offset field. It specifies the size of the TCP header in 32-bit words. Because the field sits in the high nibble, the masked value is already 16 times the word count, so it is shifted right by 2 bits (i.e. divided by 4) rather than multiplied by 4.
Both lengths are counted in 32-bit (4-byte) words, so they need to be converted to a total in bytes for the calculation to be correct.
Your filter should be calculating:
- The IP packet length (in bytes)
- minus the IP header length (in bytes)
- minus the TCP header length (in bytes)
and looking for that value to be non-zero, i.e. something like this
sudo tcpdump -A -nnpi eth0 '(ip[2:2] - ((ip[0]&0x0f)*4) - ((tcp[12]&0xf0)>>2) != 0)'
When you perform the subtraction, you're looking for a non-zero total. This non-zero total means that there's data above layer 4, i.e. data in the tcp payload, typically application traffic.
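The arithmetic behind the filter can be reproduced outside tcpdump. The following is only a sketch in Python, assuming a raw IPv4 packet (no link-level header) held in a bytes object; build_sample_packet is a hypothetical helper that fabricates minimal headers with no options and zeroed checksums, purely for illustration. Note that the data offset nibble is shifted right by 2 bits, since it sits in the high half of the byte.

```python
import struct


def tcp_payload_length(ip_packet: bytes) -> int:
    """Return the number of TCP payload bytes in a raw IPv4 packet."""
    # ip[2:2]: 16-bit total length of the IP packet, in bytes
    total_length = struct.unpack_from("!H", ip_packet, 2)[0]
    # (ip[0] & 0x0f) * 4: IP header length, converted from 32-bit words to bytes
    ip_header_len = (ip_packet[0] & 0x0F) * 4
    # The TCP header starts right after the IP header
    tcp_start = ip_header_len
    # (tcp[12] & 0xf0) >> 2: data offset (high nibble) converted to bytes
    tcp_header_len = (ip_packet[tcp_start + 12] & 0xF0) >> 2
    return total_length - ip_header_len - tcp_header_len


def build_sample_packet(payload: bytes) -> bytes:
    """Hypothetical helper: 20-byte IPv4 header + 20-byte TCP header + payload."""
    total_length = 20 + 20 + len(payload)
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, total_length, 0, 0, 64, 6, 0,
        bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]),
    )
    tcp_header = struct.pack(
        "!HHIIBBHHH",
        12345, 80, 0, 0, 0x50, 0x18, 65535, 0, 0,
    )
    return ip_header + tcp_header + payload


if __name__ == "__main__":
    request = b"GET / HTTP/1.1\r\n"
    print(tcp_payload_length(build_sample_packet(request)))  # payload present
    print(tcp_payload_length(build_sample_packet(b"")))      # bare headers
```

A non-zero result corresponds to the packets the filter would keep.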
You may also want to add "port 80" to the filter, assuming most HTTP traffic is over port 80.
Such a filter is commonly used by security folk to detect data on a SYN, which is not normal but is allowed according to the RFCs. So the whole thing would look something like this:
'tcp[13]=0x02 and (ip[2:2] - ((ip[0]&0x0f)*4) - ((tcp[12]&0xf0)>>2) != 0)'
TCPIPGuide is a very good, free online guide on TCP/IP btw.
Updated: Modify the 'leading zero' section on the bitmask as per the update from Guy Harris.
TCPDUMP Bit Masking
Your second one isn't working because you are masking off the low nibble of offset 12 and preserving the high nibble... which is correct.. but you aren't actually capturing its value.
Effectively, you have said this:
(tcp[12] & 0xf0 != 0)
That will produce a 1 or a 0 (true or false). Next, you multiply that by 4, which always gives 4, since the TCP header length is always greater than zero. The filter is therefore looking for the "GE" letters at offset 4 in the TCP header, which is the start of the sequence number.
You can still use the 0xf0 mask, but you still need to divide it or shift it. For example:
((tcp[12] & 0xf0) >> 2)
Notice that I am taking advantage of the shift to avoid having to multiply by 4. Multiplying by 4 is equivalent to shifting left 2 bits. Since I would normally shift the byte at offset 12 right by 4 bits to get the header length in words, shifting right by only 2 bits saves a step. The inner parentheses make sure the mask is applied before the shift.
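To make the shift shortcut concrete, here is a small Python sketch (the function names are made up for illustration) comparing the one-step shift, the long form, and the boolean pitfall described above. The argument is the raw value of byte 12 of the TCP header.

```python
def header_len_short(byte12: int) -> int:
    """Mask the high nibble, then shift right 2: header length in bytes."""
    return (byte12 & 0xF0) >> 2


def header_len_long(byte12: int) -> int:
    """Shift the nibble down to a word count, then multiply by 4."""
    return ((byte12 & 0xF0) >> 4) * 4


def header_len_boolean_pitfall(byte12: int) -> int:
    """The comparison collapses the field to 0 or 1 before the multiply."""
    return int((byte12 & 0xF0) != 0) * 4


if __name__ == "__main__":
    for byte12 in (0x50, 0x80, 0xF0):
        print(hex(byte12),
              header_len_short(byte12),
              header_len_long(byte12),
              header_len_boolean_pitfall(byte12))
```

The first two functions agree for every data offset, while the third returns 4 for any valid header, which is why the payload comparison ends up at the wrong offset.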
TCPDUMP: Bitmasking
This states to set all bits in the first byte of the IP packet header except for the first 4 bits (which is the version number) to 0
More correctly, it selects the first 4 bits of the first byte of the IP packet header, and returns a value in which the lower 4 bits are zero.
So you are correct, in that tcpdump IP[0] & 0xf0 = 4 will NEVER succeed (as IP[0] & 0xf0 is in the range 0x00 through 0xf0, with the low-order nibble being 0, so it can NEVER equal 4), and IP[0] & 0xf0 = 0x40 will succeed only if the IP version number in the IP header is 4 (rather than, for example, 6).
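This can be checked exhaustively, since a byte has only 256 possible values. A minimal Python sketch (the variable and function names are illustrative only):

```python
# Every value that (byte & 0xf0) can take, over all possible first bytes
possible_masked_values = {first_byte & 0xF0 for first_byte in range(256)}


def is_ipv4(first_byte: int) -> bool:
    """Equivalent of the filter test 'ip[0] & 0xf0 = 0x40'."""
    return (first_byte & 0xF0) == 0x40


if __name__ == "__main__":
    print(4 in possible_masked_values)     # the low nibble is always zero
    print(0x40 in possible_masked_values)
    print(is_ipv4(0x45), is_ipv4(0x60))    # typical IPv4 and IPv6 first bytes
```

Comparing against 4 would only work after shifting the masked value right by 4 bits.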
Difference between two similar tcpdump filters
With that syntax you can filter the packets bitwise.
For example, consider the first four bytes of an IP header.
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Let's say you want to filter only ip packets with version equal to 4 (indicating IPv4 packets).
You can do something like this
tcpdump -i ethX 'ip[0:1] & 0xf0 = 0x40'
- ip[0:1] means "extract 1 byte from offset zero of the IP header"
- & 0xf0 masks off the IHL bits of the first byte, keeping only the version bits
- = 0x40 will match only if the version bits contain the number 4
et voilà, you built a custom filter digging deeply into the captured frames.
In the two cases you listed, I suppose there's a typo.
I think it should be:
proto[x:y] & z = n : the bits of proto[x:y] selected by mask z are equal to n
proto[x:y] = n : proto[x:y] is exactly equal to n
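The difference between the two forms can be sketched in Python. This is only an emulation of the idea, not how tcpdump evaluates filters; the function names are hypothetical, and the sample bytes are an assumed first four bytes of an IPv4 header.

```python
def field(data: bytes, offset: int, size: int) -> int:
    """Emulate proto[offset:size]: read 'size' bytes as a big-endian integer."""
    return int.from_bytes(data[offset:offset + size], "big")


def masked_match(data: bytes, offset: int, size: int, mask: int, value: int) -> bool:
    """Emulate 'proto[x:y] & z = n': only the bits selected by the mask are compared."""
    return (field(data, offset, size) & mask) == value


def exact_match(data: bytes, offset: int, size: int, value: int) -> bool:
    """Emulate 'proto[x:y] = n': the whole field must equal the value."""
    return field(data, offset, size) == value


if __name__ == "__main__":
    # Assumed sample: version 4, IHL 5, TOS 0, total length 60
    header = bytes([0x45, 0x00, 0x00, 0x3C])
    print(masked_match(header, 0, 1, 0xF0, 0x40))  # version bits only
    print(exact_match(header, 0, 1, 0x40))         # fails: IHL bits are included
    print(exact_match(header, 0, 1, 0x45))         # whole byte matches
```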
Could someone explain these code snippets?
If you look in the definition of got_packet, you'll see const u_char *packet. packet is a pointer to an unsigned char (or, more generally, to a location in memory).
In both cases, the pointer is cast to the respective struct sniff_ethernet or struct sniff_tcp pointer. In the first case this is done without manipulation (it accesses the packet from the start); in the second case an offset is added first, i.e. the size of the Ethernet header plus the size of the IP header, so that it accesses the TCP header in the packet.
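The same offset arithmetic can be sketched in Python, where struct.unpack_from plays the role of the pointer cast. This is an illustration under stated assumptions, not the original C code: the 14-byte Ethernet header size, the names SIZE_ETHERNET and size_ip, and the sample frame are all assumed here.

```python
import struct

SIZE_ETHERNET = 14  # assumed: untagged Ethernet II header length in bytes


def parse_frame(frame: bytes):
    """Read the EtherType at offset 0 and the TCP ports at a computed offset."""
    # "Cast" at the start of the buffer: destination MAC, source MAC, EtherType
    _dst_mac, _src_mac, ether_type = struct.unpack_from("!6s6sH", frame, 0)
    # The IP header follows the Ethernet header; its length is the IHL nibble * 4
    size_ip = (frame[SIZE_ETHERNET] & 0x0F) * 4
    # "Cast" at packet + SIZE_ETHERNET + size_ip: start of the TCP header
    tcp_offset = SIZE_ETHERNET + size_ip
    src_port, dst_port = struct.unpack_from("!HH", frame, tcp_offset)
    return ether_type, src_port, dst_port


if __name__ == "__main__":
    # Fabricated frame: zeroed MACs, EtherType 0x0800, 20-byte IP header, TCP ports
    frame = (
        b"\x00" * 12 + b"\x08\x00"
        + bytes([0x45]) + b"\x00" * 19
        + struct.pack("!HH", 12345, 80) + b"\x00" * 16
    )
    print(parse_frame(frame))
```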