Can Libpcap Reassemble TCP Segments

Can libpcap reassemble TCP segments

Packet or stream reassembly is not mentioned in pcap(3).

If I remember correctly, the dsniff tools use libnids to reassemble IP packets and TCP streams.

Receiving TCP segments bigger than MTU with libpcap

However, I'm having some issues with libpcap. When I receive the packets with TCP segments containing the HTTP data, some of them are bigger than the MTU! Like 1922, 2878 and even 4909 bytes.

Your network adapter may be acting as a TCP offload engine, reassembling multiple incoming TCP segments and handing one reassembled segment to the host. At least on Linux, the networking stack might be performing Large Receive Offload, and if that's done before handing packets to "taps" (the PF_PACKET sockets used by libpcap on Linux), you'd get the reassembled segments.

For your program, this shouldn't be an issue, given that...

Will I really need to reassemble all this HTTP data myself?

...you will need to reassemble all the components of an HTTP request or reply yourself.

Packet reassembly at Network Layer libpcap

The IP header only gives you the size of the fragment. So you need to reserve a buffer the size of the largest possible IP packet, i.e. 65535 bytes. Only once you get the last fragment can you determine the length of the complete packet.
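As a rough illustration of that buffering scheme, here is a minimal Python sketch. The `FragmentBuffer` class and its method names are hypothetical, and it assumes the caller has already parsed the fragment offset (in 8-byte units) and the "more fragments" flag out of each IP header. It handles a single datagram and ignores timeouts and overlapping-fragment policy.

```python
MAX_IP_PACKET = 65535  # upper bound on the size of an IPv4 datagram


class FragmentBuffer:
    """Collects the fragments of one IPv4 datagram (hypothetical helper)."""

    def __init__(self):
        self.data = bytearray(MAX_IP_PACKET)
        self.received = bytearray(MAX_IP_PACKET)  # 1 where a byte has arrived
        self.total_len = None  # unknown until the last fragment is seen

    def add(self, frag_offset, more_fragments, payload):
        """Store one fragment; frag_offset is counted in 8-byte units."""
        start = frag_offset * 8
        end = start + len(payload)
        if end > MAX_IP_PACKET:
            raise ValueError("fragment runs past the maximum datagram size")
        self.data[start:end] = payload
        self.received[start:end] = b"\x01" * len(payload)
        if not more_fragments:
            # Only the last fragment reveals the full payload length.
            self.total_len = end

    def result(self):
        """Return the reassembled payload, or None while pieces are missing."""
        if self.total_len is None:
            return None
        if 0 in self.received[:self.total_len]:
            return None
        return bytes(self.data[:self.total_len])


buf = FragmentBuffer()
buf.add(1, False, b"IJK")        # last fragment arrives first
print(buf.result())              # still incomplete
buf.add(0, True, b"ABCDEFGH")    # first fragment, 8 bytes
print(buf.result())
```

Fragments may arrive in any order, which is why completeness is checked against a per-byte map rather than by counting fragments.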

Reconstructing data from PCAP sniff

It's really pretty simple. Just take the ethernet frames that you get from pcap and extract the IP packets from them, reassembling any that were fragmented. Then, reorder the TCP segments from the IP packets, according to the sequence numbers, paying attention that you discard any duplicate data. Then, process the stream as an HTTP stream. Of course, HTTP doesn't come in packets; it is an application layer protocol, but I'm sure this will be obvious once you've done all this other work. Pay attention as you do all these things to checksum the IP headers and TCP segments, to ensure that your data is correct. Also, if pcap happens to miss any packets, make sure you deal with this appropriately.
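The reordering step might look something like the following Python sketch. The function name `reassemble_stream` is made up for illustration. It assumes you have already extracted `(sequence number, payload)` pairs for one direction of one connection, that you know the initial sequence number, and that the sequence numbers do not wrap around 2^32 during the capture. Checksums and IP fragment reassembly are left out.

```python
def reassemble_stream(segments, initial_seq):
    """Order TCP payloads by sequence number and drop duplicate bytes.

    segments: iterable of (seq, payload) pairs in capture order.
    Returns (stream_bytes, next_expected_seq). Stops at the first gap,
    i.e. when a segment was missed by the capture.
    """
    stream = bytearray()
    next_seq = initial_seq
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        end = seq + len(payload)
        if end <= next_seq:
            continue  # retransmission: nothing new in this segment
        if seq > next_seq:
            break  # gap: data between next_seq and seq never arrived
        # Skip the part of the payload we already have (partial overlap).
        stream += payload[next_seq - seq:]
        next_seq = end
    return bytes(stream), next_seq


captured = [
    (1005, b"world"),   # arrived out of order
    (1000, b"hello"),
    (1000, b"hello"),   # exact retransmission
    (1003, b"lowor"),   # overlaps both neighbours
]
data, next_seq = reassemble_stream(captured, 1000)
print(data, next_seq)
```

Sorting the whole capture first keeps the sketch short; a live sniffer would instead keep a per-connection buffer of out-of-order segments and drain it as gaps fill in.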

To help you along, the Linux TCP stack should provide a concise reference to this process as it occurs in the kernel.

Netty : does it need to care TCP segments reassembly?

the packet could be divided into multiple segments

That's upside down, or bad terminology. TCP sends segments, each of which is carried in an IP packet, and an IP packet may be further split into fragments en route.

My understanding is that the TCP layer is where divided segments get reassembled.

Packet reassembly takes place in the IP layer, not the application (or the TCP layer). Segment reassembly takes place in the TCP layer.

"messageReceived()" method gets called only one or 3 times?

It gets called any number of times from 1 to N, where N is the length of the byte stream. There is no guaranteed 1:1 correspondence between sender sends and receiver receives.
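The question is about Netty, but the same point can be shown with plain sockets. The Python sketch below is only an illustration: `recv_exactly` is a hypothetical helper, not part of any library. Because a single receive may return any prefix of the pending bytes, the receiver loops until it has as many bytes as its own protocol says to expect.

```python
import socket


def recv_exactly(sock, n):
    """Read exactly n bytes from a stream socket, however they are chunked."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than asked for
        if not chunk:
            raise ConnectionError("peer closed before all bytes arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Two sends on one end; the reader neither knows nor cares where
# one send stopped and the next began.
a, b = socket.socketpair()
a.sendall(b"abc")
a.sendall(b"defgh")
print(recv_exactly(b, 8))
a.close()
b.close()
```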

If the server fails to reassemble the segments because one of them is lost or something, does the TCP layer then pass the incomplete packet to the application layer?

Absolutely not. TCP doesn't pass packets to the application layer at all. It passes an intact, correctly sequenced byte stream, or nothing.

Wondering if I should handle the segment reassembly by myself

You don't, and can't, handle any of it yourself. TCP provides a byte stream to the application, not segments or packets.

Good library for TCP reassembly

Have you tried the TCP layer of Bro (bro-ids, http://bro-ids.org/, since renamed Zeek), which is written in C++? You can write simple application-layer Bro scripts to do whatever you need. Otherwise, you can take a look at their source code.

reassembly of tcp packet

You can't. TCP/IP is conceptually a stream, not a sequence of messages (the fact that it is ultimately implemented as a sequence of packets is irrelevant). When you write a sequence of bytes to a TCP/IP stream, that sequence is added to the stream; it is not treated as a message which should maintain its own identity. No notion of message begin/end is transmitted along with the stream, unless you do so yourself in your own protocol.

If you find this hard to believe, consider how it works for files: if you write a sequence of bytes to a file, that sequence does not somehow become a record that you can later identify and retrieve. If you want that kind of structure you have to add it yourself. The same is true for TCP/IP.

The transport packets used to implement TCP/IP have no relation to the data blocks you specify with your API calls; they are merely a way to implement the TCP/IP stream. For some use cases there may appear to be a mapping, but this is accidental.

The only way to split a TCP/IP stream back into separate messages is by using knowledge of the protocol running on top of TCP/IP. In your case this is FIX. I assume you know how that works; you can use that knowledge to correctly split the FIX data back into its original messages. A generic TCP/IP message splitter cannot be made.
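To make "using knowledge of the protocol" concrete, here is a simplified Python sketch of a FIX splitter. The `FixSplitter` class is hypothetical. It relies on one property of FIX: every message ends with the CheckSum field, tag 10, holding three digits and followed by the SOH delimiter (0x01). It assumes that byte pattern never occurs inside a data field and does no validation; a robust parser would use the BodyLength field (tag 9) and verify the checksum.

```python
import re

# CheckSum trailer: SOH, "10=", three digits, SOH.
_TRAILER = re.compile(rb"\x0110=\d{3}\x01")


class FixSplitter:
    """Splits a TCP byte stream into FIX messages (simplified sketch)."""

    def __init__(self):
        self._buf = b""

    def feed(self, data):
        """Add bytes from the stream; return any messages now complete."""
        self._buf += data
        messages = []
        while True:
            match = _TRAILER.search(self._buf)
            if match is None:
                break  # the rest is a partial message; wait for more bytes
            messages.append(self._buf[:match.end()])
            self._buf = self._buf[match.end():]
        return messages


# Field values below are made up; only the framing matters here.
msg1 = b"8=FIX.4.2\x019=5\x0135=0\x0110=123\x01"
msg2 = b"8=FIX.4.2\x019=5\x0135=1\x0110=045\x01"

splitter = FixSplitter()
print(splitter.feed(msg1[:10]))               # nothing complete yet
print(splitter.feed(msg1[10:] + msg2[:7]))    # first message completes
print(splitter.feed(msg2[7:]))                # second message completes
```

The chunk boundaries passed to `feed` are arbitrary, which mirrors how the receiving side of a TCP connection sees the data.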
