How Intrusive Is Tcpdump

How to convert KDD 99 dataset to tcpdump format?

From the KDD99 homepage:

The 1998 DARPA Intrusion Detection
Evaluation Program was prepared and
managed by MIT Lincoln Labs. ... The
1999 KDD intrusion detection contest
uses a version of this dataset.

Being somewhat familiar with the original DARPA dataset and with the information contained in a PCAP network capture file, I can tell you that the KDD99 data files contain nowhere near enough information to reconstruct a proper network capture file.

It seems that KDD99 is a boiled-down version of the DARPA IDEVAL98 data set, where only high-level operations, such as connections, are retained, instead of individual packets. If you need the actual network capture files, you should probably get the original DARPA IDEVAL data sets.
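To make the gap concrete, here is a minimal sketch that parses one record in the KDD99 layout. The record is only an illustration of the format, and `parse_record` and `LEADING_FIELDS` are names I made up for this example. Each line summarizes a whole connection with 41 derived features plus a label. It contains no timestamps, IP addresses, sequence numbers or payload bytes, which is exactly what a PCAP file would need.

```python
# The first six KDD99 columns; the remaining columns are further derived
# statistics, and the last value on each line is the class label.
LEADING_FIELDS = ["duration", "protocol_type", "service", "flag",
                  "src_bytes", "dst_bytes"]

# Illustrative record in the KDD99 layout (41 features plus a label).
SAMPLE = (
    "0,tcp,http,SF,181,5450,"
    "0,0,0,0,0,1,"
    "0,0,0,0,0,0,0,0,0,0,"
    "8,8,"
    "0.00,0.00,0.00,0.00,1.00,0.00,0.00,"
    "9,9,"
    "1.00,0.00,0.11,0.00,0.00,0.00,0.00,0.00,"
    "normal."
)


def parse_record(line):
    """Split one connection record into its leading fields and its label."""
    values = line.strip().split(",")
    record = dict(zip(LEADING_FIELDS, values))
    record["label"] = values[-1].rstrip(".")
    record["n_features"] = len(values) - 1  # everything except the label
    return record


if __name__ == "__main__":
    print(parse_record(SAMPLE))
```

Per-packet data cannot be recovered from aggregates such as `src_bytes` and `dst_bytes`, because many different packet sequences produce the same totals.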

View - but not intercept - all IPv4 traffic to Linux computer

Yes, you can see all the packets that arrive at your network interface. There are several ways to access or view them. Here is a short list of possible solutions, ordered from the easiest to use to the hardest:

Wireshark

I'd say this is pretty much the standard when it comes to protocol analyzers with a GUI. It has tons of options, great filtering capabilities, and it reassembles IP datagrams. It uses libpcap and can also show the raw Ethernet frame data, so you can see layer 2 traffic such as ARP. You can also capture everything arriving at your network interface to a file and analyze it later (also in Wireshark).

tcpdump

Very powerful, with features similar to Wireshark's, but it is a command-line utility. It also uses libpcap and can dump the complete interface traffic to a file. You can view the dumped data in Wireshark since the file format is compatible.
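The shared capture file format is simple enough to sketch. The example below assumes the classic pcap layout (not pcapng), written little-endian with microsecond timestamps: a 24-byte global header followed by one 16-byte record header per packet. The function names `build_pcap` and `read_pcap` are mine for illustration and are not part of tcpdump or libpcap.

```python
import struct

PCAP_MAGIC = 0xA1B2C3D4                      # classic pcap, microsecond timestamps
GLOBAL_HDR = struct.Struct("<IHHiIII")       # magic, major, minor, zone, sigfigs, snaplen, linktype
RECORD_HDR = struct.Struct("<IIII")          # ts_sec, ts_usec, captured length, original length


def build_pcap(packets, linktype=1, snaplen=65535):
    """Serialize (ts_sec, ts_usec, raw_bytes) tuples; linktype 1 is Ethernet."""
    out = GLOBAL_HDR.pack(PCAP_MAGIC, 2, 4, 0, 0, snaplen, linktype)
    for ts_sec, ts_usec, data in packets:
        out += RECORD_HDR.pack(ts_sec, ts_usec, len(data), len(data))
        out += data
    return out


def read_pcap(blob):
    """Parse little-endian classic pcap bytes into (header, packets)."""
    magic, major, minor, _zone, _sigfigs, snaplen, linktype = GLOBAL_HDR.unpack_from(blob, 0)
    if magic != PCAP_MAGIC:
        raise ValueError("not a little-endian microsecond pcap file")
    header = {"version": (major, minor), "snaplen": snaplen, "linktype": linktype}
    packets = []
    offset = GLOBAL_HDR.size
    while offset < len(blob):
        ts_sec, ts_usec, incl_len, _orig_len = RECORD_HDR.unpack_from(blob, offset)
        offset += RECORD_HDR.size
        packets.append((ts_sec, ts_usec, blob[offset:offset + incl_len]))
        offset += incl_len
    return header, packets


if __name__ == "__main__":
    blob = build_pcap([(1, 500, b"\x01\x02\x03")])
    print(read_pcap(blob))
```

A real reader would also have to handle the byte-swapped magic number, the nanosecond variant, and truncated files. This sketch skips all of that.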

ngrep

Known as the "network grep", it is similar to tcpdump but supports regular expressions (regex) for filtering on the payload data. It can save captured data in the file format supported by Wireshark and tcpdump (it also uses libpcap).
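The idea of matching a regex against payload bytes can be sketched in a few lines. This is not how ngrep is implemented; `grep_payloads` is a hypothetical helper, and the payloads are made-up byte strings standing in for captured packet data.

```python
import re


def grep_payloads(payloads, pattern):
    """Return the payloads (bytes) that the bytes regex matches anywhere."""
    rx = re.compile(pattern)
    return [p for p in payloads if rx.search(p)]


if __name__ == "__main__":
    captured = [
        b"GET /index.html HTTP/1.0\r\n\r\n",
        b"\x00\x01\x02binary noise",
        b"POST /login HTTP/1.0\r\n\r\n",
    ]
    # Keep only payloads that look like HTTP GET or POST request lines.
    for payload in grep_payloads(captured, rb"^(GET|POST) /\S* HTTP/"):
        print(payload)
```

Note that the pattern is a bytes regex (`rb"..."`), since packet payloads are raw bytes rather than text.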

libnids

Quotation from the official git repository:

"Libnids is a library that provides a functionality of one of NIDS
(Network Intrusion Detection System) components, namely E-component. It means
that libnids code watches all local network traffic [...] and provides convenient information on them to
analyzing modules of NIDS. Libnids performs:

  • assembly of TCP segments into TCP streams
  • IP defragmentation
  • TCP port scan detection"
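To give an idea of what "assembly of TCP segments into TCP streams" involves, here is a deliberately simplified sketch. It is not libnids code, and `reassemble` is a name I chose for illustration. It orders segments by sequence number, drops retransmitted bytes, and stops at the first gap. It ignores sequence number wraparound, connection state, and everything else a real NIDS has to care about.

```python
def reassemble(segments, initial_seq):
    """Stitch (seq, payload) pairs into the contiguous byte stream.

    Segments may arrive out of order, duplicated, or overlapping.
    Reassembly stops at the first gap, i.e. a still-missing segment.
    """
    stream = bytearray()
    next_seq = initial_seq
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if seq > next_seq:
            break  # gap: the bytes in between have not been seen
        skip = next_seq - seq  # bytes of this segment already in the stream
        if skip < len(payload):
            stream += payload[skip:]
            next_seq = seq + len(payload)
    return bytes(stream)


if __name__ == "__main__":
    segs = [(1005, b"world"), (1000, b"hello"), (1000, b"hello")]
    print(reassemble(segs, 1000))
```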

libpcap

Of course, you can also write your own programs using the library directly. Needless to say, this requires more effort.

Raw or Packet Sockets

In case you want to do all the dirty work yourself, this is the low-level option, which of course also allows you to do everything you want. The tools listed above use them as a common basis. Raw sockets operate on OSI layer 3 and packet sockets on layer 2.
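As a rough sketch of the packet socket route on Linux: the code below opens an `AF_PACKET` socket and decodes the 14-byte Ethernet header of each received frame. It needs root or `CAP_NET_RAW`, and the interface name `eth0` in the usage note is an assumption about your machine. `parse_ethernet` and `sniff` are illustrative names, not part of any library.

```python
import socket
import struct

ETH_P_ALL = 0x0003  # ask the kernel for frames of every protocol


def format_mac(raw):
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet(frame):
    """Decode the Ethernet II header of a raw layer 2 frame."""
    dst, src, ethertype = struct.unpack_from("!6s6sH", frame, 0)
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "ethertype": ethertype,   # e.g. 0x0800 IPv4, 0x0806 ARP
        "payload": frame[14:],
    }


def sniff(interface, count=5):
    """Yield `count` parsed frames from `interface` (Linux only, needs privileges)."""
    with socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                       socket.htons(ETH_P_ALL)) as sock:
        sock.bind((interface, 0))
        for _ in range(count):
            yield parse_ethernet(sock.recv(65535))
```

Usage would look like `for frame in sniff("eth0"): print(frame["src"], hex(frame["ethertype"]))`. This only reads copies of the frames, so the traffic itself is not intercepted or modified.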


Note: This is not meant to be a complete list of available tools or options. I'm sure there are many more, but these are the most common ones I can think of.

How to derive KDD99 Features from DARPA pcap file?

Be careful with this data set.

http://www.kdnuggets.com/news/2007/n18/4i.html

Some excerpts:

the artificial data was generated using a closed network, some proprietary network traffic generators, and hand-injected attacks

Among the issues raised, the most important seemed to be that no validation was ever performed to show that the DARPA dataset actually looked like real network traffic.

In 2003, Mahoney and Chan built a trivial intrusion detection system and ran it against the DARPA tcpdump data. They found numerous irregularities, including that -- due to the way the data was generated -- all the malicious packets had a TTL of 126 or 253 whereas almost all the benign packets had a TTL of 127 or 254.

the DARPA dataset (and by extension, the KDD Cup '99 dataset) was fundamentally broken, and one could not draw any conclusions from any experiments run using them

we strongly recommend that (1) all researchers stop using the KDD Cup '99 dataset

As for the feature extraction used: IIRC, the majority of features were simply attributes of the parsed IP/TCP/UDP headers, such as port number, last octet of the IP address, and some packet flags.
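For what it's worth, pulling that kind of attribute out of a packet is straightforward. The sketch below is my own illustration, not the extraction code used for KDD99, and the feature names are made up. It parses a raw IPv4 packet carrying TCP and returns the ports, the last octet of each address, the TTL, and the set TCP flags.

```python
import struct

IPV4 = struct.Struct("!BBHHHBBH4s4s")   # fixed 20-byte IPv4 header
TCP = struct.Struct("!HHIIBBHHH")       # fixed 20-byte TCP header
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def extract_features(packet):
    """Return simple header attributes of an IPv4/TCP packet (no Ethernet header)."""
    ver_ihl, _tos, _total, _ident, _frag, ttl, proto, _csum, src, dst = \
        IPV4.unpack_from(packet, 0)
    if ver_ihl >> 4 != 4 or proto != 6:
        raise ValueError("expected an IPv4 packet carrying TCP")
    ihl = (ver_ihl & 0x0F) * 4  # IPv4 header length in bytes, options included
    sport, dport, _seq, _ack, _off, flags, _win, _tcsum, _urg = \
        TCP.unpack_from(packet, ihl)
    return {
        "src_port": sport,
        "dst_port": dport,
        "src_last_octet": src[3],
        "dst_last_octet": dst[3],
        "ttl": ttl,
        "flags": sorted(name for name, bit in TCP_FLAGS.items() if flags & bit),
    }


if __name__ == "__main__":
    # Hand-built SYN packet with made-up addresses and zeroed checksums.
    ip = IPV4.pack(0x45, 0, 40, 1, 0, 126, 6, 0,
                   bytes([10, 0, 0, 5]), bytes([172, 16, 112, 50]))
    tcp = TCP.pack(40000, 80, 0, 0, 5 << 4, 0x02, 8192, 0, 0)
    print(extract_features(ip + tcp))
```

A TTL value is exactly the kind of field on which the simulation artifact quoted above would show up.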

In any case, this data set no longer reflects realistic attacks. Today's TCP/IP stacks are much more robust than at the time the data set was created, when a "ping of death" would instantly lock up a Windows host. Every developer of a TCP/IP stack should by now be aware of the risk of such malformed packets and stress-test the stack against them.

With this, these features have become pretty much meaningless. Incorrectly set SYN flags and the like are no longer used in network attacks; modern attacks are much more sophisticated and most likely no longer target the TCP/IP stack, but the services running on the next layer. So I would not bother finding out which low-level packet flags were used in that flawed '99 simulation of attacks that worked in the early '90s...


