Packet Capture, part 3: Analysis Tools
Network Troubleshooting Tools
by Joseph D. Sloan
1.5. Analysis Tools
As previously noted, one reason for using tcpdump is the wide variety of support tools that are available for use with tcpdump or files created with tcpdump. There are tools for sanitizing the data, tools for reformatting the data, and tools for presenting and analyzing the data.
If you are particularly sensitive to privacy or security concerns, you may want to consider sanitize, a collection of five Bourne shell scripts that reduce or condense tcpdump trace files and eliminate confidential information. The scripts renumber host entries and select classes of packets, eliminating all others. This has two primary uses. First, it reduces the size of the files you must deal with, hopefully focusing your attention on a subset of the original traffic that still contains the traffic of interest. Second, it gives you data that can be distributed or made public (for debugging or network analysis) without compromising individual privacy or revealing too much specific information about your network. Clearly, these scripts won't be useful for everyone. But if internal policies constrain what you can reveal, these scripts are worth looking into.
The five scripts included in sanitize are sanitize-tcp, sanitize-syn-fin, sanitize-udp, sanitize-encap, and sanitize-other. Each script filters out inappropriate traffic and reduces the remaining traffic. For example, all non-TCP packets are removed by sanitize-tcp and the remaining TCP traffic is reduced to six fields -- an unformatted timestamp, a renumbered source address, a renumbered destination address, the source port, the destination port, and the number of data bytes in the packet. For instance, the packet
934303014.772066 205.153.63.30.1174 > 205.153.63.238.23: . ack 3259091394 win 8647 (DF)
                         4500 0028 b30c 4000 8006 2d84 cd99 3f1e
                         cd99 3fee 0496 0017 00ff f9b3 c241 c9c2
                         5010 21c7 e869 0000 0000 0000 0000
would be reduced to 934303014.772066 1 2 1174 23 0. Notice that the IP addresses have been replaced with 1 and 2, respectively. The renumbering is applied consistently across all packets, so you can still compare addresses within a single trace. The actual data reported varies from script to script. Here is an example of the syntax:
bsd1# sanitize-tcp tracefile
This runs sanitize-tcp over the tcpdump trace file tracefile. Apart from the name of the trace file, there are no arguments.
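The renumbering and reduction described above can be pictured with a short Python sketch. This is not the sanitize code (the real scripts are Bourne shell). The function name reduce_tcp_lines, the regular expressions, and the sample addresses are assumptions made for illustration, and the parser handles only simple numeric tcpdump lines of the kind shown above.

```python
import re

# Assumed line shape: "<timestamp> <src ip>.<port> > <dst ip>.<port>: <rest>"
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+) "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): "
    r"(?P<rest>.*)$"
)
# Data-carrying segments appear as "first:last(nbytes)" in this format.
DATA_RE = re.compile(r"\d+:\d+\((\d+)\)")


def reduce_tcp_lines(lines):
    """Reduce tcpdump-style TCP lines to six fields with renumbered hosts."""
    host_ids = {}  # original address -> small integer, stable for the whole trace
    reduced = []
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # drop anything that does not look like a TCP line
        src = host_ids.setdefault(match["src"], len(host_ids) + 1)
        dst = host_ids.setdefault(match["dst"], len(host_ids) + 1)
        data = DATA_RE.search(match["rest"])
        nbytes = int(data.group(1)) if data else 0
        reduced.append(
            f"{match['ts']} {src} {dst} {match['sport']} {match['dport']} {nbytes}"
        )
    return reduced


sample = [
    "934303014.772066 192.0.2.1.1174 > 192.0.2.2.23: . ack 3259091394 win 8647 (DF)",
    "934303014.800000 192.0.2.2.23 > 192.0.2.1.1174: P 1:4(3) ack 1 win 17520 (DF)",
]
for row in reduce_tcp_lines(sample):
    print(row)
```

Because the mapping is kept in one dictionary for the whole run, the same host always receives the same number, which is what makes comparisons within a trace possible.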
The program tcpdpriv is another tool for removing sensitive information from tcpdump files. There are several major differences between tcpdpriv and sanitize. First, as a collection of shell scripts, sanitize should run on almost any Unix system. This is not true of tcpdpriv, which is a compiled program. On the other hand, tcpdpriv supports the direct capture of data as well as the analysis of existing files. The captured packets are written as a tcpdump file, which can be subsequently processed.
Also, tcpdpriv allows you some degree of control over how much of the original data is removed or scrambled. For example, it is possible to have an IP address scrambled but retain its class designation. If the -C4 option is chosen, an IP address such as 184.108.40.206 might be replaced with 220.127.116.11. Notice that address classes are preserved -- a class C address is replaced with a class C address.
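To make the idea of class-preserving scrambling concrete, here is a minimal Python sketch. It is not tcpdpriv's algorithm. The function names, the demo key, and the use of a hash are all assumptions chosen for illustration. It keeps the leading bits that determine the classful address class and replaces the remaining bits with keyed hash output.

```python
import hashlib
import ipaddress


def address_class(addr):
    """Return the classful class (A-E) of a dotted-quad IPv4 address."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet < 128:
        return "A"
    if first_octet < 192:
        return "B"
    if first_octet < 224:
        return "C"
    if first_octet < 240:
        return "D"
    return "E"


# Number of leading bits that identify each class (0, 10, 110, 1110, 1111).
CLASS_PREFIX_BITS = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 4}


def scramble_preserving_class(addr, key=b"demo-key"):
    """Scramble an address but keep its class bits (illustrative only)."""
    value = int(ipaddress.IPv4Address(addr))
    low_bits = 32 - CLASS_PREFIX_BITS[address_class(addr)]
    digest = hashlib.sha256(key + value.to_bytes(4, "big")).digest()
    noise = int.from_bytes(digest[:4], "big") & ((1 << low_bits) - 1)
    prefix = value >> low_bits << low_bits  # keep only the class bits
    return str(ipaddress.IPv4Address(prefix | noise))


for original in ["10.1.2.3", "172.16.2.210", "205.153.63.238"]:
    scrambled = scramble_preserving_class(original)
    print(original, address_class(original), "->", scrambled, address_class(scrambled))
```

The same input and key always give the same output, so addresses stay comparable within a trace. Unlike a real anonymizer, this sketch does not guarantee that two different addresses never map to the same result.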
There are a variety of command-line options that control how data is rewritten, several of which are mandatory. Many of the command-line options will look familiar to tcpdump users. The program does not allow output to be written to a terminal, so it must be written directly to a file or redirected. While the program is useful, the number of required command-line options can be annoying. There is some concern that if the options are not selected properly, it may be possible to reconstruct the original data from the scrambled data. In practice, this should be a minor concern.
As an example of using tcpdpriv, the following command will scramble the file tracefile:
bsd1# tcpdpriv -P99 -C4 -M20 -r tracefile -w outfile
The -P99 option preserves (doesn't scramble) the port numbers, -C4 preserves the class identity of the IP addresses, and -M20 preserves multicast addresses. If you want the data output to your terminal, you can pipe the output to tcpdump:
bsd1# tcpdpriv -P99 -C4 -M20 -r tracefile -w- | tcpdump -r-
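The last options look a little strange, but they will work: the hyphen after -w sends the output to standard output, and the hyphen after -r tells tcpdump to read from standard input.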
Another useful tool is tcpflow, written by Jeremy Elson. This program allows you to capture individual TCP flows or sessions. If the traffic you are looking at includes, say, three different Telnet sessions, tcpflow will separate the traffic into three different files so you can examine each individually. The program can reconstruct data streams regardless of out-of-order packets or retransmissions but does not understand fragmentation.
tcpflow stores each flow in a separate file with names based on the source and destination addresses and ports. For example, SSH traffic (port 22) between 172.16.2.210 and 205.153.63.30 might have the filename 172.016.002.210.00022-205.153.063.030.01071, where 1071 is the ephemeral port created for the session.
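The naming pattern in that example can be reproduced in a few lines of Python. This is a sketch inferred from the sample filename above, not code taken from tcpflow, and the function name flow_filename is an assumption.

```python
def flow_filename(src_ip, src_port, dst_ip, dst_port):
    """Build a flow filename: zero-padded octets and five-digit ports."""
    def pad_ip(ip):
        # Each octet is padded to three digits, e.g. 16 -> 016.
        return ".".join(f"{int(octet):03d}" for octet in ip.split("."))

    return f"{pad_ip(src_ip)}.{src_port:05d}-{pad_ip(dst_ip)}.{dst_port:05d}"


print(flow_filename("172.16.2.210", 22, "205.153.63.30", 1071))
```

Fixed-width fields mean that a plain directory listing sorts the flow files by address and port.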
Since tcpflow uses libpcap, the same packet capture library tcpdump uses, capture filters are constructed in exactly the same way and with the same syntax. It can be used in a number of ways. For example, you could see what cookies are being sent during an HTTP session. Or you might use it to see if SSH is really encrypting your data. Of course, you could also use it to capture passwords or read email, so be sure to set permissions correctly.
The program tcp-reduce invokes a collection of shell scripts to reduce the packet capture information in a tcpdump trace file to one-line summaries for each connection. That is, an entire Telnet session would be summarized by a single line. This could be extremely useful in getting an overall picture of how the traffic over a link breaks down or for looking quickly at very large files.
The syntax is quite simple.
bsd1# tcp-reduce tracefile > outfile
will reduce tracefile, putting the output in outfile. The program tcp-summary, which comes with tcp-reduce, will further summarize the results. For example, I briefly traced a system with tcpdump, collecting 741 packets. When processed with tcp-reduce, the trace revealed 58 TCP connections. Here is an example in which the results were passed to tcp-summary:
bsd1# tcp-reduce out-file | tcp-summary
This example produced the following five-line summary:
proto    # conn   KBytes   % SF   % loc   % ngh
-----    ------   ------   ----   -----   -----
www          56       35     25       0       0
telnet        1        1    100       0       0
pop-3         1        0    100       0       0
This clearly shows that HTTP traffic, which accounts for 56 of the 58 connections, dominated the local network traffic.
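The kind of per-protocol roll-up shown above can be sketched in Python. This is only an illustration of the aggregation idea. The input format (one protocol name and byte count per connection), the function name summarize, and the truncation to whole kilobytes are assumptions, not the actual tcp-reduce or tcp-summary formats.

```python
from collections import defaultdict


def summarize(connections):
    """Aggregate (protocol, bytes) connection records into per-protocol rows."""
    counts = defaultdict(int)
    total_bytes = defaultdict(int)
    for proto, nbytes in connections:
        counts[proto] += 1
        total_bytes[proto] += nbytes
    # Busiest protocols first; ties keep their first-seen order.
    ordered = sorted(counts, key=lambda proto: counts[proto], reverse=True)
    return [(proto, counts[proto], total_bytes[proto] // 1024) for proto in ordered]


# Hypothetical one-line-per-connection input.
records = [("www", 2048), ("telnet", 1500), ("www", 1024)]
for proto, conns, kbytes in summarize(records):
    print(f"{proto:<8}{conns:>6}{kbytes:>8}")
```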
The program tcpshow decodes a tcpdump trace file. It represents an alternative to using tcpdump to decode data. The primary advantage of tcpshow is much nicer formatting for output. For example, here is the tcpdump output for a packet:
12:36:54.772066 sloan.lander.edu.1174 > 205.153.63.238.telnet: . ack 3259091394 win 8647 (DF)
Here is corresponding output from tcpshow for the same packet:
-----------------------------------------------------------------------
Packet 1
TIME:   12:36:54.772066
LINK:   00:10:5A:A1:E9:08 -> 00:10:5A:E3:37:0C type=IP
  IP:   sloan -> 205.153.63.238 hlen=20 TOS=00 dgramlen=40 id=B30C
        MF/DF=0/1 frag=0 TTL=128 proto=TCP cksum=2D84
 TCP:   port 1174 -> telnet seq=0016775603 ack=3259091394
        hlen=20 (data=0) UAPRSF=010000 wnd=8647 cksum=E869 urg=0
DATA:   <No data>
-----------------------------------------------------------------------
The syntax is:
bsd1# tcpshow < trace-file
There are numerous options.
The program tcpslice is a simple but useful program for extracting pieces of tcpdump files or merging several of them. It is particularly handy for managing larger tcpdump files. You specify a starting time and optionally an ending time for a file, and it extracts the corresponding records from the source file. If multiple files are specified, it extracts packets from the first file and then continues extracting only those packets from the next file that have a later timestamp. This prevents duplicate packets if you have overlapping trace files.
While there are a few options, the basic syntax is quite simple. For example, consider the command:
bsd1# tcpslice 934224220.0000 in-file > out-file
This will extract all packets with timestamps after 934224220.0000. Note the use of an unformatted timestamp. This is the same format displayed with the -tt option with tcpdump. Note also the use of redirection. Because it works with binary files, tcpslice will not allow you to send output to your terminal. See the manpage for additional options.
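Two points from this description lend themselves to a short Python sketch: what an unformatted timestamp means, and how overlapping files can be merged without duplicates. The sketch is not based on the tcpslice source. The function names, the in-memory (timestamp, payload) records, and the decision to treat the start and end times as inclusive are assumptions.

```python
from datetime import datetime, timezone


def to_utc(timestamp):
    """Convert an unformatted (seconds since the epoch) timestamp to UTC."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc)


def slice_traces(traces, start, end=None):
    """Keep packets in [start, end]; from later files keep only newer packets."""
    selected = []
    cutoff = None  # newest timestamp taken from the files processed so far
    for trace in traces:
        newest = cutoff
        for ts, payload in trace:
            if ts < start or (end is not None and ts > end):
                continue  # outside the requested time window
            if cutoff is not None and ts <= cutoff:
                continue  # already covered by an earlier file
            selected.append((ts, payload))
            if newest is None or ts > newest:
                newest = ts
        cutoff = newest
    return selected


print(to_utc(934224220.0).isoformat())

first = [(100.0, "a"), (200.0, "b"), (300.0, "c")]
second = [(250.0, "x"), (300.0, "c"), (400.0, "d")]
print(slice_traces([first, second], start=150.0))
```

In the second trace, the packets stamped 250.0 and 300.0 are skipped because the first trace already extends to 300.0, which mirrors the duplicate-avoidance rule described above.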
The program tcptrace is an extremely powerful tcpdump file analysis tool. It is strictly an analysis tool, not a capture program, but it works with a variety of capture file formats. The tool's primary focus is the analysis of TCP connections. As such, it is more of a network management tool than a packet analysis tool. The program provides several levels of output or analysis ranging from very brief to very detailed.
While for most purposes tcptrace is used as a command-line tool, tcptrace is capable of producing several types of output files for plotting with the X Window program xplot. These include time sequence graphs, throughput graphs, and graphs of round-trip times. Time sequence graphs (-S option) are plots of sequence numbers over time that give a picture of the activity on the network. Throughput graphs (-T option), as the name implies, plot throughput in bytes per second against time. While throughput gives a picture of the volume of traffic on the network, round-trip times give a better picture of the delays seen by individual connections. Round-trip time plots (-R option) display individual round-trip times over time. For other graphs and graphing options, consult the documentation.
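As a rough picture of what a throughput graph plots, the following Python sketch groups packets into fixed intervals and reports bytes per second for each one. It does not reproduce how tcptrace computes its throughput plots. The function name, the one-second default interval, and the (timestamp, bytes) input records are assumptions.

```python
from collections import defaultdict


def throughput_series(packets, interval=1.0):
    """Return (interval start, bytes per second) pairs for (timestamp, bytes) records."""
    if not packets:
        return []
    start = min(ts for ts, _ in packets)
    bins = defaultdict(int)
    for ts, nbytes in packets:
        bins[int((ts - start) // interval)] += nbytes
    # Include empty intervals so idle periods show up as zero throughput.
    return [
        (start + index * interval, bins.get(index, 0) / interval)
        for index in range(max(bins) + 1)
    ]


packets = [(10.0, 100), (10.5, 50), (12.2, 30)]
for when, rate in throughput_series(packets):
    print(when, rate)
```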
For normal text-based operations, there is an overwhelming number of options and possibilities. One of the most useful is the -l option. This produces a long listing of summary statistics on a connection-by-connection basis. What follows is an example of the information provided for a single brief Telnet connection:
TCP connection 2:
        host c:        sloan.lander.edu:1230
        host d:        188.8.131.52:23
        complete conn: yes
        first packet:  Wed Aug 11 11:23:25.151274 1999
        last packet:   Wed Aug 11 11:23:53.638124 1999
        elapsed time:  0:00:28.486850
        total packets: 160
        filename:      telnet.trace
   c->d:                                d->c:
     total packets:      96             total packets:      64
     ack pkts sent:      95             ack pkts sent:      64
     pure acks sent:     39             pure acks sent:     10
     unique bytes sent:  119            unique bytes sent:  1197
     actual data pkts:   55             actual data pkts:   52
     actual data bytes:  119            actual data bytes:  1197
     rexmt data pkts:    0              rexmt data pkts:    0
     rexmt data bytes:   0              rexmt data bytes:   0
     outoforder pkts:    0              outoforder pkts:    0
     pushed data pkts:   55             pushed data pkts:   52
     SYN/FIN pkts sent:  1/1            SYN/FIN pkts sent:  1/1
     mss requested:      1460 bytes     mss requested:      1460 bytes
     max segm size:      15 bytes       max segm size:      959 bytes
     min segm size:      1 bytes        min segm size:      1 bytes
     avg segm size:      2 bytes        avg segm size:      23 bytes
     max win adv:        8760 bytes     max win adv:        17520 bytes
     min win adv:        7563 bytes     min win adv:        17505 bytes
     zero win adv:       0 times        zero win adv:       0 times
     avg win adv:        7953 bytes     avg win adv:        17519 bytes
     initial window:     15 bytes       initial window:     3 bytes
     initial window:     1 pkts         initial window:     1 pkts
     ttl stream length:  119 bytes      ttl stream length:  1197 bytes
     missed data:        0 bytes        missed data:        0 bytes
     truncated data:     1 bytes        truncated data:     1013 bytes
     truncated packets:  1 pkts         truncated packets:  7 pkts
     data xmit time:     28.479 secs    data xmit time:     27.446 secs
     idletime max:       6508.6 ms      idletime max:       6709.0 ms
     throughput:         4 Bps          throughput:         42 Bps
This was produced by using tcpdump to capture all traffic into the file telnet.trace and then executing tcptrace to process the data. Here is the syntax required to produce this output:
bsd1# tcptrace -l telnet.trace
Similar output is produced for each TCP connection recorded in the trace file. Obviously, a protocol (like HTTP) that uses many different sessions may overwhelm you with output.
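The throughput figures in the listing above appear consistent with dividing the unique bytes sent by the elapsed time of the connection. The following Python sketch checks that arithmetic. The function names are assumptions, and the formula is inferred from the numbers in the listing rather than taken from the tcptrace documentation.

```python
def elapsed_seconds(text):
    """Convert an elapsed time such as '0:00:28.486850' to seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def throughput_bps(unique_bytes, elapsed):
    """Bytes per second over the whole connection (assumed definition)."""
    return unique_bytes / elapsed_seconds(elapsed)


# Values copied from the listing: 119 bytes c->d and 1197 bytes d->c.
print(int(throughput_bps(119, "0:00:28.486850")))
print(int(throughput_bps(1197, "0:00:28.486850")))
```

Dropping the fractional part gives 4 and 42, matching the throughput lines reported for the two directions.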
There is a lot more to this program than covered in this brief discussion. If your primary goal is analysis of network performance and related problems rather than individual packet analysis, this is a very useful tool.
The program trafshow is a packet capture program of a different sort. It provides a continuous display of traffic over the network, giving repeated snapshots of traffic. It displays the source address, destination address, protocol, and number of bytes. This program would be most useful in looking for suspicious traffic or just getting a general idea of network traffic.
While trafshow can be run on a text-based terminal, it effectively takes over the display. It is best used in a separate window of a windowing system. There are a number of options, including support for packet filtering using the same filter format as tcpdump.
The xplot program is an X Window System plotting program. While it is a general-purpose plotting program, it was written by Tim Shepard as part of a thesis project on TCP analysis supervised by David Clark. As a result, some support for plotting TCP data (oriented toward network analysis) is included with the package. It is also used by tcptrace. While xplot is powerful and useful, it is not for the faint of heart. Because documentation is sparse, the program is easiest to use with tcptrace rather than as a standalone program.
1.5.10. Other Packet Capture Programs
We have discussed tcpdump in detail because it is the most widely available packet capture program for Unix. Many implementations of Unix have proprietary packet capture programs that are comparable to tcpdump. For example, Sun Microsystems' Solaris provides snoop. (This is a replacement for etherfind, which was supplied with earlier versions of the Sun operating system.)
Here is an example of using snoop to capture five packets:
sol1> snoop -c5
Using device /dev/elxl (promiscuous mode)
172.16.2.210 -> sol1         TELNET C port=28863
        sol1 -> 172.16.2.210 TELNET R port=28863 /dev/elxl (promiscuo
172.16.2.210 -> sol1         TELNET C port=28863
172.16.2.210 -> sloan.lander.edu TCP D=1071 S=22 Ack=143990 Seq=3737542069 Len=60 Win=17520
sloan.lander.edu -> 172.16.2.210 TCP D=22 S=1071 Ack=3737542129 Seq=143990 Len=0 Win=7908
snoop: 5 packets captured
As you can see, it is used pretty much the same way as tcpdump. (Actually, the output has a slightly more readable format.) snoop, like tcpdump, supports a wide range of options and filters. You should have no trouble learning snoop if you have ever used tcpdump.
The next segment from Network Troubleshooting Tools will cover packet analyzers.