IPAGGCREATE(1)
ipaggcreate - produce aggregate statistics of network traffic or trace
ipaggcreate [-r | -i | --netflow-summary] [--src,
--dst, --sport, --dport, ...] [other options]
[files or interfaces]
The ipaggcreate program reads IP packets from one or more data sources,
maps each packet to a label (such as "source address 192.4.10.9" or
"length 10"), and outputs a simply-formatted "aggregate"
file reporting the number of packets or bytes observed per label. The
resulting file is easy to process with text-based tools. (But see the
--binary option, which generates a compressed, quick-to-process binary
file.)
Here are a couple lines of ipaggcreate output, from
`ipaggcreate -s /home/kohler/largedump.gz':
!IPAggregate 1.0
!creator "src/ipaggcreate -s /home/kohler/largedump.gz"
!counts packets
!times 976937726.638704 977337361.804592 399635.165888
!num_nonzero 1437
!ip
4.2.49.2 1
4.2.49.4 1
4.17.143.9 1
4.21.203.29 104
The `-s' option, which is equivalent to `--src',
tells ipaggcreate to categorize each packet by its source IP
address. `/home/kohler/largedump.gz' is a compressed
tcpdump(1) file. Each data line represents a label; the first field
is the label number (here, an IP source address), and the second field the
number of packets that had that label. Labels with 0 counts are not
reported.
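Because the text format is line-oriented, it can be read with a few lines of script. The following Python sketch is not part of ipaggcreate; the function name parse_aggregate is hypothetical, and it assumes only what the sample above shows: header lines start with "!", and each data line holds a label and a count separated by whitespace. Header values are kept as uninterpreted strings.

```python
def parse_aggregate(text):
    """Split ipaggcreate text output into header lines and per-label counts."""
    headers = {}
    counts = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("!"):
            # Header line such as "!counts packets"; the value stays a string.
            name, _, value = line[1:].partition(" ")
            headers[name] = value
        else:
            # Data line: label, whitespace, count (assumed from the sample).
            label, count = line.split()
            counts[label] = int(count)
    return headers, counts


sample = """!IPAggregate 1.0
!counts packets
!num_nonzero 1437
!ip
4.2.49.2 1
4.21.203.29 104
"""
headers, counts = parse_aggregate(sample)
print(headers["counts"], counts["4.21.203.29"])  # packets 104
```

Labels are kept as strings so the same function works whether labels are printed as addresses or as plain numbers.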
Data source options tell ipaggcreate what kind of data source to use:
tcpdump(1) raw-packet files (--tcpdump), live network interfaces
(--interface), NetFlow summary files (--netflow-summary),
ipsumdump output files (--ipsumdump), DAG or NLANR-formatted
files (--dag, --nlanr), or others.
Non-option arguments specify the files, or interfaces, to read.
For example, `ipaggcreate -r eth0 eth1' will read two
tcpdump(1) files, named "eth0" and "eth1";
`ipaggcreate -i eth0 eth1' will read from two live network
interfaces, "eth0" and "eth1".
Options that read files read from the standard input when you
supply a single dash "-" as a filename, or
when you give no filenames at all.
- --tcpdump, -r
- Read from one or more files produced by tcpdump(1)'s -w
option (also known as "pcap files"). Stop when all the files are
exhausted. This is the default. Files (except for standard input) may be
compressed by gzip(1) or bzip2(1); ipaggcreate will
uncompress them on the fly.
- --interface, -i
- Read from live network interfaces. When run this way, ipaggcreate
will continue until interrupted with SIGINT or SIGHUP. When stopped,
ipaggcreate appends a comment to its output file, indicating how many
packets were dropped by the kernel before output.
- --ipsumdump
- Read from one or more ipsumdump files. Any packet characteristics
not specified by the input files are set to 0.
- --format=format
- Read from one or more ipsumdump files, using the specified default
format. The format should be a space-separated list of content
types; see ToIPSummaryDump(n) for a list.
- --dag[=encap]
- Read from one or more DAG-formatted trace files. For new-style ERF dumps,
which contain encapsulation type information, just say --dag. For
old-style dumps, you must supply the right encap argument:
"ATM" for ATM RFC-1483 encapsulation
(the most common), "ETHER" for Ethernet,
"PPP" for PPP,
"IP" for raw IP,
"HDLC" for Cisco HDLC,
"PPP_HDLC" for PPP HDLC, or
"SUNATM" for Sun ATM. See
<http://dag.cs.waikato.ac.nz/>.
- --nlanr
- Read from one or more NLANR-formatted trace files (fr, fr+, or tsh
format). See <http://pma.nlanr.net/Traces/>.
- --ip-addresses
- Read files containing IP addresses, one address per line. The label must
be either --src or --dst.
- --tu-summary
- Read TCP/UDP summary files. Each line represents one packet, and carries
the following information: timestamp, source address, source port,
destination address, destination port, protocol, payload length. For
example:
976937735.345744 18.26.4.9 22 64.55.139.202 26876 T 0
976937770.197008 128.10.5.110 63749 64.55.139.202 113 T 5
- --bro-conn-summary
- Read Bro connection summary files. Each line represents one connection
attempt, and carries the following information: timestamp, source address,
destination address, direction (inbound/outbound).
- --netflow-summary
- Read from one or more NetFlow summary files. These are line-oriented ASCII
files; blank lines, and lines starting with '!' or '#', are ignored. Other
lines should contain 15 or more fields separated by vertical bars '|'.
Ipaggcreate pays attention to some of these fields:
Field  Meaning                       Example
-----  ----------------------------  ----------
0      Source IP address             192.4.1.32
1      Destination IP address        18.26.4.44
5      Packet count in flow          5
6      Byte count in flow            10932
7      Flow timestamp (UNIX-style)   998006995
8      Flow end timestamp            998006999
9      Source port                   3917
10     Destination port              80
12     TCP flags (OR of all pkts)    18
13     IP protocol                   6
14     IP TOS bits                   0
- --tcpdump-text
- Read from one or more files containing tcpdump(1) textual output.
It's much better to use the binary files produced by 'tcpdump -w',
but if someone threw those away and all you have is the ASCII output, you
can still make do. Only works with tcpdump versions 3.7 and earlier.
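The --netflow-summary field layout described above can be illustrated with a short Python sketch. It is not ipaggcreate's own parser: the names NETFLOW_FIELDS and parse_netflow_summary are hypothetical, the sample line is assembled from the example column of the table with zeros in the unlisted fields, and the sketch assumes every listed field other than the two addresses is an integer and that short lines are simply skipped.

```python
# Field index -> name, following the table in the --netflow-summary entry.
NETFLOW_FIELDS = {
    0: "src", 1: "dst", 5: "packets", 6: "bytes", 7: "start", 8: "end",
    9: "sport", 10: "dport", 12: "tcp_flags", 13: "proto", 14: "tos",
}


def parse_netflow_summary(lines):
    """Return one dict per flow record, ignoring blank and comment lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line or line[0] in "!#":
            continue  # blank lines and lines starting with '!' or '#'
        fields = line.split("|")
        if len(fields) < 15:
            continue  # assumption: lines with too few fields are skipped
        rec = {}
        for idx, name in NETFLOW_FIELDS.items():
            value = fields[idx].strip()
            rec[name] = value if name in ("src", "dst") else int(value)
        records.append(rec)
    return records


# Illustrative line built from the table's example values.
sample = [
    "# comment",
    "",
    "192.4.1.32|18.26.4.44|0|0|0|5|10932|998006995|998006999|3917|80|0|18|6|0",
]
records = parse_netflow_summary(sample)
print(records[0]["src"], records[0]["packets"], records[0]["bytes"])
```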
These options determine how packets are labeled; you can supply at most one.
- --src, -s
- Label by IP source address; all packets with the same source address form
an aggregate.
- --dst, -d
- Label by IP destination address. This is the default.
- --length, -l
- Label by IP length.
- --ip field
- Label by the named IP field. Examples include "ip
src" (equivalent to --src), "ip
ttl", "ip off",
"udp sport", and so forth. See
AggregateIP(1) for a full list.
- --flows
- Label by TCP or UDP flow, or, essentially, by end-to-end transport-level
connection. Two packets have the same label if and only if they are part
of the same TCP or UDP connection. Each flow is assigned its own label.
The label number is not meaningful; non-TCP/UDP packets are ignored.
- --unidirectional-flows
- Label by unidirectional TCP or UDP flow. Like --flows, except that
packets from a single connection heading in different directions are
assigned different labels.
- --address-pairs
- Label by address pair. Two packets have the same label if and only if they
involve the same pair of IP addresses. The label number is not
meaningful.
- --unidirectional-address-pairs
- Label by unidirectional address pair. Two packets have the same label if
and only if their source addresses match and their destination addresses
match.
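The difference between --flows and --unidirectional-flows can be sketched as a choice of grouping key. This Python example is only an illustration under stated assumptions, not how ipaggcreate assigns labels internally: flow_key is a hypothetical helper, and the bidirectional case is modeled by putting the two endpoints into a canonical order so that both directions of a connection map to the same key.

```python
from collections import Counter


def flow_key(src, sport, dst, dport, proto, unidirectional=False):
    """Return a grouping key for a packet's TCP/UDP flow."""
    a, b = (src, sport), (dst, dport)
    if not unidirectional and b < a:
        # Bidirectional: order the endpoints so direction does not matter.
        a, b = b, a
    return (proto, a, b)


packets = [
    ("18.26.4.9", 22, "64.55.139.202", 26876, "T"),
    ("64.55.139.202", 26876, "18.26.4.9", 22, "T"),   # reverse direction
    ("128.10.5.110", 63749, "64.55.139.202", 113, "T"),
]
bidir = Counter(flow_key(*p) for p in packets)
unidir = Counter(flow_key(*p, unidirectional=True) for p in packets)
print(len(bidir), len(unidir))  # 2 3
```

The same idea applies to --address-pairs and --unidirectional-address-pairs if the ports and protocol are dropped from the key.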
These options specify whether ipaggcreate should count packets or bytes.
- --packets
- Count packets: the output file will report the number of packets per
label. This is the default.
- --bytes, -B
- Count bytes: the output file will report the number of bytes per label.
This number includes IP and transport headers, but not any link
headers.
These options select portions of the trace file, and allow the user to split
trace data into multiple aggregate files.
- --time-offset=time, -T time
- Ignore the first time worth of packets in the input trace. If the
first packet has timestamp T, then all packets (including the first) with
timestamp less than T+time are ignored. The time argument
can be an absolute number of seconds (938.42), or
use suffixes such as "100s",
"12ms",
"1.5min",
"2hr", and so forth.
- --start-time=time
- Ignore packets with timestamps less than time.
- --interval=time, -t time
- Stop after recording aggregate information for time worth of
packets. That is, if the first recorded packet has timestamp T, then
ipaggcreate will exit just before the first packet with timestamp
T+time, or the end of the trace, whichever comes first.
- --limit-labels=count
- Stop after recording information for count distinct labels. That
is, exit just before encountering a packet with the (count+1)th
distinct label, or at the end of the trace, whichever comes first.
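The --time-offset and --interval rules above can be modeled in a few lines of Python. This is a sketch under assumptions, not ipaggcreate's code: parse_time and select are hypothetical names, the unit table covers only the four suffixes quoted above (the real program may accept more), and the timestamps are assumed to be sorted.

```python
import re

# Multipliers for the suffixes mentioned in the text; others are not modeled.
UNITS = {"": 1.0, "s": 1.0, "ms": 0.001, "min": 60.0, "hr": 3600.0}


def parse_time(text):
    """Convert strings like "938.42", "100s", or "1.5min" to seconds."""
    m = re.fullmatch(r"([0-9]*\.?[0-9]+)([a-z]*)", text.strip())
    if m is None or m.group(2) not in UNITS:
        raise ValueError("unrecognized time: %r" % text)
    return float(m.group(1)) * UNITS[m.group(2)]


def select(timestamps, offset=0.0, interval=None):
    """Apply --time-offset, then --interval, to sorted packet timestamps."""
    if not timestamps:
        return []
    start = timestamps[0] + offset
    kept = [t for t in timestamps if t >= start]   # drop packets before T+offset
    if interval is not None and kept:
        end = kept[0] + interval                   # T is the first recorded packet
        kept = [t for t in kept if t < end]
    return kept


print(parse_time("1.5min"), parse_time("2hr"))        # 90.0 7200.0
print(select([0, 1, 2, 5, 9], offset=1, interval=5))  # [1, 2, 5]
```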
The four --split options generate multiple aggregate output
files based on characteristics of the input. To use --split, you must
supply an explicit --output filename containing a
"%d"-style template; a file number is
plugged in to that template. For example, the template
"file%03d.txt" will generate files
"file001.txt",
"file002.txt", and so forth.
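The template expansion behaves like printf-style formatting, which Python's % operator also follows for "%03d". The sketch below only illustrates the naming pattern; split_filenames is a hypothetical helper, and starting the numbering at 1 is inferred from the example above rather than confirmed.

```python
def split_filenames(template, count, first=1):
    """Expand a "%d"-style template into the first `count` output names."""
    return [template % n for n in range(first, first + count)]


print(split_filenames("file%03d.txt", 3))
# ['file001.txt', 'file002.txt', 'file003.txt']
```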
- --split-time=time
- Start a new output file every time period. That is, each file will
contain data for at most time worth of packets.
- --split-labels=count
- Start a new output file every count distinct labels. That is, each
file will contain at most count different labels.
- --split-packets=count
- Start a new output file every count packets.
- --split-bytes=count
- Start a new output file every count bytes.
- --output=file, -o file
- Write the aggregate output to file instead of to the standard
output.
- --binary, -b
- Write the aggregate output in binary format. See below for more
information.
- --write-tcpdump=file, -w file
- Write processed packets to a tcpdump(1) file -- or to the
standard output, if file is a single dash
"-" -- in addition to the usual summary
output.
- --filter=filter, -f filter
- Only include packets and flows matching a tcpdump(1) filter. For
example, `ipaggcreate -f "tcp && src net 18/8"'
will aggregate data only for TCP packets from net 18. (The syntax for
filter is currently a subset of tcpdump's syntax.)
- --anonymize, -A
- Anonymize IP addresses in the output. The anonymization preserves prefix
and class. This means, first, that two anonymized addresses will share the
same prefix when their non-anonymized counterparts share the same prefix;
and second, that anonymized addresses will be in the same class (A, B, C,
or D) as their non-anonymized counterparts. The anonymization algorithm
comes from tcpdpriv(1); it works like `tcpdpriv -A50 -C4'.
If --anonymize and --write-tcpdump are both on,
the tcpdump output file will have anonymized IP addresses.
However, the file will contain actual packet data, unlike
tcpdpriv output.
- --no-promiscuous
- Do not place interfaces into promiscuous mode. Promiscuous mode is the
default.
- --sample=p
- Sample packets with probability p. That is, p is the chance
that a packet will cause output to be generated. The actual probability
may differ from the specified probability, due to fixed point arithmetic;
check the output for a
`"!sampling_prob"' comment to see the
real probability. Strictly speaking, this option samples records, not
packets, so for NetFlow summaries without --multipacket, it will
sample flows.
- --multipacket
- Supply this option if you are reading NetFlow or IP summaries -- files
where each record might represent multiple packets -- and you would like
the output summary to have one line per packet, instead of the default one
line per record. See also --packet-count, above.
- --collate
- Sort output packets by increasing timestamp. Use this option when reading
from multiple tcpdump(1) files to ensure that the output has sorted
timestamps. Combine --collate with --write-tcpdump to
collate overlapping tcpdump(1) files into a single, sorted
tcpdump(1) file.
- --random-seed=seed
- Set the random seed deterministically to seed, an unsigned integer.
By default, the random seed is initialized to a random value using
/dev/random, if it exists, combined with other data. The random
seed indirectly determines which packets are sampled, and the values of
anonymized IP addresses.
- --quiet, -q
- Do not print a progress bar to standard error. This is the default when
ipaggcreate isn't running interactively.
- --config
- Do not produce a summary. Instead, write the Click configuration that
ipaggcreate would run to the standard output.
- --verbose, -V
- Produce more verbose error messages.
- --help, -h
- Print a help message to the standard output, then exit.
- --version, -v
- Print version number and license information to the standard output, then
exit.
When killed with SIGTERM or SIGINT, ipaggcreate will exit cleanly (and
generate an output file). If you want it to flush its buffers without exiting,
kill it with SIGHUP.
Binary ipaggcreate files begin with several ASCII lines, just like regular
ipaggcreate files. A line `"!packed_be"' or
`"!packed_le"' indicates that the rest of
the file, starting immediately after the newline, consists of binary records
(in big-endian or little-endian order, respectively). Each record is 8 bytes
long, and looks like this:
+---------------+---------------+
| label | count |
+---------------+---------------+
<---4 bytes---> <---4 bytes--->
The initial word of data contains the label number, the second the
count.
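The binary layout above can be decoded with Python's struct module. This is a minimal sketch, not a reference reader: parse_binary_aggregate is a hypothetical name, and it assumes that both 4-byte words are unsigned integers and that any trailing partial record can be ignored.

```python
import struct


def parse_binary_aggregate(data):
    """Return (header_lines, [(label, count), ...]) from binary output bytes."""
    headers = []
    pos = 0
    while pos < len(data):
        end = data.index(b"\n", pos)  # raises ValueError if no newline remains
        line = data[pos:end]
        pos = end + 1
        if line in (b"!packed_be", b"!packed_le"):
            # Records start immediately after this line's newline.
            order = ">" if line == b"!packed_be" else "<"
            body = data[pos:]
            usable = len(body) - len(body) % 8  # assumption: drop a partial tail
            records = list(struct.iter_unpack(order + "II", body[:usable]))
            return headers, records
        headers.append(line.decode("ascii"))
    raise ValueError("no !packed_be or !packed_le line found")


# Synthetic input built for illustration, not real ipaggcreate output.
blob = (b"!IPAggregate 1.0\n!packed_be\n"
        + struct.pack(">II", 7, 104) + struct.pack(">II", 9, 1))
headers, records = parse_binary_aggregate(blob)
print(headers, records)  # ['!IPAggregate 1.0'] [(7, 104), (9, 1)]
```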
The ipaggcreate program uses the Click modular router, an extensible
system for processing packets. Click routers consist of C++ components called
elements. While some elements run only in a Linux kernel, most can run either
in the kernel or in user space, and there are user-level elements for reading
packets from libpcap or from tcpdump files.
Ipaggcreate creates and runs a user-level Click
configuration. However, you don't need to install Click to run
ipaggcreate; the libclick directory contains all the relevant
parts of Click, bundled into a library.
If you're curious, try running `ipaggcreate --config' with
some other options to see the Click configuration ipaggcreate would
run.
This is, I think, a pleasant way to write a packet processor!
tcpdump(1), tcpdpriv(1), click(1), ipsumdump(1)
See http://www.pdos.csail.mit.edu/click/ for more on Click.
Eddie Kohler <kohler@cs.ucla.edu>, based on the Click modular router.
Anonymization algorithm from tcpdpriv(1) by Greg
Minshall.