rwaddrcount(1)    SiLK Tool Suite    rwaddrcount(1)
rwaddrcount - count activity by IPv4 address
rwaddrcount {--print-recs | --print-ips | --print-stat}
[--use-dest] [--min-bytes=BYTEMIN] [--max-bytes=BYTEMAX]
[--min-records=RECMIN] [--max-records=RECMAX]
[--min-packets=PACKMIN] [--max-packets=PACKMAX]
[--set-file=PATHNAME] [--sort-ips] [--timestamp-format=FORMAT]
[--ip-format=FORMAT] [--integer-ips] [--zero-pad-ips]
[--no-titles] [--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [{--delimited | --delimited=CHAR}]
[--print-filenames] [--copy-input=PATH] [--output-path=PATH]
[--pager=PAGER_PROG] [--site-config-file=FILENAME]
[{--legacy-timestamps | --legacy-timestamps=NUM}]
{[--xargs] | [--xargs=FILENAME] | [FILE [FILE ...]]}
rwaddrcount --help
rwaddrcount --version
rwaddrcount reads SiLK Flow records, sums the byte-, packet-, and
record-counts on those records by individual source or destination IP address
and maintains the time window during which that IP address was active. At the
end of the count operation, the results per IP address are displayed when the
--print-recs switch is given. rwaddrcount includes facilities
for displaying only those IP addresses whose byte-, packet-, or flow-counts are
between specified minima and maxima.
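The binning described above can be approximated with awk on plain text. This is only a sketch of the idea, not how rwaddrcount is implemented: the input is a hypothetical pipe-delimited sample with the columns sIP|bytes|packets|sTime|eTime (times as epoch seconds), whereas rwaddrcount itself reads binary SiLK Flow records. The function name bin_by_address is made up for illustration.

```shell
# Sum bytes, packets, and records per address and track the activity window.
# Input columns (hypothetical text format): sIP|bytes|packets|sTime|eTime
bin_by_address() {
    awk -F'|' '
        {
            bytes[$1] += $2
            pkts[$1]  += $3
            recs[$1]++
            # Keep the earliest start time and the latest end time per address.
            if (!($1 in first) || ($4 + 0) < first[$1]) first[$1] = $4 + 0
            if (!($1 in last)  || ($5 + 0) > last[$1])  last[$1]  = $5 + 0
        }
        END {
            for (ip in recs)
                printf "%s|%d|%d|%d|%d|%d|\n", ip, bytes[ip], pkts[ip], recs[ip], first[ip], last[ip]
        }' | sort
}

sample='10.0.0.1|100|2|1000|1010
10.0.0.2|40|1|1005|1005
10.0.0.1|60|1|990|1020'

printf '%s\n' "$sample" | bin_by_address
```

Each output row corresponds to one bin: the address, the three sums, and the first and last times at which the address was seen.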
rwaddrcount does not support IPv6 addresses. To generate
output for IPv6 records, use the rwuniq(1) tool:
rwuniq --fields=sip --values=bytes,packets,records,stime,etime
rwaddrcount reads SiLK Flow records from the files named on
the command line or from the standard input when no file names are specified
and --xargs is not present. To read the standard input in addition to
the named files, use "-" or
"stdin" as a file name. If an input file
name ends in ".gz", the file is
uncompressed as it is read. When the --xargs switch is provided,
rwaddrcount reads the names of the files to process from the named
text file or from the standard input if no file name argument is provided to
the switch. The input to --xargs must contain one file name per
line.
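As a minimal sketch of the --xargs input convention, the following builds a list with one file name per line and passes it to rwaddrcount in both supported ways. The paths under /tmp are hypothetical, and the tool is only invoked when it is installed and the first input is readable.

```shell
# Build a list of input files, one name per line (hypothetical paths).
list=$(mktemp)
printf '%s\n' /tmp/flows-00.rw /tmp/flows-01.rw > "$list"

# Run only when rwaddrcount is available and the sample input exists.
if command -v rwaddrcount >/dev/null 2>&1 && [ -r /tmp/flows-00.rw ]; then
    # Name the list file as the argument to the switch ...
    rwaddrcount --print-recs --xargs="$list"
    # ... or give the switch no argument and supply the names on stdin.
    rwaddrcount --print-recs --xargs < "$list"
fi
```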
Option names may be abbreviated if the abbreviation is unique or is an exact
match for an option. A parameter to an option may be specified as
--arg=param or --arg param, though the
first form is required for options that take optional parameters.
For the application to operate, one of the three --print-* switches
(--print-recs, --print-ips, or --print-stat) must be given.
- --print-recs
- Print one row for each bin that meets the minima/maxima criteria. Each bin
contains the IP address, number of bytes, number of packets, number of
flow records, earliest start time, and latest end time.
- --print-ips
- Print a single column containing the IP addresses for each bin that meets
the minima/maxima criteria.
- --print-stat
- Print a one or two line summary (plus a title line) that summarizes the
bins. The first line is a summary across all bins, and it contains
the number of unique IP addresses and the sums of the bytes, packets, and
flow records. The second line is printed only when one or more minima or
maxima are specified. This second line contains the same columns as the first,
and its values are the sums across those bins that meet the criteria.
- --use-dest
- Count by destination IP address in the filter record rather than source
IP.
- --min-bytes=BYTEMIN
- Filtering criterion; for the final output (stats or printing), only
include count records where the total number of bytes exceeds BYTEMIN.
- --min-packets=PACKMIN
- Filtering criterion; for the final output (stats or printing), only
include count records where the total number of packets exceeds
PACKMIN.
- --min-records=RECMIN
- Filtering criterion; for the final output (stats or printing), only
include count records where the total number of filter records
contributing to that count record exceeds RECMIN.
- --max-bytes=BYTEMAX
- Filtering criterion; for the final output (stats or printing), only
include count records where the total number of bytes is less than
BYTEMAX.
- --max-packets=PACKMAX
- Filtering criterion; for the final output (stats or printing), only
include count records where the total number of packets is less than
PACKMAX.
- --max-records=RECMAX
- Filtering criterion; for the final output (stats or printing), only
include count records to which at most RECMAX filter records
contributed.
- --set-file=PATHNAME
- Write the IPs into the rwset(1)-style binary IP-set
file named PATHNAME. Use rwsetcat(1) to see the
contents of this file.
- --timestamp-format=FORMAT
- Specify the format and/or timezone to use when printing timestamps. When
this switch is not specified, the SILK_TIMESTAMP_FORMAT environment
variable is checked for a default format and/or timezone. If it is empty
or contains invalid values, timestamps are printed in the default format,
and the timezone is UTC unless SiLK was compiled with local timezone
support. FORMAT is a comma-separated list of a format and/or a
timezone. The format is one of:
- default
- Print the timestamps as
YYYY/MM/DDThh:mm:ss
- iso
- Print the timestamps as YYYY-MM-DD
hh:mm:ss
- m/d/y
- Print the timestamps as MM/DD/YYYY
hh:mm:ss
- epoch
- Print the timestamps as the number of seconds since 00:00:00 UTC on
1970-01-01.
When a timezone is specified, it is used regardless of the default
timezone support compiled into SiLK. The timezone is one of:
- utc
- Use Coordinated Universal Time to print timestamps.
- local
- Use the TZ environment variable or the local timezone.
- --ip-format=FORMAT
- For the --print-recs and --print-ips output formats, specify
how IP addresses are printed, where FORMAT is a comma-separated
list of the arguments described below. When this switch is not specified,
the SILK_IP_FORMAT environment variable is checked for a value and that
format is used if it is valid. The default FORMAT is
"canonical". Since SiLK
3.7.0.
- canonical
- Print IP addresses in the canonical format: dot-separated decimal for IPv4
(192.0.2.1).
- no-mixed
- Print IP addresses in the canonical format
(192.0.2.1). Prevent use of the mixed IPv4-IPv6
representation when "map-v4" is also
included in FORMAT. For example, use
"::ffff:c000:201" instead of
"::ffff:192.0.2.1". Since SiLK
3.17.0.
- decimal
- Print IP addresses as integers in decimal format. For example, print
192.0.2.1 and
"::ffff:192.0.2.1" as
3221225985 and
281473902969345, respectively.
- hexadecimal
- Print IP addresses as integers in hexadecimal format. For example, print
192.0.2.1 and
"::ffff:192.0.2.1" as
"c0000201" and
"ffffc0000201", respectively.
- zero-padded
- Make all IP address strings contain the same number of characters by
padding numbers with leading zeros. For example, print
192.0.2.1 as
192.000.002.001. For IPv6 addresses, this setting
implies "no-mixed", so that
"::ffff:192.0.2.1" is printed as
"0000:0000:0000:0000:0000:ffff:c000:0201".
As of SiLK 3.17.0, may be combined with any of the above, including
"decimal" and
"hexadecimal".
The following arguments modify certain IP addresses prior to
printing. These arguments may be combined with the above formats.
- map-v4
- Change addresses to IPv4-mapped IPv6 addresses (addresses in the
::ffff:0:0/96 netblock) prior to formatting. Since SiLK
3.17.0.
- unmap-v6
- Do nothing (rwaddrcount does not support IPv6 addresses as the
key). Since SiLK 3.17.0.
The following argument is also available:
- force-ipv6
- Set FORMAT to
"map-v4","no-mixed".
- --integer-ips
- Print IP addresses as integers. This switch is equivalent to
--ip-format=decimal. It is deprecated as of SiLK 3.7.0 and will
be removed in the SiLK 4.0 release.
- --zero-pad-ips
- Print IP addresses as fully-expanded, zero-padded values in the canonical
format. This switch is equivalent to --ip-format=zero-padded. It is
deprecated as of SiLK 3.7.0 and will be removed in the SiLK 4.0
release.
- --sort-ips
- For the --print-recs and --print-ips output formats, the
results are presented sorted by IP address.
- --no-titles
- Turn off column titles. By default, titles are printed.
- --no-columns
- Disable fixed-width columnar output.
- --column-separator=C
- Use the specified character between columns and after the final column. When
this switch is not specified, the default of '|' is used.
- --no-final-delimiter
- Do not print the column separator after the final column. Normally a
delimiter is printed.
- --delimited
- --delimited=C
- Run as if --no-columns --no-final-delimiter
--column-separator=C had been specified. That is, disable
fixed-width columnar output; if character C is provided, it is used
as the delimiter between columns instead of the default '|'.
- --print-filenames
- Print to the standard error the names of input files as they are
opened.
- --copy-input=PATH
- Copy all binary SiLK Flow records read as input to the specified file or
named pipe. PATH may be "stdout"
or "-" to write flows to the standard
output as long as the --output-path switch is specified to redirect
rwaddrcount's textual output to a different location.
- --output-path=PATH
- Write the textual output to PATH, where PATH is a filename,
a named pipe, the keyword "stderr" to
write the output to the standard error, or the keyword
"stdout" or
"-" to write the output to the standard
output (and bypass the paging program). If PATH names an existing
file, rwaddrcount exits with an error unless the SILK_CLOBBER
environment variable is set, in which case PATH is overwritten. If
this switch is not given, the output is either sent to the pager or
written to the standard output.
- --pager=PAGER_PROG
- When output is to a terminal, invoke the program PAGER_PROG to view
the output one screen full at a time. This switch overrides the SILK_PAGER
environment variable, which in turn overrides the PAGER variable. If the
--output-path switch is given or if the value of the pager is
determined to be the empty string, no paging is performed and all output
is written to the terminal.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When
this switch is not provided, rwaddrcount searches for the site
configuration file in the locations specified in the "FILES"
section.
- --legacy-timestamps
- --legacy-timestamps=NUM
- When NUM is not specified or is 1, this switch is equivalent to
--timestamp-format=m/d/y. Otherwise, the switch has no effect. This
switch is deprecated as of SiLK 3.0.0, and it will be removed in the SiLK
4.0 release.
- --xargs
- --xargs=FILENAME
- Read the names of the input files from FILENAME or from the
standard input if FILENAME is not provided. The input is expected
to have one filename per line. rwaddrcount opens each named file in
turn and reads records from it as if the filenames had been listed on the
command line.
- --help
- Print the available options and exit.
- --version
- Print the version number and information about how SiLK was configured,
then exit the application.
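The layouts accepted by --timestamp-format can be reproduced with date(1), which may help when comparing rwaddrcount output against other logs. This is a sketch under assumptions: fmt_epoch is a hypothetical helper, it tries the GNU/BusyBox form of date first and then the BSD form, and the strftime patterns are my reading of the layouts listed for that switch rather than anything taken from SiLK itself.

```shell
# Format epoch seconds in UTC. Tries GNU/BusyBox date first, then BSD date.
fmt_epoch() {
    date -u -d "@$1" "+$2" 2>/dev/null || date -u -r "$1" "+$2"
}

ts=1178733701                          # 2007-05-09 18:01:41 UTC
fmt_epoch "$ts" '%Y/%m/%dT%H:%M:%S'    # layout of "default"
fmt_epoch "$ts" '%Y-%m-%d %H:%M:%S'    # layout of "iso"
fmt_epoch "$ts" '%m/%d/%Y %H:%M:%S'    # layout of "m/d/y"
printf '%s\n' "$ts"                    # "epoch" is the raw second count
```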
The following switches are deprecated. They will be removed in SiLK 4.0.
- --byte-min=BYTEMIN
- Deprecated alias for --min-bytes.
- --packet-min=PACKMIN
- Deprecated alias for --min-packets.
- --rec-min=RECMIN
- Deprecated alias for --min-records.
- --byte-max=BYTEMAX
- Deprecated alias for --max-bytes.
- --packet-max=PACKMAX
- Deprecated alias for --max-packets.
- --rec-max=RECMAX
- Deprecated alias for --max-records.
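The decimal, hexadecimal, and zero-padded arguments to --ip-format amount to simple integer formatting of the address. The sketch below reproduces those conversions for IPv4 with shell arithmetic; the function names are hypothetical, the code assumes a bash-compatible shell with 64-bit arithmetic, and it does not validate its input.

```shell
# Convert a dotted-quad IPv4 address to its integer value.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Hexadecimal form of the same integer.
ip_to_hex() {
    printf '%x\n' "$(ip_to_int "$1")"
}

# Zero-padded dotted quad, three digits per octet.
ip_zero_pad() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    printf '%03d.%03d.%03d.%03d\n' "$1" "$2" "$3" "$4"
}

ip_to_int 192.0.2.1
ip_to_hex 192.0.2.1
ip_zero_pad 192.0.2.1
# Integer value of the IPv4-mapped IPv6 address ::ffff:192.0.2.1
echo $(( 0xffff00000000 + $(ip_to_int 192.0.2.1) ))
```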
In the following examples, the dollar sign
("$") represents the shell prompt. The text
after the dollar sign represents the command line. Lines have been wrapped for
improved readability, and the back slash
("\") is used to indicate a wrapped line.
To print a list of source IP addresses that appeared in exactly
one TCP record during the first 12 hours of 2003-Sep-01, use:
$ rwfilter --start-date=2003/09/01:00 --end-date=2003/09/01:11 \
--proto=6 --pass=stdout \
| rwaddrcount --max-records=1 --print-ips
In general, to print out record information, use
rwaddrcount with --print-recs:
$ rwfilter --start-date=2003/01/17:00 --end-date=2003/01/17:23 \
--proto=6 --pass=stdout \
| rwaddrcount --print-recs --no-titles | head -3
10.10.10.1| 65792| 147| 21| 2003/01/17T00:19:01| 2003/01/17T02:00:13|
10.10.10.2| 110744| 89| 7| 2003/01/17T01:21:42| 2003/01/17T01:39:21|
10.10.10.3| 864| 18| 6| 2003/01/17T00:20:33| 2003/01/17T01:25:38|
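When the record output feeds another program, the --delimited switch described under the options removes the column padding and the final delimiter, which makes the rows easy to split. The following is a sketch only: the sample text is an assumed rendering of --print-recs --delimited output built from the column titles shown in this page, and big_talkers is a hypothetical helper, not part of SiLK.

```shell
# Assumed shape of `rwaddrcount --print-recs --delimited` output.
delimited_sample='sIP|Bytes|Packets|Records|Start_Time|End_Time
10.0.0.144|1646|4|1|2007/05/09T18:01:41|2007/05/09T18:01:41
10.14.203.121|40|1|1|2007/05/09T18:31:54|2007/05/09T18:31:54
12.0.101.22|4365|23|2|2007/05/09T18:26:43|2007/05/09T18:43:46'

# Print the address and byte count of rows with at least $1 bytes.
big_talkers() {
    # NR > 1 skips the title line; $2 is the Bytes column.
    awk -F'|' -v min="$1" 'NR > 1 && ($2 + 0) >= min { print $1, $2 }'
}

printf '%s\n' "$delimited_sample" | big_talkers 1000
```

With a live data set, the printf line would be replaced by the rwaddrcount invocation itself.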
We note some overlapping features between rwaddrcount and
rwuniq(1). There is often more than one way to perform
the same task in the SiLK tool set.
Here's a guide to replacing each of the outputs of
rwaddrcount:
The --print-recs switch prints five pieces of information
for each source or destination address:
$ rwaddrcount --print-recs data.rw
sIP|Bytes|Packets|Records| Start_Time| End_Time|
10.0.0.144| 1646| 4| 1|2007/05/09T18:01:41|2007/05/09T18:01:41|
10.14.203.121| 40| 1| 1|2007/05/09T18:31:54|2007/05/09T18:31:54|
10.14.203.122| 40| 1| 1|2007/05/09T18:32:43|2007/05/09T18:32:43|
10.15.6.14| 539| 3| 3|2007/05/09T18:03:05|2007/05/09T18:08:07|
12.0.101.22| 4365| 23| 2|2007/05/09T18:26:43|2007/05/09T18:43:46|
To do the same in rwuniq, specify
"sip" in --fields and the
--values shown here:
$ rwuniq --fields=sip --values=bytes,packets,flows,stime,etime data.rw
sIP|Bytes|Packets|Records| min_sTime| max_eTime|
10.0.0.144| 1646| 4| 1|2007/05/09T18:01:41|2007/05/09T18:01:41|
10.14.203.121| 40| 1| 1|2007/05/09T18:31:54|2007/05/09T18:31:54|
10.14.203.122| 40| 1| 1|2007/05/09T18:32:43|2007/05/09T18:32:43|
10.15.6.14| 539| 3| 3|2007/05/09T18:03:05|2007/05/09T18:08:07|
12.0.101.22| 4365| 23| 2|2007/05/09T18:26:43|2007/05/09T18:43:46|
When rwaddrcount includes --use-dest, change the
--fields switch of rwuniq to
"dip". Replace the --sort-ips
switch of rwaddrcount with --sort-output in rwuniq.
The --print-stat switch in rwaddrcount prints a
one-line summary of the data:
$ rwaddrcount --print-stat data.rw
| sIP_Uniq| Bytes| Packets| Records|
Total| 57727| 948620676| 2026581| 382578|
This is difficult to produce with rwuniq. If there is a
field that you know is either empty or constant across all records (such as
"nhip" or
"in"), you can use that as the key field
in rwuniq.
$ rwuniq --fields=nhIP --values=distinct:sip,bytes,packets,flows data.rw
nhIP|sIP-Distinct| Bytes| Packets| Records|
0.0.0.0| 57727| 948620676| 2026581| 382578|
Note that "class" generally does
not work since each type within a class produces its own row:
$ rwuniq --fields=class --values=distinct:sip,bytes,packets,flows data.rw
class|sIP-Distinct| Bytes| Packets| Records|
all| 8674| 260143344| 964621| 151447|
all| 55540| 688477332| 1061960| 231131|
One trick is to use "stime" as
the key with a very large --bin-time:
$ rwuniq --fields=stime --bin-time=2147483647 \
--values=distinct:sip,bytes,packets,flows data.rw
sTime|sIP-Distinct| Bytes| Packets| Records|
1970/01/01T00:00:00| 57727| 948620676| 2026581| 382578|
Finally, you can use separate invocations of
rwfilter(1), rwset(1), and
rwsetcat(1):
$ rwfilter --print-volume --all=stdout data.rw \
| rwset --sip=stdout \
| rwsetcat --count-ips
| Recs| Packets| Bytes| Files|
Total| 382578| 2026581| 948620676| 1|
Pass| 382578| 2026581| 948620676| |
Fail| 0| 0| 0| |
57727
rwaddrcount's --print-ips switch prints the IP
addresses as text:
$ rwaddrcount --print-ips data.rw
sIP
10.0.0.144
10.14.203.121
10.14.203.122
10.15.6.14
12.0.101.22
A combination of rwset and rwsetcat is the best way
to handle this:
$ rwset --sip-file=stdout data.rw | rwsetcat --print-ips
10.0.0.144
10.14.203.121
10.14.203.122
10.15.6.14
12.0.101.22
Alternatively, use rwuniq and the UNIX tool
cut(1) to only print the first column:
$ rwuniq --fields=sIP data.rw \
| cut -d '|' -f 1
sIP
10.0.0.144
10.14.203.121
10.14.203.122
10.15.6.14
12.0.101.22
rwaddrcount allows you to restrict the output to bins that
have a certain minimum or maximum count of bytes, packets, or flows via
--min-bytes, --max-bytes, --min-packets,
--max-packets, --min-records, and --max-records:
$ rwaddrcount --print-recs --min-byte=1024 --max-byte=2048 \
--max-records=1 data.rw
sIP|Bytes|Packets|Records| Start_Time| End_Time|
10.0.0.144| 1646| 4| 1|2007/05/09T18:01:41|2007/05/09T18:01:41|
rwuniq supports the same operations using the
--bytes, --packets, and --flows switches, each of which
allows you to define a desired minimum and maximum value.
$ rwuniq --fields=sip --values=bytes,packets,records,stime,etime \
--bytes=1024-2048 --flows=1-1 data.rw
sIP|Bytes|Packets|Records| min_sTime| max_eTime|
10.0.0.144| 1646| 4| 1|2007/05/09T18:01:41|2007/05/09T18:01:41|
- SILK_IP_FORMAT
- This environment variable is used as the value for --ip-format when
that switch is not provided. Since SiLK 3.11.0.
- SILK_TIMESTAMP_FORMAT
- This environment variable is used as the value for
--timestamp-format when that switch is not provided. Since
SiLK 3.11.0.
- SILK_PAGER
- When set to a non-empty string, rwaddrcount automatically invokes
this program to display its output a screen at a time. If set to an empty
string, rwaddrcount does not automatically page its output.
- PAGER
- When set and SILK_PAGER is not set, rwaddrcount automatically
invokes this program to display its output a screen at a time.
- SILK_CLOBBER
- The SiLK tools normally refuse to overwrite existing files. Setting
SILK_CLOBBER to a non-empty value removes this restriction.
- SILK_CONFIG_FILE
- This environment variable is used as the value for the
--site-config-file when that switch is not provided.
- SILK_DATA_ROOTDIR
- This environment variable specifies the root directory of the data repository.
As described in the "FILES" section, rwaddrcount may use
this environment variable when searching for the SiLK site configuration
file.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files, rwaddrcount may use this
environment variable. See the "FILES" section for details.
- TZ
- When the argument to the --timestamp-format switch includes
"local" or when a SiLK installation is
built to use the local timezone, the value of the TZ environment variable
determines the timezone in which rwaddrcount displays timestamps.
(If both of those are false, the TZ environment variable is ignored.) If
the TZ environment variable is not set, the machine's default timezone is
used. Setting TZ to the empty string or 0 causes timestamps to be
displayed in UTC. For system information on the TZ variable, see
tzset(3) or environ(7). (To
determine if SiLK was built with support for the local timezone, check the
"Timezone support" value in the output
of rwaddrcount --version.)
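The pager precedence described for --pager, SILK_PAGER, and PAGER can be sketched in shell. This illustrates the documented order only and is not taken from the SiLK source; choose_pager is a hypothetical helper whose argument stands in for the value of --pager, with the literal string "unset" meaning the switch was not given.

```shell
# Print the pager that would be chosen; an empty line means "no paging".
choose_pager() {
    if [ "$1" != "unset" ]; then
        # The --pager switch overrides both environment variables.
        printf '%s\n' "$1"
    else
        # ${VAR-default} falls back only when VAR is unset, so an empty
        # SILK_PAGER still wins over PAGER and disables paging.
        printf '%s\n' "${SILK_PAGER-${PAGER-}}"
    fi
}

( unset SILK_PAGER; PAGER=less; choose_pager unset )   # PAGER is used
( SILK_PAGER=; PAGER=less; choose_pager unset )        # paging disabled
( SILK_PAGER=more; choose_pager "less -S" )            # switch wins
```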
- ${SILK_CONFIG_FILE}
- ${SILK_DATA_ROOTDIR}/silk.conf
- /data/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file which are checked
when the --site-config-file switch is not provided.
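To see which of the listed locations exists on a given host, the list can be walked with a short shell function. This is a sketch that assumes the locations are tried in the order shown above and that the first existing file is used; find_silk_conf is a hypothetical helper, and the actual lookup inside the SiLK tools may differ.

```shell
# Print the first existing candidate for the site configuration file.
find_silk_conf() {
    for candidate in \
        "${SILK_CONFIG_FILE:-}" \
        "${SILK_DATA_ROOTDIR:+$SILK_DATA_ROOTDIR/silk.conf}" \
        /data/silk.conf \
        "${SILK_PATH:+$SILK_PATH/share/silk/silk.conf}" \
        "${SILK_PATH:+$SILK_PATH/share/silk.conf}" \
        /usr/local/share/silk/silk.conf \
        /usr/local/share/silk.conf
    do
        # Candidates built from unset variables expand to the empty string.
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

if conf=$(find_silk_conf); then
    echo "site configuration candidate: $conf"
else
    echo "no silk.conf found in the listed locations"
fi
```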
rwset(1), rwsetcat(1),
rwstats(1), rwtotal(1),
rwuniq(1), silk(7),
tzset(3), environ(7)
rwaddrcount only supports IPv4 addresses, and it will not be modified to
support IPv6 addresses. To produce output similar to rwaddrcount for
IPv6 addresses, use rwuniq(1):
rwuniq --fields=sip --values=bytes,packets,records,stime,etime
When used in an IPv6 environment, rwaddrcount converts IPv6
flow records that contain addresses in the ::ffff:0:0/96 prefix to IPv4 and
processes them. IPv6 records having addresses outside of that prefix are
ignored.
rwaddrcount uses a fairly large hashtable to store data,
but it is likely that as the amount of data expands, the application will
take more time to process data.
Similar binning of records is produced by
rwstats(1), rwtotal(1), and
rwuniq(1).
To generate a list of IP addresses without the volume information,
use rwset(1).