rwfilter(1) - SiLK Tool Suite
rwfilter - Choose which SiLK Flow records to process
rwfilter INPUT_ARGS OUTPUT_ARGS PARTITIONING_ARGS [MISC_ARGS]
Selection switches, input switches, or input files are
required:
rwfilter ...
{{ [--class=CLASS] [--type={all | TYPE[,TYPE ...]}]
| [--flowtypes=CLASS/TYPE[,CLASS/TYPE ...]] }
[--sensors=SENSOR[,SENSOR ...]]
[--start-date=YYYY/MM/DD[:HH] [--end-date=YYYY/MM/DD[:HH]]]
[--data-rootdir=ROOT_DIRECTORY] [--print-missing-files] }
| [--input-pipe=INPUT_PATH]
| [--xargs] | [--xargs=INPUT_PATH]
| [INPUT_PATH [INPUT_PATH...]]
One or more output switches are required:
rwfilter ...
[--all-destination=ALL_PATH [--all-destination=ALL_PATH ...]]
[--fail-destination=FAIL_PATH [--fail-destination=FAIL_PATH ...]]
[--pass-destination=PASS_PATH [--pass-destination=PASS_PATH ...]]
[{ --print-statistics[=STATS_PATH]
| --print-volume-statistics[=STATS_PATH] }]
One or more partitioning switches are required:
rwfilter ...
[--ack-flag=SCALAR] [--active-time=TIME_WINDOW]
[{--any-address=IP_WILDCARD | --not-any-address=IP_WILDCARD}]
[--any-cc=COUNTRY_CODE_LIST]
[{--any-cidr=IP_OR_CIDR_LIST | --not-any-cidr=IP_OR_CIDR_LIST}]
[--any-index=INTEGER_LIST]
[{--anyset=IP_SET_FILENAME | --not-anyset=IP_SET_FILENAME}]
[--aport=INTEGER_LIST] [--application=INTEGER_LIST]
[--attributes=ATTRIBUTES_LIST]
[--bytes=INTEGER_RANGE] [--bytes-per-packet=DECIMAL_RANGE]
[--cwr-flag=SCALAR]
[{--daddress=IP_WILDCARD | --not-daddress=IP_WILDCARD}]
[--dcc=COUNTRY_CODE_LIST]
[{--dcidr=IP_OR_CIDR_LIST | --not-dcidr=IP_OR_CIDR_LIST}]
[{--dipset=IP_SET_FILENAME | --not-dipset=IP_SET_FILENAME}]
[--dport=INTEGER_LIST] [--dtype=SCALAR]
[--duration=DECIMAL_RANGE] [--ece-flag=SCALAR]
[--etime=TIME_WINDOW] [--fin-flag=SCALAR]
[--flags-all=HIGH_MASK_FLAGS_LIST]
[--flags-initial=HIGH_MASK_FLAGS_LIST]
[--flags-session=HIGH_MASK_FLAGS_LIST]
[--icmp-code=INTEGER_LIST] [--icmp-type=INTEGER_LIST]
[--input-index=INTEGER_LIST] [--ip-version=INTEGER_LIST]
[--ipa-src-expr=IPA_EXPR] [--ipa-dst-expr=IPA_EXPR]
[--ipa-any-expr=IPA_EXPR]
[{--next-hop-id=IP_WILDCARD | --not-next-hop-id=IP_WILDCARD}]
[{--nhcidr=IP_OR_CIDR_LIST | --not-nhcidr=IP_OR_CIDR_LIST}]
[{--nhipset=IP_SET_FILENAME | --not-nhipset=IP_SET_FILENAME}]
[--output-index=INTEGER_LIST] [--packets=INTEGER_RANGE]
[--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]
{ [--pmap-src-MAPNAME=LABELS] [--pmap-dst-MAPNAME=LABELS]
[--pmap-any-MAPNAME=LABELS] } ]
[--protocol=INTEGER_LIST] [--psh-flag=SCALAR]
[--python-expr=PYTHON_EXPR]
[--python-file=FILENAME [--python-file=FILENAME ...]]
[--rst-flag=SCALAR]
[{--saddress=IP_WILDCARD | --not-saddress=IP_WILDCARD}]
[--scc=COUNTRY_CODE_LIST]
[{--scidr=IP_OR_CIDR_LIST | --not-scidr=IP_OR_CIDR_LIST}]
[{--sipset=IP_SET_FILENAME | --not-sipset=IP_SET_FILENAME}]
[--sport=INTEGER_LIST] [--stime=TIME_WINDOW] [--stype=SCALAR]
[--syn-flag=SCALAR] [--tcp-flags=TCP_FLAGS]
[--tuple-file=TUPLE_FILENAME { [--tuple-fields=FIELDS]
[--tuple-direction=DIRECTION]
[--tuple-delimiter=CHAR] } ]
[--urg-flag=SCALAR]
Miscellaneous switches:
rwfilter ...
[--compression-method=COMP_METHOD] [--dry-run]
[--max-fail-records=N] [--max-pass-records=N]
[--note-add=TEXT] [--note-file-add=FILE]
[--plugin=PLUGIN [--plugin=PLUGIN ...]]
[--print-filenames] [--site-config-file=FILENAME]
[--threads=N]
Help switches:
rwfilter [--pmap-file=MAPNAME:PATH [--pmap-file=MAPNAME:PATH ...]]
[--plugin=PLUGIN ...] [--python-file=PATH]
[--data-rootdir=ROOT_DIRECTORY] [--site-config-file=FILENAME]
--help
rwfilter --version
rwfilter serves two purposes: (1) It acts as an interface to the data
store to select which SiLK Flow records to process, and (2) it partitions
those records into one or more pass and/or fail streams.
The "Selection Switches" let one choose flow records
from the SiLK data store by specifying where the flow was collected (its
sensor), the date of collection, and/or the flow's direction. The act of
selecting records from the data store is sometimes called a "data
pull".
The "Partitioning Switches" describe various types of
traffic behavior (e.g., TCP traffic, or all traffic going to port 80). When
a flow record matches all of the behaviors, it can be written to a
pass stream (i.e., file). If a record fails to match any of these
behavior predicates, it can be written to a fail stream. (You may
also write every record rwfilter reads to an all stream.)
These output streams from rwfilter are always binary SiLK Flow
records. The output must be either written to a file or piped into another
tool in the SiLK Suite, and rwfilter complains if it determines you
are attempting to send the stream to a terminal. To view the records, pipe
the records into rwcut(1).
In addition to the partitioning switches built in to
rwfilter, additional partitioning predicates can be created as C or
PySiLK plug-ins, and these can be loaded into rwfilter using the
--plugin and/or --python-file switches as described below.
Instead of using the selection switches to choose flow records
from the data store, rwfilter can apply the partitioning switches to
existing files of SiLK flow records---such as files generated by a previous
invocation of rwfilter. To run rwfilter in this mode, you
may
- specify, on the command line, the files and/or named pipes from which
rwfilter should read SiLK Flow records. Specifying
"stdin" or
"-" on the command line causes
rwfilter to read flow records from the standard input.
- use the --input-pipe switch to specify a named pipe, or specify
"stdin" or
"-" as the argument to this switch to
have rwfilter read flow records from the standard input.
- use the --xargs switch to specify a file that contains the names of
the input files to process. When --xargs is used without an
argument, rwfilter attempts to read the names of the files from the
standard input. The name of each input file must appear on a single
line.
When rwfilter is reading flow records from input files,
some of the selection switches act as partitioning switches. The remaining
selection switches may not be specified when using the alternate forms of
input, and it is an error to specify multiple types of input.
Unlike many other tools in the SiLK tool suite, rwfilter
requires that you specify one or more "Output Switches" that tell
rwfilter what types of output to produce.
Finally, there are "Miscellaneous Switches" that control
other aspects of rwfilter.
Option names may be abbreviated if the abbreviation is unique or is an exact
match for an option. A parameter to an option may be specified as
--arg=param or --arg param, though the first form
is required for options that take optional parameters.
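The abbreviation rule can be illustrated with a short Python sketch. This is only a model of the rule as stated above, not rwfilter's own option parser; the function name resolve_option and the sample switch list are hypothetical.

```python
def resolve_option(name, known):
    """Return the full option that `name` stands for, or raise ValueError."""
    if name in known:                 # an exact match is always accepted
        return name
    matches = [opt for opt in known if opt.startswith(name)]
    if len(matches) == 1:             # a unique abbreviation is accepted
        return matches[0]
    if not matches:
        raise ValueError(f"unknown option: {name}")
    raise ValueError(f"ambiguous option {name}: {sorted(matches)}")


# A small, illustrative subset of the switches listed in the synopsis.
SWITCHES = ["--sport", "--sipset", "--scidr", "--stime", "--stype", "--type"]

print(resolve_option("--sp", SWITCHES))    # unique prefix of --sport
print(resolve_option("--type", SWITCHES))  # exact match
```

Under this model, "--st" is rejected because it could mean either --stime or --stype.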
To read files from the data store, use the following options to specify which
files to process. When rwfilter gets its input from files listed on the
command line or from the --xargs or --input-pipe switches, the
first four switches (--class, --type, --flowtypes, and
--sensors) act as partitioning switches, and specifying any other
selection switch produces an error.
- --class=CLASS
- The --class switch is used to specify a group of data to process.
Only a single class may be selected with the --class switch; for
multiple classes, use the --flowtypes switch. Classes are defined
in the silk.conf(5) site configuration file. If the
--class option is not given, the default-class as specified in
silk.conf is used. To see the available classes and the default
class, either examine the output from rwfilter --help or invoke
rwsiteinfo(1) with the switch
--fields=class,default-class.
- --type={"all" | TYPE[,TYPE]}
- The --type predicate further specifies data within the selected
CLASS by listing the TYPEs of traffic to process. The switch
takes a comma-separated list of types or the keyword
"all" which specifies all types for the
specified CLASS. Types are defined in silk.conf; they
typically refer to the direction of the flow, and they may vary by class.
When the --type switch is not specified, a list of default types is
used. The default-type list is determined by the value of CLASS,
and the default types generally include only incoming traffic. To see the
available types and the default types for each class, examine the
--help output of rwfilter or run rwsiteinfo with
--fields=class,type,default-type.
- --flowtypes=CLASS/TYPE[,CLASS/TYPE ...]
- The --flowtypes predicate provides an alternate way to specify
class/type pairs. The --flowtypes switch allows a single
rwfilter invocation to process data from multiple classes. The
keyword "all" may be used for the
CLASS and/or TYPE to select all classes and/or types.
- --sensors=SENSOR[,SENSOR ...]
- The --sensors switch is used to select data from specific sensors.
The parameter is a comma-separated list of sensor names, sensor IDs
(integers), and/or ranges of sensor IDs. Sensors are defined in the
silk.conf(5) site configuration file, and the
rwsiteinfo(1) command can be used to print a mapping
of sensor names to IDs and classes. When the --sensors switch is not
specified, the default is to use all sensors that are valid for the
specified class(es).
- --start-date=YYYY/MM/DD[:HH]
- --end-date=YYYY/MM/DD[:HH]
- The date predicates indicate which days and hours to consider when
creating the list of files. The dates may be expressed as seconds since
the UNIX epoch or in "YYYY/MM/DD[:HH]"
format, where the hour is optional. A
"T" may be used in place of the
":" to separate the day and hour.
Whether the "YYYY/MM/DD[:HH]" strings
represent times in UTC or the local timezone depends on how SiLK was
compiled. To determine how your version of SiLK was compiled, see the
"Timezone
support" setting in the output from
rwfilter --version.
When times are expressed in
"YYYY/MM/DD[:HH]" format:
- When both --start-date and --end-date are specified to hour
precision, all hours within that time range are processed.
- When --start-date is specified to day precision, the hour specified
in --end-date (if any) is ignored, and files for all dates between
midnight on start-date and 23:59 on end-date are
processed.
- When --start-date is specified to hour precision and
--end-date is specified to day precision, the hour of the
start-date is used as the hour for the end-date.
- When --end-date is not specified and --start-date is
specified to day precision, files for that complete day are
processed.
- When --end-date is not specified and --start-date is
specified to hour precision, files for that single hour are
processed.
When at least one time is expressed as seconds since the UNIX
epoch:
- When --end-date is specified in epoch seconds, the given
--start-date and --end-date are considered to be in hour
precision.
- When --start-date is specified in epoch seconds and
--end-date is specified in
"YYYY/MM/DD[:HH]" format, the start-date
is considered to be in day precision if it is divisible by 86400, and hour
precision otherwise.
- When --start-date is specified in epoch seconds and
--end-date is not given, the start-date is considered to be in
hour-precision.
When neither --start-date nor --end-date is given,
rwfilter processes all files for the current day.
It is an error to specify --end-date without specifying
--start-date.
It is an error to specify --start-date when rwfilter
believes there is some other input specified (see "Non-Selection Input
Switches").
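The precision rules for dates given in "YYYY/MM/DD[:HH]" format can be sketched in Python as follows. This is an illustration of the rules listed above, not code from SiLK; the function names parse_date and hour_range are hypothetical, epoch-second arguments are not handled, and no check is made that the end date follows the start date.

```python
from datetime import datetime


def parse_date(text):
    """Parse YYYY/MM/DD[:HH] (or YYYY/MM/DDTHH); return (datetime, has_hour)."""
    text = text.replace("T", ":")
    if ":" in text:
        return datetime.strptime(text, "%Y/%m/%d:%H"), True
    return datetime.strptime(text, "%Y/%m/%d"), False


def hour_range(start_date, end_date=None):
    """Return the first and last hour (inclusive) selected by the two dates."""
    start, start_has_hour = parse_date(start_date)
    if end_date is None:
        # Day precision selects the whole day; hour precision a single hour.
        end = start if start_has_hour else start.replace(hour=23)
        return start, end
    end, end_has_hour = parse_date(end_date)
    if not start_has_hour:
        # Any hour on the end date is ignored; run through its last hour.
        end = end.replace(hour=23)
    elif not end_has_hour:
        # The hour of the start date is reused for the end date.
        end = end.replace(hour=start.hour)
    return start, end


first, last = hour_range("2003/01/30:10", "2003/01/31")
print(first, last)
```

In the usage line, the start date has hour precision and the end date has day precision, so the range ends at hour 10 of the end date.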
- --data-rootdir=ROOT_DIRECTORY
- Tell rwfilter to use ROOT_DIRECTORY as the root of the data
repository, which overrides the location given in the SILK_DATA_ROOTDIR
environment variable, which in turn overrides the location that was
compiled into rwfilter (/data). It is an error to specify this
switch when files are specified on the command line or "Non-Selection
Input Switches" are given.
- --print-missing-files
- This option prints to the standard error the names of the files that
rwfilter's file selection switches expected to find but did not.
The file names are preceded by the text 'Missing '; each file name
appears on a separate line. This switch is useful for debugging, but the
list of files it produces can be misleading. For example, suppose there is
a decommissioned sensor that still appears in the silk.conf file;
rwfilter considers these data files as missing even though
their absence is expected. Use the output from this switch judiciously. It
is an error to specify this switch when files are specified on the command
line or "Non-Selection Input Switches" are given.
Instead of using the "Selection Switches" to read flow records from
files in the data store, you can tell rwfilter to process files named
on the command line or use one (and only one) of the following switches. To
have rwfilter read flow records from the standard input, specify
"stdin" or
"-" as the name of an input file or use the
(deprecated) --input-pipe switch.
- --xargs
- --xargs=INPUT_PATH
- Read the names of the input files from INPUT_PATH or from the
standard input if INPUT_PATH is not provided. The input is expected
to have one filename per line. rwfilter opens each named file in
turn and reads records from it as if the filenames had been listed on the
command line.
- --input-pipe=INPUT_PATH
- Specify a source for SiLK Flow records, where INPUT_PATH is a named
pipe or the string "stdin" or
"-" to represent the standard input. You
do not need to use this switch; you can simply specify the named pipe or
the strings "stdin" or
"-" on the command line. NOTE:
This switch is deprecated, and it will be removed in the SiLK 4.0
release.
At least one of the following output switches must be provided:
- --all-destination=ALL_PATH
- Write every SiLK Flow record to ALL_PATH, where ALL_PATH
refers to a file, a named pipe, the string
"stderr" to refer to the standard error,
or the strings "stdout" or
"-" to refer to the standard output.
This switch may be repeated to write all input records to multiple
locations.
- --fail-destination=FAIL_PATH
- Write SiLK Flow records that have failed ANY of the partitioning
predicates to FAIL_PATH, where FAIL_PATH refers to a
non-existent file, a named pipe, the string
"stderr" to refer to the standard error,
or the strings "stdout" or
"-" to refer to the standard output.
This switch may be repeated to write records that fail any predicate to
multiple locations.
- --pass-destination=PASS_PATH
- Write SiLK Flow records that have passed ALL of the partitioning
predicates to PASS_PATH, where PASS_PATH refers to a
non-existent file, a named pipe, the string
"stderr" to refer to the standard error,
or the strings "stdout" or
"-" to refer to the standard output.
This switch may be repeated to write records that pass every predicate to
multiple locations.
- --print-statistics
- --print-statistics=STATS_PATH
- Print a one line summary specifying the number of files processed, the
total number of records read, the number of records that passed all
partitioning predicates, and the number of records that failed. If
STATS_PATH is provided, the summary is printed there; otherwise it
is printed to the standard error. This switch cannot be mixed with
--print-volume-statistics. When running rwfilter with
multiple threads and --max-pass-records or
--max-fail-records is specified, the statistics may not match the
number of records written by rwfilter.
- --print-volume-statistics
- --print-volume-statistics=STATS_PATH
- Print a four line summary of rwfilter's processing. For each of all
records, records that pass all the partitioning predicates, and records
that fail, print the number of flow records and the number of packets and
bytes represented by those flow records. The output also includes the
number of files processed. If STATS_PATH is provided, the summary
is printed there; otherwise it is printed to the standard error. This
switch cannot be mixed with --print-statistics. When running
rwfilter with multiple threads and --max-pass-records or
--max-fail-records is specified, the statistics may not match the
number of records written by rwfilter.
rwfilter supports the following partitioning switches, at least one of
which must be specified (unless the only Output Switch is
--all-destination). The switches are AND'ed together; i.e., to
pass the filter, the record must pass the test implied by each switch. Any
record that does not pass is written to the fail-destination(s), if
specified.
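The AND semantics can be modelled with a few lines of Python. This is a conceptual sketch only: the record dictionaries, the predicate lambdas, and the function name partition are hypothetical stand-ins for SiLK Flow records and partitioning switches.

```python
def partition(records, predicates):
    """Split records into (passed, failed); a record passes only if every test does."""
    passed, failed = [], []
    for rec in records:
        if all(test(rec) for test in predicates):
            passed.append(rec)
        else:
            failed.append(rec)
    return passed, failed


# Hypothetical records, reduced to the two fields used by the tests below.
records = [
    {"protocol": 6, "dport": 80},
    {"protocol": 6, "dport": 22},
    {"protocol": 17, "dport": 80},
]
# Stand-ins for --protocol=6 and --dport=80.
predicates = [lambda r: r["protocol"] == 6, lambda r: r["dport"] == 80]

passed, failed = partition(records, predicates)
print(len(passed), len(failed))
```

Only the first record satisfies both tests; the other two each fail one test and therefore go to the fail stream.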
Each partitioning switch defines a test. These tests can be
grouped into several broad categories; within each category, the tests are
applied in the order in which the switches appear on the command line. The
categories of the partitioning tests are:
- tests for IP addresses (including the IPset checks), ports, protocol,
times, TCP flags, byte and packet counts, IP version, application, country
codes
- tests based on the --tuple-file switch
- tests that use the address type or prefix map mapping files
- tests that use the IP-Association plug-in
- tests based on the --python-expr and --python-file
switches
- tests defined in C-plugins and loaded via --plugin
Partitioning Switches for IP Addresses
There are three families of switches that partition based on an IP
address. Each family can partition by the source IP, the destination IP, the
next hop IP, or either source or destination IP. Each family includes a
--not-* variant to reverse the sense of the test.
The --*cidr-family takes as its argument an
IP_OR_CIDR_LIST, which is a single IP address
"10.1.2.3", a single CIDR block
"FF01::/16", or a comma separated list of
IPs and/or CIDR blocks
"10.0.1.0/24,10.0.2.3,10.0.4.0/24". The
IP_OR_CIDR_LIST supports IPv4 and IPv6 addresses.
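As an illustration of how an IP_OR_CIDR_LIST is interpreted, the following Python sketch uses the standard ipaddress module. It is not SiLK code; the function name in_cidr_list is hypothetical, and the sketch makes no attempt to model how SiLK compares IPv4 addresses with IPv4-mapped IPv6 addresses.

```python
import ipaddress


def in_cidr_list(ip, ip_or_cidr_list):
    """Return True if `ip` matches any IP or CIDR block in the comma-separated list."""
    addr = ipaddress.ip_address(ip)
    for item in ip_or_cidr_list.split(","):
        # A bare IP becomes a single-address network (/32 or /128).
        net = ipaddress.ip_network(item.strip(), strict=False)
        if addr in net:   # False when the IP versions differ
            return True
    return False


print(in_cidr_list("10.0.1.77", "10.0.1.0/24,10.0.2.3,10.0.4.0/24"))
```

A --not-* variant would simply negate the result.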
The --*set-family requires that you store the IPs in a
binary IPset file and pass the name of the file to the switch. IPset files
are created from SiLK Flow records with rwset(1), or
from textual input with rwsetbuild(1).
The --*address-family (which includes --next-hop-id)
takes as its argument a single IP address, a single CIDR block, or a single
SiLK IP Wildcard. A SiLK IP Wildcard may represent multiple, disjoint IPv4
or IPv6 addresses. An IP Wildcard contains an IP in its canonical form,
except each part of the IP (where part is an octet for IPv4 or a
hexadectet for IPv6) may be a single value, a range, a comma separated list
of values and ranges, or the letter "x" to
signify any value for that part of the IP (that is,
"0-255" for IPv4). You may not specify a
CIDR suffix when using the IP Wildcard notation. The following
IP_WILDCARDs all represent the same value:
::ffff:0:0/112
::ffff:0:x
::ffff:0:aaab-ffff,aaaa,0-aaa9
::ffff:0.0.0.0/112
::ffff:0.0.128-254,0-126,255,127.x
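The per-part matching described above can be sketched in Python for the IPv4 dotted-quad case. This is a simplified model, not the SiLK parser: the function names part_matches and wildcard_matches are hypothetical, IPv6 and CIDR-suffixed arguments are not handled, and input validation is minimal.

```python
def part_matches(spec, value):
    """Test one octet against a part: 'x', a value, a range, or a comma list of those."""
    if spec == "x":
        return True
    for piece in spec.split(","):
        low, sep, high = piece.partition("-")
        lo = int(low)
        hi = int(high) if sep else lo
        if lo <= value <= hi:
            return True
    return False


def wildcard_matches(wildcard, ip):
    """Return True if the IPv4 address `ip` is matched by the IPv4 wildcard."""
    specs = wildcard.split(".")
    octets = [int(o) for o in ip.split(".")]
    if len(specs) != 4 or len(octets) != 4:
        raise ValueError("expected four dot-separated parts")
    return all(part_matches(s, o) for s, o in zip(specs, octets))


print(wildcard_matches("10.1-3,7.x.1", "10.7.200.1"))
```

Here the second part "1-3,7" accepts the values 1, 2, 3, and 7, and the third part "x" accepts any octet.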
The next hop address often has a value of 0.0.0.0 since the
default configuration of SiLK does not store the next hop address in the
data repository.
With one restriction, any combination of IP partitioning switches
is allowed in a single rwfilter invocation: A positive and negative
version of the same switch (e.g., --sipset and --not-sipset)
is not allowed. (--sipset and --not-scidr may be used
together, as can --sipset and --not-dipset.)
The address-partitioning switches are:
- --scidr=IP_OR_CIDR_LIST
- Pass the record if its source IP address matches a value in
IP_OR_CIDR_LIST, a comma separated list of IPs and/or CIDR blocks.
See also --saddress and --sipset.
- --dcidr=IP_OR_CIDR_LIST
- Pass the record if its destination IP address matches a value in
IP_OR_CIDR_LIST. See also --daddress and
--dipset.
- --any-cidr=IP_OR_CIDR_LIST
- Pass the record if either its source or its destination IP address matches
a value in IP_OR_CIDR_LIST. This switch does not consider
the next hop IP address. See also --any-address and
--anyset.
- --nhcidr=IP_OR_CIDR_LIST
- Pass the record if its next hop IP address matches a value in
IP_OR_CIDR_LIST. See also --next-hop-id and
--nhipset.
- --not-scidr=IP_OR_CIDR_LIST
- Pass the record if its source IP address does not match a value in
IP_OR_CIDR_LIST, a comma separated list of IPs and/or CIDR blocks.
See also --not-saddress and --not-sipset.
- --not-dcidr=IP_OR_CIDR_LIST
- Pass the record if its destination IP address does not match a value in
IP_OR_CIDR_LIST. See also --not-daddress and
--not-dipset.
- --not-any-cidr=IP_OR_CIDR_LIST
- Pass the record if neither its source nor its destination IP address
matches a value in IP_OR_CIDR_LIST. See also
--not-any-address and --not-anyset.
- --not-nhcidr=IP_OR_CIDR_LIST
- Pass the record if its next hop IP address does not match a value in
IP_OR_CIDR_LIST. See also --not-next-hop-id and
--not-nhipset.
- --saddress=IP_WILDCARD
- Pass the record if its source IP address is matched by the SiLK IP
Wildcard IP_WILDCARD. To match on multiple IPs, use --scidr
or create an IPset and use --sipset.
- --daddress=IP_WILDCARD
- Pass the record if its destination IP address is matched by
IP_WILDCARD, a SiLK IP Wildcard. See also --dcidr and
--dipset.
- --any-address=IP_WILDCARD
- Pass the record if either its source or its destination IP address is
matched by IP_WILDCARD, a SiLK IP Wildcard. This switch does
not consider the next hop IP address. See also --any-cidr
and --anyset.
- --next-hop-id=IP_WILDCARD
- Pass the record if its next hop IP address is matched by this
IP_WILDCARD, a SiLK IP Wildcard. To match on multiple IPs, use
--nhcidr or create an IPset and use --nhipset.
- --not-saddress=IP_WILDCARD
- Pass the record if its source IP address is not matched by this
IP_WILDCARD, a SiLK IP Wildcard. See also --not-scidr and
--not-sipset.
- --not-daddress=IP_WILDCARD
- Pass the record if its destination IP address is not matched by
this IP_WILDCARD. See also --not-dcidr and
--not-dipset.
- --not-any-address=IP_WILDCARD
- Pass the record if neither its source nor its destination IP address is
matched by this IP_WILDCARD. Does not consider the next hop
address. See also --not-any-cidr and --not-anyset.
- --not-next-hop-id=IP_WILDCARD
- Pass the record if its next hop IP address is not matched by this
IP_WILDCARD. See also --not-nhcidr and
--not-nhipset.
- --sipset=IP_SET_FILENAME
- Pass the record if its source IP address is in the list of IPs contained
in the binary set file IP_SET_FILENAME. See also
--scidr.
- --dipset=IP_SET_FILENAME
- As --sipset for the destination IP address. See also
--dcidr.
- --anyset=IP_SET_FILENAME
- Pass the record if either its source IP address or its destination IP
address is in the list of IPs contained in the binary set file
IP_SET_FILENAME. Does not consider the next hop IP. See also
--any-cidr.
- --nhipset=IP_SET_FILENAME
- As --sipset for the next-hop IP address. See also
--nhcidr.
- --not-sipset=IP_SET_FILENAME
- Pass the record if its source IP address is not in the list of IPs
contained in the binary set file IP_SET_FILENAME. See also
--not-scidr.
- --not-dipset=IP_SET_FILENAME
- As --not-sipset for the destination IP address. See also
--not-dcidr.
- --not-anyset=IP_SET_FILENAME
- Pass the record if neither its source IP address nor its destination IP
address is in the list of IPs contained in the binary set file
IP_SET_FILENAME. Does not consider the next hop IP. See also
--not-any-cidr.
- --not-nhipset=IP_SET_FILENAME
- As --not-sipset for the next hop IP address. See also
--not-nhcidr.
Partitioning Switches for Remainder of Five-Tuple
The following switches partition based on the protocol and source
or destination port. The parameter to each of these switches is an
INTEGER_LIST, which is a comma-separated list of individual
non-negative integer values and ranges of those values. For example,
"1,2,3,5-10,99-103". A range may be
specified without an upper limit, such as
"1-", in which case the upper limit is set
to the maximum value.
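A minimal Python sketch of INTEGER_LIST parsing follows. It is illustrative only; the function name parse_integer_list is hypothetical, and the maximum value is passed in by the caller because it differs by switch (for example, 65535 for ports and 255 for protocols).

```python
def parse_integer_list(text, max_value):
    """Expand an INTEGER_LIST such as '1,2,3,5-10,99-103' into a set of integers."""
    values = set()
    for piece in text.split(","):
        low, sep, high = piece.partition("-")
        lo = int(low)
        if not sep:
            hi = lo              # a single value
        elif high == "":
            hi = max_value       # open-ended range such as '1-'
        else:
            hi = int(high)
        if lo > hi or hi > max_value:
            raise ValueError(f"invalid range: {piece}")
        values.update(range(lo, hi + 1))
    return values


ports = parse_integer_list("1,2,3,5-10,99-103", 65535)
print(sorted(ports))
```

A record then passes a switch such as --dport when its destination port is a member of the resulting set.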
- --sport=INTEGER_LIST
- Pass the record if its source port is in this INTEGER_LIST;
possible values are 0-65535.
- --dport=INTEGER_LIST
- Pass the record if its destination port is in this INTEGER_LIST;
possible values are 0-65535.
- --aport=INTEGER_LIST
- Pass the record if its source port and/or its destination port is in this
INTEGER_LIST; possible values are 0-65535. For example, use
--aport=25 to see all SMTP conversations regardless of where
they originated.
- --protocol=INTEGER_LIST
- Pass the record if its IP Suite Protocol is in this INTEGER_LIST;
possible values are 0-255.
- --icmp-type=INTEGER_LIST
- Pass the record if its ICMP (or ICMPv6) type is in this
INTEGER_LIST; possible values 0-255. This switch also verifies that
the flow's protocol is 1 (or 58 if the flow is IPv6). It is an error to
specify a --protocol that does not include 1 and/or 58.
- --icmp-code=INTEGER_LIST
- Pass the record if its ICMP (or ICMPv6) code is in this
INTEGER_LIST; possible values 0-255. This switch also verifies that
the flow's protocol is 1 (or 58 if the flow is IPv6). It is an error to
specify a --protocol that does not include 1 and/or 58.
Partitioning Switches for Time
These switches partition based on whether the time stamps on the
flow record occur within the specified time window. The form of the argument
is a range of two dates, start-window and end-window, each in the form
"YYYY/MM/DD[:HH[:MM[:SS[.ssssss]]]]", for
example
"2003/01/31:23:45:00.000-2003/01/31:23:59:59.999"
represents the last fifteen minutes of Jan 31, 2003. (A
"T" may be used in place of
":" to separate the day and hour.) The
start-window and end-window must be set to at least day precision. For the
start-window, unspecified hour, minute, second, and millisecond values are
set to 0; for the end-window, those values are set to 23, 59, 59, and 999
respectively. Thus
"2003/01/31:23-2003/01/31:23" becomes
"2003/01/31:23:00:00.000-2003/01/31:23:59:59.999".
If an end-window is not given, it is set to the start-window, giving a
window of a single millisecond. The date strings are considered to be in the
timezone specified when SiLK was compiled, which you can determine from the
output of rwfilter --version. You may also specify the times as
seconds since the UNIX epoch; when the end-time is in epoch seconds, an
unspecified milliseconds value is set to 999 and otherwise the value is
unchanged.
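The default-filling rules for a TIME_WINDOW given as date strings can be sketched in Python. This is a model of the rules above, not SiLK's parser: the function names parse_time and parse_time_window are hypothetical, epoch-second arguments and timezones are not handled, and fractional seconds are truncated to millisecond precision.

```python
from datetime import datetime


def parse_time(text, is_end=False):
    """Parse YYYY/MM/DD[:HH[:MM[:SS[.sss]]]], filling unspecified fields."""
    date_part, _, time_part = text.replace("T", ":").partition(":")
    day = datetime.strptime(date_part, "%Y/%m/%d")
    # Defaults: all zeros for a start-window, 23:59:59.999 for an end-window.
    hour, minute, second, msec = (23, 59, 59, 999) if is_end else (0, 0, 0, 0)
    pieces = time_part.split(":") if time_part else []
    if len(pieces) > 0:
        hour = int(pieces[0])
    if len(pieces) > 1:
        minute = int(pieces[1])
    if len(pieces) > 2:
        sec_text, dot, frac = pieces[2].partition(".")
        second = int(sec_text)
        if dot:
            msec = int(frac.ljust(3, "0")[:3])
    return day.replace(hour=hour, minute=minute, second=second,
                       microsecond=msec * 1000)


def parse_time_window(text):
    """Return (start, end); a missing end-window equals the start-window."""
    start_text, sep, end_text = text.partition("-")
    start = parse_time(start_text)
    end = parse_time(end_text, is_end=True) if sep else start
    return start, end


start, end = parse_time_window("2003/01/31:23-2003/01/31:23")
print(start, end)
```

The usage line reproduces the example in the text: an hour-precision window expands to cover the complete hour.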
- --active-time=TIME_WINDOW
- Pass the record if the record was active at ANY time during this
TIME_WINDOW. If a single time is specified, pass the record if it
was active at that instant.
- --stime=TIME_WINDOW
- Pass the record if its starting time is in this TIME_WINDOW.
- --etime=TIME_WINDOW
- As --stime for the ending time.
- --duration=DECIMAL_RANGE
- Pass the record if its duration--that is, the record's end time minus its
start time, as measured in seconds--is in this DECIMAL_RANGE. Use
floating point numbers to specify millisecond values. The range should be
specified as MIN-MAX; for example,
"5.0-10.031". If a single value is
given, the duration must match that value exactly. The upper limit may be
omitted; for example, a range of "1.5-"
passes records whose duration is at least 1.5 seconds.
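The difference between --active-time, --stime, and --etime can be shown with a small Python sketch. The function names and the sample record times are hypothetical; the comparisons follow the descriptions above, with window bounds treated as inclusive (an assumption of this sketch).

```python
from datetime import datetime


def active_time(rec_start, rec_end, win_start, win_end):
    """True if the record was active at any time during the window."""
    return rec_start <= win_end and rec_end >= win_start


def stime(rec_start, win_start, win_end):
    """True if the record started inside the window."""
    return win_start <= rec_start <= win_end


def etime(rec_end, win_start, win_end):
    """True if the record ended inside the window."""
    return win_start <= rec_end <= win_end


# A hypothetical record that starts before the window and ends inside it.
win_start = datetime(2003, 1, 31, 23, 0, 0)
win_end = datetime(2003, 1, 31, 23, 59, 59, 999000)
rec_start = datetime(2003, 1, 31, 22, 58, 0)
rec_end = datetime(2003, 1, 31, 23, 2, 0)

print(active_time(rec_start, rec_end, win_start, win_end),
      stime(rec_start, win_start, win_end),
      etime(rec_end, win_start, win_end))
```

The sample record overlaps the window and ends within it, but it started before the window opened, so only the --stime test fails.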
Partitioning Switches for Volume
The following switches partition based on the volume of the flow;
that is, the number of bytes or packets. For additional volume-related
switches, load the flowrate plug-in as described in the
flowrate(3) manual page.
These switches accept a range of non-negative integers or decimal
values. If the upper limit is omitted, the volume must be at least that
size. If the argument is a single value, the volume must match that value
exactly.
- --bytes=INTEGER_RANGE
- Pass the record if its byte count is in this INTEGER_RANGE.
- --packets=INTEGER_RANGE
- Pass the record if its packet count is in this INTEGER_RANGE.
- --bytes-per-packet=DECIMAL_RANGE
- Pass the record if its average bytes per packet count (bytes/packet) is in
this DECIMAL_RANGE.
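The range semantics shared by these switches (and by --duration) can be sketched in Python. The function names parse_range and in_range are hypothetical, and all values are treated as floating point for simplicity, which the real switches may not do for INTEGER_RANGE arguments.

```python
def parse_range(text):
    """Return (low, high); high is None when the upper limit is omitted."""
    low, sep, high = text.partition("-")
    lo = float(low)
    if not sep:
        return lo, lo                      # single value: exact match
    return lo, (float(high) if high else None)


def in_range(value, text):
    """True if `value` falls in the MIN-MAX, MIN-, or single-value range."""
    lo, hi = parse_range(text)
    return value >= lo and (hi is None or value <= hi)


# A hypothetical flow of 1500 bytes in 3 packets averages 500 bytes per packet.
byte_count, packet_count = 1500, 3
print(in_range(byte_count / packet_count, "400-600"))
```

For example, in_range(12, "1.5-") is true because the open-ended range only sets a minimum.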
Partitioning Switches for TCP Flags
When a flow generator creates a flow record from TCP packets, it
creates a field that is the bit-wise OR of the TCP flags from all packets
that comprise that flow record. Some flow generators, such as
yaf(1), can export two TCP flag fields: one contains
the flags on the first packet in the flow, and the second contains the
bit-wise OR of the remaining packets.
To partition records based on their TCP flags values, there is a
recommended set of switches and legacy-supported switches. The switches
accept the following letters to represent the named TCP flag:
"F"=FIN;
"S"=SYN;
"R"=RST;
"P"=PSH;
"A"=ACK;
"U"=URG;
"E"=ECE;
"C"=CWR.
The recommended set of switches take a comma separated list of
pairs of TCP flags, where the pair is separated by a slash (/). The
value to the left of the slash is the HIGH_SET and it must be a
subset of the value to the right of the slash, which is the MASK_SET.
For a record to pass the filter, the flags in the HIGH_SET must be on
and the remaining flags in MASK_SET must be off. Flags not in
MASK_SET may have any value. If a list of pairs is given, the record
passes if any pair in the list matches. For example,
"--flags-all=S/S,A/A" passes flows that
have either the SYN or the ACK flag set,
"--flags-all=S/SA" passes flow records
where SYN is high and ACK is low, and
"--flags-all=/F" passes flows where FIN is
off. This list of flag pairs is called a HIGH_MASK_FLAGS_LIST.
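The HIGH_SET/MASK_SET test amounts to a bit-mask comparison, which the following Python sketch illustrates. The function names to_bits and flags_match are hypothetical, and the record's flags are supplied as a string of letters for readability; the bit values are the standard positions of the flags in the TCP header.

```python
# Standard TCP header bit values for each flag letter.
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
             "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def to_bits(letters):
    """Convert a string of flag letters such as 'SA' to a bit mask."""
    bits = 0
    for ch in letters.upper():
        bits |= FLAG_BITS[ch]
    return bits


def flags_match(record_flags, high_mask_list):
    """True if any HIGH/MASK pair matches the record's flags."""
    observed = to_bits(record_flags)
    for pair in high_mask_list.split(","):
        high_text, _, mask_text = pair.partition("/")
        high, mask = to_bits(high_text), to_bits(mask_text)
        if high & ~mask:
            raise ValueError("HIGH_SET must be a subset of MASK_SET")
        # Flags in HIGH must be on; the other flags in MASK must be off.
        if observed & mask == high:
            return True
    return False


print(flags_match("SA", "S/S,A/A"), flags_match("SA", "S/SA"))
```

In the usage line, a flow with SYN and ACK set matches "S/S,A/A" but not "S/SA", because the second form requires ACK to be off.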
The recommended switches for TCP flag partitioning are:
- --flags-all=HIGH_MASK_FLAGS_LIST
- Pass the record if any of the HIGH_SET/MASK_SET pairs is
true when looking at the bit-wise OR of the TCP flags across all
packets in the flow.
- --flags-initial=HIGH_MASK_FLAGS_LIST
- As --flags-all, except this switch considers only the initial
packet in the flow, for flow generators that can generate that field.
- --flags-session=HIGH_MASK_FLAGS_LIST
- As --flags-all, except this switch considers the bit-wise OR of the
TCP flags across the second through the final packet in the flow; that is,
ignoring the flags on the first packet.
The TCP-flag partitioning switches supported for legacy reasons
are:
- --tcp-flags=TCP_FLAGS
- Pass the record if, for any one of its packets, any of the
specified TCP_FLAGS was on, where TCP_FLAGS contains the
letters
"F","S","R","P","A","U","E","C".
For example, --tcp-flags=ASF passes records where ACK is set, or
SYN is set, or FIN is set.
- --ack-flag={0|1}
- When set to 0, pass only records where the ACK Flag is low; when set to 1,
pass only records where the ACK Flag is high.
- --cwr-flag={0|1}
- As --ack-flag for the CWR Flag
- --ece-flag={0|1}
- As --ack-flag for the ECE Flag
- --fin-flag={0|1}
- As --ack-flag for the FIN Flag
- --psh-flag={0|1}
- As --ack-flag for the PSH Flag
- --rst-flag={0|1}
- As --ack-flag for the RST Flag
- --syn-flag={0|1}
- As --ack-flag for the SYN Flag
- --urg-flag={0|1}
- As --ack-flag for the URG Flag
Partitioning Switches for Other Flow Characteristics
Other than the --ip-version switch, the fields queried by
the following switches may always be zero. The default configuration of SiLK
does not store the fields that contain the SNMP values. The other fields are
not present in NetFlow v5, and require use of properly-configured enhanced
collection software, such as yaf(1),
<http://tools.netsa.cert.org/yaf/>.
- --ip-version={4|6|4,6}
- Passes the record if its IP Version is in the specified list. This switch
determines how IPv4 and IPv6 flow records are handled when SiLK has been
compiled with IPv6 support. When the argument to this switch is
4, rwfilter writes records marked as IPv6
to the fail-destination, regardless of the IP addresses they contain. When
the argument to this switch is 6, rwfilter
writes records marked as IPv4 to the fail-destination. When SiLK has not
been compiled with IPv6 support, the only legal value for this switch is
4, and any IPv6 flows in the input are ignored (that
is, they are written to neither the pass-destination nor the
fail-destination).
- --application=INTEGER_LIST
- Some flow generation software can inspect the contents of the packets that
comprise a flow and use traffic signatures to label the content of the
flow. SiLK calls this label the application; yaf refers to
it as the appLabel (see the applabel(1) manual
page in the yaf distribution). The application value is the port number
that is traditionally used for that type of traffic (see the
/etc/services file on most UNIX systems). For example, traffic that
the flow generator recognizes as FTP has a value of 21, even if that
traffic is being routed through the standard HTTP/web port (80).
The flow generator uses a value of 0 if the application cannot be
determined. The --application switch passes the flow if the flow's
application value is in the specified INTEGER_LIST, which is a
comma separated list of integers from 0 to 65535 inclusive and ranges of
those integers. The list of valid appLabels is determined by your site's
yaf installation.
- --attributes=ATTRIBUTES_LIST
- The attributes field in SiLK Flow records describes characteristics
about how the flow record was generated or about the packets that comprise
the flow record. The ATTRIBUTES_LIST argument is similar to the
HIGH_MASK_FLAGS_LIST argument to the --flags-all switch.
ATTRIBUTES_LIST is a comma separated list of up to 8
HIGH_ATTRIBUTES/MASK_ATTRIBUTES pairs, where
HIGH_ATTRIBUTES and MASK_ATTRIBUTES are strings of the
characters
"S","T","C","F",
and HIGH_ATTRIBUTES is a subset of MASK_ATTRIBUTES.
rwfilter passes the record if, for any pair of attributes in the
list, the attributes listed in HIGH_ATTRIBUTES are set and the
remaining attributes in MASK_ATTRIBUTES are not-set. The valid
attributes are:
- "S"
- All the packets in this flow record are exactly the same size.
- "T"
- The flow generator prematurely created a record for a long-lived session
due to the connection's lifetime reaching the active timeout
of the flow generator. (Also, when yaf is run with the
--silk switch, it prematurely creates a flow and marks it with
"T" if the byte count of the flow cannot
be stored in a 32-bit value.)
- "C"
- The flow generator created this flow as a continuation of a long-running
connection, where the previous flow for this connection met a
timeout.
- "F"
- The flow generator saw additional packets in this flow following a packet
with the FIN flag set (excluding ACK packets).
For a long-lived connection spanning several flow records, the
first flow record is marked with a "T"
indicating that it hit the active timeout. The second through next-to-last
records are marked with "CT" indicating
that the flow is a continuation of a connection that timed out and that this
flow also timed out. The final flow is marked with a
"C", indicating that it was created as a
continuation of an active flow.
- --input-index=INTEGER_LIST
- Pass the record if its "in" field is in
this INTEGER_LIST, which is a comma separated list of integers from
0 to 65535, inclusive, and ranges of those integers. When present, the
"in" field normally contains the
incoming SNMP interface, but it may contain the vlanId if the packing
tools were configured to capture it (see
sensor.conf(5)).
- --output-index=INTEGER_LIST
- Pass the record if its "out" field is in
this INTEGER_LIST. When present, the
"out" field normally contains the
outgoing SNMP interface, but it may contain the postVlanId if the packing
tools were configured to capture it.
- --any-index=INTEGER_LIST
- Pass the record if its "in" field or if
its "out" field is in this
INTEGER_LIST.
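The HIGH_ATTRIBUTES/MASK_ATTRIBUTES test described for the
--attributes switch in this section can be modeled in a few lines of
Python. This is only a sketch of the rule as stated above, not SiLK's
implementation; the function names parse_attributes_list() and
attributes_pass() are hypothetical, and a record's attributes are
represented simply as a string such as "CT".

```python
# Model of the --attributes rule described above (not SiLK source code).
VALID_ATTRIBUTES = frozenset("STCF")
MAX_PAIRS = 8  # the text above allows up to 8 HIGH/MASK pairs


def parse_attributes_list(text):
    """Split 'HIGH/MASK[,HIGH/MASK ...]' into a list of (high, mask) sets."""
    pairs = []
    for item in text.split(","):
        high_text, slash, mask_text = item.strip().partition("/")
        if not slash:
            raise ValueError(f"expected HIGH/MASK, got {item!r}")
        high, mask = set(high_text), set(mask_text)
        if not mask <= VALID_ATTRIBUTES:
            raise ValueError(f"unknown attribute in {item!r}")
        if not high <= mask:
            raise ValueError(f"HIGH is not a subset of MASK in {item!r}")
        pairs.append((high, mask))
    if len(pairs) > MAX_PAIRS:
        raise ValueError("too many HIGH/MASK pairs")
    return pairs


def attributes_pass(record_attributes, attributes_list):
    """True if, for any pair, HIGH is set and the rest of MASK is not set."""
    record = set(record_attributes)
    return any((record & mask) == high
               for high, mask in parse_attributes_list(attributes_list))
```

Under this model, "T/TC" selects only the first record of a long-lived
connection (timed out, not a continuation), while "CT/CT" selects the
middle records. Attributes outside MASK_ATTRIBUTES do not affect the
result.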
Selection Switches Acting as Partitioning Switches
The following four switches are normally file selection switches;
that is, they select which files rwfilter reads within the data
repository. However, when rwfilter gets input without querying the
data repository (that is, from files listed on the command line, from files
specified by --xargs, or from the --input-pipe), these
switches become partitioning switches and determine whether a record is
written to the pass-destination or fail-destination.
- --class=CLASS
- Pass the record if its class is CLASS and its type is listed in the
--type switch, or its type is in the default type list for
CLASS when --type is not specified. Use rwfilter
--help to see the list of available classes and types, and the
defaults.
- --flowtypes=CLASS/TYPE[,CLASS/TYPE
...]
- Pass the record if its class/type value is one of those listed. The
keyword "all" may be used for the
CLASS and/or TYPE to select all classes and/or types. This
switch cannot be used when either --class or --type is used.
Use rwfilter --help to see the list of available classes and
types.
- --sensors=SENSOR[,SENSOR ...]
- Pass the record if its sensor is one of those listed. The parameter is a
comma separated list of sensor names, sensor IDs (integers), and/or ranges
of sensor IDs. Use the rwsiteinfo(1) command to see
the list of sensors.
- --type={"all" | TYPE[,TYPE]}
- Pass the record if its type is one of those listed and its class is
specified by --class, or its class is the default class when the
--class switch is not specified. Use rwfilter --help to see
the list of available classes and types, and the defaults.
Partitioning Switches that use Additional Mapping Files
Additional partitioning switches are available that allow one to
partition flow records depending on a label, where the label is computed
from an IP address or port on the record and an additional mapping file.
- --pmap-file=PATH
- --pmap-file=MAPNAME:PATH
- Load the prefix map file located at PATH and create partitioning
switches named --pmap-src-map-name,
--pmap-dst-map-name, and
--pmap-any-map-name where map-name is
either the MAPNAME part of the argument or the map-name specified
when the file was created (see rwpmapbuild(1)). If no
map-name is available, rwfilter creates switch names as described
below (--pmap-saddress, --pmap-sport-proto, etc). Specify
PATH as "-" or
"stdin" to read from the standard input.
The switch may be repeated to load multiple prefix map files; each file
must have a unique map-name. The --pmap-file switch(es) must
precede all other --pmap-* switches. For more information, see
pmapfilter(3).
- --pmap-src-map-name=LABELS
- If the prefix map associated with map-name is an IP prefix map,
this matches records with a source IPv4 address that maps to a label
contained in the list of labels in LABELS. If the prefix map
associated with map-name is a proto-port prefix map, this matches
records with a protocol and source port combination that maps to a label
contained in the list of labels in LABELS.
- --pmap-dst-map-name=LABELS
- Similar to --pmap-src-map-name, but uses the
destination IP or the protocol and destination port.
- --pmap-any-map-name=LABELS
- If the prefix map associated with map-name is an IP prefix map,
this matches records with a source IP address or a destination IP address
that maps to a label contained in the list of labels in LABELS. If
the prefix map associated with map-name is a port/protocol prefix
map, this matches records with a protocol and source port or destination
port combination that maps to a label contained in the list of labels in
LABELS.
- --pmap-saddress=LABELS
- --pmap-daddress=LABELS
- --pmap-any-address=LABELS
- These are deprecated switches created by pmapfilter that correspond
to --pmap-src-map-name,
--pmap-dst-map-name, and
--pmap-any-map-name, respectively. These
switches are available when an IP prefix map is used that is not
associated with a map-name.
- --pmap-sport-proto=LABELS
- --pmap-dport-proto=LABELS
- --pmap-any-port-proto=LABELS
- These are deprecated switches created by pmapfilter that correspond
to --pmap-src-map-name,
--pmap-dst-map-name, and
--pmap-any-map-name, respectively. These
switches are available when a proto-port prefix map is used that is not
associated with a map-name.
- --scc=COUNTRY_CODE_LIST
- --dcc=COUNTRY_CODE_LIST
- --any-cc=COUNTRY_CODE_LIST
- Pass the record if one of its IP addresses maps to a country code that is
specified in COUNTRY_CODE_LIST. For --scc, the source IP
must match. For --dcc, the destination IP must match. For
--any-cc, either the source or the destination must match.
COUNTRY_CODE_LIST is a comma separated list of lowercase two-letter
country codes---defined by ISO 3166-1 (see for example
<https://www.iso.org/iso-3166-country-codes.html> or
<https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2>)---as well as the
following special codes:
- "--"
- N/A (e.g. private and experimental reserved addresses)
- "a1"
- anonymous proxy
- "a2"
- satellite provider
- "o1"
- other
For example: "cx,uk,kr,jp,--".
To use this switch, the country code mapping file must be available in the
default location, or in the location specified by the SILK_COUNTRY_CODES
environment variable. See ccfilter(3) for details.
- --stype={0|1|2|3}
- --dtype={0|1|2|3}
- Pass a flow record depending on whether the IP address is internal,
external, or non-routable. These switches use the mapping file specified
by the SILK_ADDRESS_TYPES environment variable, or the
address_types.pmap mapping file, as described in
addrtype(3). When the parameter is 0, pass the
record if its source (--stype) IP address or destination
(--dtype) IP address is non-routable. When 1, pass if internal.
When 2, pass if external (i.e., routable but not internal). When 3, pass
if not internal (non-routable or external).
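The four parameter values accepted by --stype and --dtype can be
summarized with a small Python model. It is a sketch only: it assumes
the mapping file labels each address with the same codes used by the
switch (0 for non-routable, 1 for internal, 2 for external), and the
function name type_switch_passes() is hypothetical.

```python
# Model of the --stype/--dtype parameter values (not SiLK source code).
# Assumption: the address-type map labels each IP with one of these codes.
NON_ROUTABLE, INTERNAL, EXTERNAL = 0, 1, 2


def type_switch_passes(parameter, address_type):
    """Return True if a flow whose IP has address_type passes the switch."""
    if address_type not in (NON_ROUTABLE, INTERNAL, EXTERNAL):
        raise ValueError(f"unknown address type {address_type!r}")
    if parameter == 3:
        # 3 means "not internal": non-routable or external
        return address_type != INTERNAL
    if parameter in (0, 1, 2):
        return address_type == parameter
    raise ValueError(f"parameter must be 0, 1, 2, or 3, got {parameter!r}")
```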
Partitioning Switches across Multiple Fields
The --tuple-* family of switches allows the user to
partition flow records based on multiple values of the five-tuple.
- --tuple-file=TUPLE_FILENAME
- This switch provides support for partitioning by arbitrary subsets of the
basic five-tuple:
{source-ip,destination-ip,source-port,destination-port,protocol}
A SiLK Flow record passes the test when the record's fields
match one of the tuples; if the SiLK record does not match any tuple,
the record fails. The tuples are read from the text file
TUPLE_FILENAME which must contain lines of delimited fields. The
default delimiter is "|", but may be
specified with the --tuple-delimiter switch. Each field contains
one member of the tuple; the fields may appear in any order. The fields
may represent any subset of the five-tuple, but each line in the file
must define the same subset. A field that is present but has no value
generates an error. If you want the field to match any value, it is best
that you not include that field in your input.
In addition to the tuple-lines, TUPLE_FILENAME may
contain blank lines and comments (which begin with
"#" and continue to the end of the
line). The first line of TUPLE_FILENAME may contain a title
labeling the fields in the file. This title line is ignored when the
--tuple-fields switch is given.
The IP fields may contain an IPv4 address, an integer, or an IP
in CIDR block notation. Comma-separated lists
("80,443") and ranges
("0-1023,8080") are supported for the
ports and protocol fields. NOTE: Currently the code is not clever
in its support for CIDR notation and ranges in that each occurrence is
fully expanded. When this occurs, the memory required to hold the search
tree quickly grows.
- --tuple-fields=FIELDS
- FIELDS contains the list of fields (columns) to parse from the
TUPLE_FILENAME in the order in which they appear in the file. When
this switch is not provided, rwfilter treats the first line in
TUPLE_FILENAME as a title line and attempts to determine the fields
(a la rwtuc(1)); rwfilter exits if it cannot
determine the fields.
FIELDS is a comma separated list of field-names,
field-integers, and ranges of field-integers; a range is specified by
separating the start and end of the range with a hyphen (-).
Names can be abbreviated to their shortest unique prefix. The field
names and their descriptions are:
- sIP,sip,1
- source IP address
- dIP,dip,2
- destination IP address
- sPort,sport,3
- source port
- dPort,dport,4
- destination port
- protocol,5
- IP protocol
- --tuple-direction=DIRECTION
- Allows you to change the comparison between the tuple and the SiLK Flow
record. This switch allows one to look for traffic in the reverse
direction (or both directions) without having to write all of the rules
twice. The available directions are:
- forward
- The tuple's fields are compared against the corresponding fields on the
flow; that is, sIP is compared with sIP, dIP with dIP, sPort with sPort,
dPort with dPort, and protocol with protocol. This is the default.
- reverse
- The tuple's fields are compared against the opposite fields on the flow;
that is, sIP is compared with dIP, dIP with sIP, sPort with dPort, dPort
with sPort, and protocol with protocol.
- both
- Both of the above comparisons are performed.
- --tuple-delimiter=CHAR
- Specifies the character separating the input fields. When the switch is
not provided, the default of "|" is
used.
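The effect of --tuple-direction is easiest to see on a concrete
record. The following Python sketch models the forward, reverse, and
both comparisons for a tuple file that uses the default "|" delimiter.
It is an illustration only: values are compared as exact strings (no
CIDR blocks, lists, or ranges), and the names parse_tuple_lines() and
record_passes() are hypothetical.

```python
# Model of --tuple-file matching with --tuple-direction (not SiLK code).
REVERSED = {"sIP": "dIP", "dIP": "sIP",
            "sPort": "dPort", "dPort": "sPort",
            "protocol": "protocol"}


def parse_tuple_lines(lines, fields, delimiter="|"):
    """Turn delimited text lines into dicts keyed by field name."""
    tuples = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue                          # skip blank lines
        values = [value.strip() for value in line.split(delimiter)]
        if len(values) != len(fields) or "" in values:
            raise ValueError(f"bad tuple line: {line!r}")
        tuples.append(dict(zip(fields, values)))
    return tuples


def record_passes(record, tuples, direction="forward"):
    """True if the record matches any tuple in the given direction."""
    def forward(tup):
        return all(record[name] == value for name, value in tup.items())

    def reverse(tup):
        return all(record[REVERSED[name]] == value
                   for name, value in tup.items())

    checks = {"forward": (forward,),
              "reverse": (reverse,),
              "both": (forward, reverse)}[direction]
    return any(check(tup) for tup in tuples for check in checks)
```

For a tuple file containing the line "10.1.2.3|25|6" with fields
dIP,dPort,protocol, a reply flow from 10.1.2.3 port 25 fails the
forward test but passes with a direction of reverse or both.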
Partitioning Switches that use the PySiLK Plug-in
The SiLK Python plug-in provides support for filtering by
expressions or complex functions written in the Python programming language.
See the silkpython(3) and
pysilk(3) manual pages for information and examples for
how to use Python to manipulate SiLK data structures. When multiple
Partitioning Switches are given, the Python plug-in is the next-to-last to
be invoked. Only the code specified by the --plugin switch is called
after the Python code.
- --python-file=FILENAME
- Pass the record if the result of processing the flow with the function
named rwfilter() in FILENAME is true. The
function should take a single silk.RWRec object as an argument. See
silkpython(3) for details.
- --python-expr=PYTHON_EXPRESSION
- Pass the record if the result of processing the flow with the
specified PYTHON_EXPRESSION is true. The expression is evaluated as
if it appeared in the following context:
from silk import *
def rwfilter(rec):
    return (PYTHON_EXPRESSION)
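As an illustration of --python-file, a file along the following lines
would define the rwfilter() function that the switch looks for. This
is a minimal sketch: the file name is hypothetical, and the attribute
names protocol, dport, and packets are assumed to be those provided by
silk.RWRec (check pysilk(3) for your installation). The function
only reads attributes from its argument, so it can be exercised
outside rwfilter with any object that has the same attributes.

```python
# Hypothetical contents of a file passed to --python-file.
# Assumption: silk.RWRec exposes 'protocol', 'dport', and 'packets'.

def rwfilter(rec):
    """Pass TCP flows to port 25 that contain at least three packets."""
    return rec.protocol == 6 and rec.dport == 25 and rec.packets >= 3
```

The same test could be written with the ordinary partitioning
switches (--proto=6 --dport=25 --packets=3-); Python is worth the
extra cost only when the condition cannot be expressed that way.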
Partitioning Switches that use the IP-Association
Plug-In
The IPA plug-in, ipafilter.so, provides switches that can
partition flows using data in an IP Association database. For this plug-in
to be available, SiLK must be compiled with IPA support and IPA must be
configured. See ipafilter(3) and
<http://tools.netsa.cert.org/ipa/> for additional information.
- --ipa-src-expr=IPA_EXPR
- Use IPA_EXPR to partition flows based on the source IP of the flow
matching the IPA_EXPR expression.
- --ipa-dst-expr=IPA_EXPR
- Use IPA_EXPR to partition flows based on the destination IP of the
flow matching the IPA_EXPR expression.
- --ipa-any-expr=IPA_EXPR
- Use IPA_EXPR to partition flows based on either the source or
destination IP of the flow matching the IPA_EXPR expression.
- --compression-method=COMP_METHOD
- Specify the compression library to use when writing output files. If this
switch is not given, the value in the SILK_COMPRESSION_METHOD environment
variable is used if the value names an available compression method. When
no compression method is specified, output to the standard output or to
named pipes is not compressed, and output to files is compressed using the
default chosen when SiLK was compiled. The valid values for
COMP_METHOD are determined by which external libraries were found
when SiLK was compiled. To see the available compression methods and the
default method, use the --help or --version switch. SiLK can
support the following COMP_METHOD values when the required
libraries are available.
- none
- Do not compress the output using an external library.
- zlib
- Use the zlib(3) library for compressing the output,
and always compress the output regardless of the destination. Using zlib
produces the smallest output files at the cost of speed.
- lzo1x
- Use the lzo1x algorithm from the LZO real time compression library
for compression, and always compress the output regardless of the
destination. This compression provides good compression with less memory
and CPU overhead.
- snappy
- Use the snappy library for compression, and always compress the
output regardless of the destination. This compression provides good
compression with less memory and CPU overhead. Since SiLK
3.13.0.
- best
- Use lzo1x if available, otherwise use snappy if available, otherwise use
zlib if available. Only compress the output when writing to a file.
- --dry-run
- Perform a sanity check on the input arguments to check that the arguments
are acceptable. In addition, print to the standard output the names of
the files that would be accessed (and the names of missing files if
--print-missing-files is specified). rwfglob(1) can
also be used to generate the lists of files that rwfilter would
access.
- --help
- Print the available options and exit. Options that add fields (for
example, options that load plug-ins, prefix maps, or PySiLK extensions)
can be specified before the --help switch so that the new options
appear in the output. The available classes and types are included in
output; you may specify a different root directory or site configuration
file before --help to see the classes and types available for that
site.
- --max-fail-records=N
- Write at most N records to each --fail-destination. rwfilter
stops reading input once it has written these N records unless
--pass-destination or --all-destination switch(es) are also
specified.
- --max-pass-records=N
- Write at most N records to each --pass-destination. rwfilter
stops reading input once it has written these N records unless
--fail-destination or --all-destination switch(es) are also
specified.
- --note-add=TEXT
- Add the specified TEXT to the header of the output file as an
annotation. This switch may be repeated to add multiple annotations to a
file. To view the annotations, use the rwfileinfo(1)
tool.
- --note-file-add=FILENAME
- Open FILENAME and add the contents of that file to the header of
the output file as an annotation. This switch may be repeated to add
multiple annotations. Currently the application makes no effort to ensure
that FILENAME contains text; be careful that you do not attempt to
add a SiLK data file as an annotation.
- --plugin=PLUGIN
- Augment the partitioning switches by using run-time loading of the plug-in
(shared object) whose path is PLUGIN. The switch may be repeated to
load multiple plug-ins. The creation of plug-ins is described in the
silk-plugin(3) manual page. When multiple
partitioning switches are given, the code specified by the --plugin
switch(es) is last to be invoked. When PLUGIN does not contain a
slash ("/"), rwfilter attempts to
find a file named PLUGIN in the directories listed in the
"FILES" section. If rwfilter finds the file, it uses that
path. If PLUGIN contains a slash or if rwfilter does not
find the file, rwfilter relies on your operating system's
dlopen(3) call to find the file. When the
SILK_PLUGIN_DEBUG environment variable is non-empty, rwfilter
prints status messages to the standard error as it attempts to find and
open each of its plug-ins.
- --print-filenames
- Print the names of input files as they are read. This can be useful
feedback for a long-running rwfilter process.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When
this switch is not provided, rwfilter searches for the site
configuration file in the locations specified in the "FILES"
section.
- --threads=N
- Invoke rwfilter with N threads reading the input files. When
this switch is not provided, the value in the SILK_RWFILTER_THREADS
environment variable is used. If that variable is not set, rwfilter
runs with a single thread. Using multiple threads, performance of
rwfilter is greatly improved for queries that look at many files
but return few records. Preliminary testing has found that performance
peaks around four threads per CPU, but performance varies depending on the
type of query and the number of records returned.
- --version
- Print the version number and information about how SiLK was configured,
then exit the application.
In the following examples, the dollar sign
("$") represents the shell prompt. The text
after the dollar sign represents the command line. Lines have been wrapped for
improved readability, and the backslash
("\") is used to indicate a wrapped line.
The most basic filtering involves looking at specific traffic over
a specific time. For example:
$ rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--proto=6 --pass-destination=tcp-in.rw
creates a file, tcp-in.rw, containing all incoming
TCP traffic on February 19, 2003. The --start-date and
--end-date switches select which files to examine. The --proto
switch partitions the flow records into a pass stream (records whose
protocol is 6---that is, TCP) and a fail stream (all other records).
The --pass-destination switch (often shortened to --pass)
tells rwfilter to write the records that pass the --proto test
to the file tcp-in.rw.
The tcp-in.rw file contains SiLK Flow data in a binary
format. To examine the contents, use the command
rwcut(1). This query only selects incoming traffic
because the silk.conf(5) configuration file at most
sites tells rwfilter to look at incoming traffic unless an explicit
--type switch is given.
The following query gets all TCP traffic (for the default class)
for February 19, 2003.
$ rwfilter --type=all --start-date=2003/02/19 \
--proto=6 --pass-destination=alltcp.rw
Note the addition of --type=all. This query also relies on
the default behavior of --start-date to consider a full day's worth
of data when no hour is specified.
The above query gets all traffic for the default class. If your
silk.conf file has a single class, that query captures all of it. For
silk.conf files that specify multiple classes, the following gets all
TCP traffic for February 19, 2003:
$ rwfilter --flowtypes=all/all --start-date=2003/02/19 \
--proto=6 --pass-destination=alltcp.rw
To get all non-TCP traffic, there are two approaches.
rwfilter does not supply a way to choose a negated set of protocols,
but you can choose all protocols other than TCP:
$ rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--proto=0-5,7-255 --pass-destination=non-tcp.rw
The other approach is to use the --fail-destination switch
(often shortened to --fail), which receives the records that failed
one or more of the partitioning tests:
$ rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--proto=6 --fail-destination=non-tcp.rw
To print information about the number of flow records that pass a
filter, use --print-volume-statistics. This can be combined with
other output switches.
$ rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--proto=6 --print-volume-stat --pass-destination=tcp-in.rw
     |        Recs|     Packets|        Bytes|  Files|
Total|      515359|     2722887|   1343819719|    180|
 Pass|      512071|     2706571|   1342851708|       |
 Fail|        3288|       16316|       968011|       |
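When the volume statistics are consumed by a script, the table can be
read back with a few lines of Python. The sketch below assumes the
output has the layout shown above (a header row followed by Total,
Pass, and Fail rows, with "|" after every column); the function names
are hypothetical. It also checks that the Pass and Fail rows sum to
the Total row.

```python
# Sketch: read the --print-volume-statistics table shown above.

def parse_volume_statistics(text):
    """Return {row label: {column name: int}} for the statistics table."""
    rows = {}
    header = None
    for line in text.splitlines():
        if "|" not in line:
            continue
        cells = [cell.strip() for cell in line.split("|")]
        cells = cells[:-1]              # text after the final '|' is empty
        label, values = cells[0], cells[1:]
        if header is None:
            header = values             # first table line names the columns
            continue
        # Empty cells (Files on the Pass and Fail rows) are left out.
        rows[label] = {name: int(value)
                       for name, value in zip(header, values) if value}
    return rows


def totals_are_consistent(rows):
    """True if Pass + Fail equals Total for records, packets, and bytes."""
    return all(rows["Pass"][col] + rows["Fail"][col] == rows["Total"][col]
               for col in ("Recs", "Packets", "Bytes"))
```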
If you want to see the number of records in a file produced by
rwfilter, or to remind yourself how a file was created, use
rwfileinfo(1):
$ rwfileinfo tcp-in.rw
tcp-in.rw:
format(id) FT_RWGENERIC(0x16)
version 16
byte-order littleEndian
compression(id) lzo1x(2)
header-length 208
record-length 52
record-version 5
silk-version 2.4.0
count-records 512071
file-size 8576160
command-lines
1 rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--proto=6 --print-volume-stat --pass-destination=tcp-in.rw
Once a file is written, rwfilter can process the file
again. Traffic on port 25 is most likely email (SMTP) traffic. To split the
email traffic from the other traffic, use:
$ rwfilter --aport=25 --pass=mail.rw --fail=not-mail.rw tcp-in.rw
This command puts traffic where the source or destination port was
25 into the file mail.rw, and all other traffic into the file
not-mail.rw. The --fail-destination is an effective way to
reverse the sense of a test. For example, to remove traffic on port 80 from
the not-mail.rw file, run the command:
$ rwfilter --aport=80 --fail=not-mail-web.rw not-mail.rw
To verify that the not-mail-web.rw file does not contain
any traffic on ports 25 or 80, you can use the --print-statistics
switch and see that 0 records pass:
$ rwfilter --aport=25,80 --print-stat not-mail-web.rw
Files 1. Read 54641. Pass 0. Fail 54641.
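A script that runs such a verification can check the counts
programmatically. The following Python sketch parses a line in the
format shown above; it assumes each count is a keyword followed by an
integer and a period, and the function name parse_print_statistics()
is hypothetical.

```python
# Sketch: read the one-line summary printed by --print-statistics.
import re

_COUNT = re.compile(r"(Files|Read|Pass|Fail)\s+(\d+)\.")


def parse_print_statistics(line):
    """Return the counts as a dict with lowercase keys."""
    return {name.lower(): int(value) for name, value in _COUNT.findall(line)}
```

For the output above, the "pass" entry is 0 and the "read" and "fail"
entries are equal, which confirms that no record matched the test.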
The file maintains a history of the commands that created it:
$ rwfileinfo not-mail-web.rw
not-mail-web.rw:
format(id) FT_RWGENERIC(0x16)
version 16
byte-order littleEndian
compression(id) lzo1x(2)
header-length 364
record-length 52
record-version 5
silk-version 2.4.0
count-records 54641
file-size 762875
command-lines
1 rwfilter --start-date=2003/02/19:00 --end-date=2003/02/19:23 \
--proto=6 --print-volume-stat --pass-destination=tcp-in.rw
2 rwfilter --aport=25 --pass=mail.rw --fail=not-mail.rw \
tcp-in.rw
3 rwfilter --aport=80 --fail=not-mail-web.rw not-mail.rw
The following finds all outgoing traffic from February 19, 2003,
going to an external email server. Traffic going to a server contacts that
server on its well-known port, and the flow record's destination port should
hold that well-known port:
$ rwfilter --type=out --start-date=2003/02/19 --print-volume-stat \
--dport=25 --proto=6
To limit the results to completed connections, select flow records
that contain at least three packets by using the --packets switch with
an open-ended range:
$ rwfilter --type=out --start-date=2003/02/19 --print-volume-stat \
--dport=25 --proto=6 --packets=3-
To limit the search to a particular internal CIDR block,
10.1.2.0/24, there are three different IP-partitioning switches you can use.
The final approach uses rwsetbuild(1) to create an
IPset file from textual input.
$ rwfilter --type=out --start-date=2003/02/19 --print-volume-stat \
--dport=25 --proto=6 --packets=3- --scidr=10.1.2.0/24
$ rwfilter --type=out --start-date=2003/02/19 --print-volume-stat \
--dport=25 --proto=6 --packets=3- --saddress=10.1.2.x
$ echo "10.1.2.0/24" | rwsetbuild > my-set.set
$ rwfilter --type=out --start-date=2003/02/19 --print-volume-stat \
--dport=25 --proto=6 --packets=3- --sipset=my-set.set
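The CIDR block 10.1.2.0/24 and the wildcard 10.1.2.x used above
describe the same 256 addresses. The Python sketch below checks this
with the standard ipaddress module. It models only the single-"x"
octet form of IP_WILDCARD that appears in the example, not the full
wildcard syntax, and the function names are hypothetical.

```python
# Sketch: compare the CIDR and wildcard forms used in the examples above.
import ipaddress


def cidr_matches(cidr, address):
    """True if address falls inside the CIDR block."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


def wildcard_matches(wildcard, address):
    """True if address matches a dotted wildcard where 'x' is any octet."""
    octets = str(ipaddress.IPv4Address(address)).split(".")
    parts = wildcard.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four octets, got {wildcard!r}")
    return all(part == "x" or int(part) == int(octet)
               for part, octet in zip(parts, octets))
```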
rwfilter does not have to output its records to a file;
instead, the output from rwfilter can be piped into another SiLK
tool. You must still use the --pass-destination switch (or
--fail-destination or --all-destination switch), but by
providing the argument of "stdout" or
"-" to the switch you tell rwfilter
to write its output to the standard output.
For example, to get the IPs of the external email servers that the
monitored network contacted, pipe the rwfilter output into
rwset(1), and tell rwset to store the
destination addresses:
$ rwfilter --type=out --start-date=2003/02/19 --dport=25 \
--proto=6 --packets=3- --scidr=10.1.2.0/24 --pass=stdout \
| rwset --dip-file=external-mail-servers.set
rwfilter can also pipe its output as input to another
rwfilter command, which allows them to be chained together.
rwfilter does not read from the standard input by default; you must
explicitly give "stdin" or
"-" as the stream to read:
$ rwfilter --type=out,outweb --start-date=2003/02/19 \
--scidr=10.1.2.0/24 --pass=stdout \
| rwfilter --proto=17 --pass=udp.rw --fail=stdout stdin \
| rwfilter --proto=6 --pass=stdout --fail=non-tcp-udp.rw stdin \
| rwfilter --aport=25 --pass=mail.rw --fail=stdout stdin \
| rwfilter --aport=80,443 --pass=web.rw \
--fail=tcp-non-web-mail.rw stdin
This chain of commands looks at outgoing traffic on February 19,
2003, originating from the internal net-block 10.1.2.0/24, and creates the
following files:
- udp.rw
- Outgoing UDP traffic
- non-tcp-udp.rw
- Outgoing traffic that is neither TCP nor UDP
- mail.rw
- Outgoing TCP traffic on port 25, most of which is probably email (SMTP).
Since the query looks at outgoing traffic and the --aport switch
was used, this file represents email going from the internal 10.1.2.0/24
to external mail servers, and the responses from any internal mail servers
that exist in the 10.1.2.0/24 net-block to external clients.
- web.rw
- Outgoing TCP traffic on ports 80 and 443, most of which is probably web
traffic (HTTP,HTTPS). As with the mail.rw file, this file
represents queries to external web servers and responses from internal web
servers.
- tcp-non-web-mail.rw
- Outgoing TCP traffic other than that on ports 25, 80, and 443
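The order of the tests in the chain determines where each flow ends
up. The following Python function is a model of that decision for a
single flow that has already been selected by the first rwfilter
(outgoing, from 10.1.2.0/24). It is a sketch for reasoning about the
chain, not a replacement for it, and the function name classify() is
hypothetical.

```python
# Model of the five-stage rwfilter chain above for one selected flow.

def classify(protocol, sport, dport):
    """Return the name of the file the chain writes this flow to."""
    ports = {sport, dport}          # --aport tests either port
    if protocol == 17:
        return "udp.rw"             # passes --proto=17
    if protocol != 6:
        return "non-tcp-udp.rw"     # fails --proto=6
    if 25 in ports:
        return "mail.rw"            # passes --aport=25
    if ports & {80, 443}:
        return "web.rw"             # passes --aport=80,443
    return "tcp-non-web-mail.rw"    # fails every remaining test
```

Because the mail test runs before the web test, a TCP flow between
ports 25 and 80 is written to mail.rw and never reaches web.rw.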
Expert users can create even more complicated chains of
rwfilter commands using named pipes.
- SILK_RWFILTER_THREADS
- The number of threads to use while reading input files or files selected
from the data store.
- PYTHONPATH
- This environment variable is used by Python to locate modules. When
--python-file or --python-expr is specified, rwfilter
must load the Python files that comprise the PySiLK module, such as
silk/__init__.py. If this silk/ directory is located outside
Python's normal search path (for example, in the SiLK installation tree),
it may be necessary to set or modify the PYTHONPATH environment variable
to include the parent directory of silk/ so that Python can find
the PySiLK module.
- SILK_PYTHON_TRACEBACK
- When set, Python plug-ins output traceback information on Python errors to
the standard error.
- SILK_COUNTRY_CODES
- This environment variable allows the user to specify the country code
mapping file that the --scc and --dcc switches use. The
value may be a complete path or a file relative to the SILK_PATH. See the
"FILES" section for standard locations of this file.
- SILK_ADDRESS_TYPES
- This environment variable allows the user to specify the address type
mapping file that the --stype and --dtype switches use. The
value may be a complete path or a file relative to the SILK_PATH. See the
"FILES" section for standard locations of this file.
- SILK_CLOBBER
- The SiLK tools normally refuse to overwrite existing files. Setting
SILK_CLOBBER to a non-empty value removes this restriction.
- SILK_COMPRESSION_METHOD
- This environment variable is used as the value for
--compression-method when that switch is not provided. Since
SiLK 3.13.0.
- SILK_CONFIG_FILE
- This environment variable is used as the value for the
--site-config-file when that switch is not provided.
- SILK_DATA_ROOTDIR
- This environment variable specifies the root directory of the data repository.
This value overrides the compiled-in value, and rwfilter uses it
unless the --data-rootdir switch is specified. In addition,
rwfilter may use this value when searching for the SiLK site
configuration files. See the "FILES" section for details.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files and plug-ins, rwfilter may use
this environment variable. See the "FILES" section for
details.
- TZ
- When a SiLK installation is built to use the local timezone (to determine
if this is the case, check the "Timezone
support" value in the output from rwfilter --version),
the value of the TZ environment variable determines the timezone in which
rwfilter parses timestamps. If the TZ environment variable is not
set, the default timezone is used. Setting TZ to 0 or the empty string
causes timestamps to be parsed as UTC. The value of the TZ environment
variable is ignored when the SiLK installation uses UTC. For system
information on the TZ variable, see tzset(3) or
environ(7).
- SILK_PLUGIN_DEBUG
- When set to 1, rwfilter prints status messages to the standard
error as it attempts to find and open each of its plug-ins.
- SILK_LOGSTATS
- When set to a non-empty value, rwfilter treats the value as the
path to an external program to execute with information about this
rwfilter invocation. If the value in SILK_LOGSTATS does not contain
a slash or if it references a file that does not exist, is not a regular
file, or is not executable, the SILK_LOGSTATS value is silently ignored.
The arguments to the external program are:
- The application name, i.e., "rwfilter".
Note that "rwfilter" is always used as
this argument, regardless of the name of the executable.
- The version number of this command line, currently
"v0001".
- The start time of this invocation, as seconds since the UNIX epoch.
- The end time of this invocation, as seconds since the UNIX epoch.
- The number of data files opened for reading.
- The number of records read.
- The number of records written.
- A variable number of arguments that are the complete command line used to
invoke rwfilter, including the name of the executable.
- SILK_LOGSTATS_RWFILTER
- If set, this environment variable overrides the value specified in
SILK_LOGSTATS.
- SILK_LOGSTATS_DEBUG
- If the environment variable is set to a non-empty value, rwfilter
prints messages to the standard error about the SILK_LOGSTATS value being
used and either the reason why the value cannot be used or the arguments
to the external program being executed.
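The SILK_LOGSTATS argument list described above could be consumed by a small handler script. The following is a minimal sketch, not part of SiLK: the function name parse_logstats_args, the dictionary keys, and the choice to report on the standard error are all assumptions made for illustration. Only the order of the arguments and the "v0001" version string come from the description above.

```python
#!/usr/bin/env python3
"""Hypothetical SILK_LOGSTATS handler (illustrative only, not part of SiLK)."""
import sys
from datetime import datetime, timezone


def parse_logstats_args(args):
    """Split the handler's arguments (script name excluded) into named fields.

    The field names are invented for this sketch; only the argument
    order follows the SILK_LOGSTATS description.
    """
    if len(args) < 7:
        raise ValueError("expected at least 7 arguments, got %d" % len(args))
    app, version, start, end, files, read, written = args[:7]
    if version != "v0001":
        raise ValueError("unsupported argument version: %r" % version)
    return {
        "application": app,
        "version": version,
        # Start and end times arrive as seconds since the UNIX epoch.
        "start": datetime.fromtimestamp(float(start), tz=timezone.utc),
        "end": datetime.fromtimestamp(float(end), tz=timezone.utc),
        "files_read": int(files),
        "records_read": int(read),
        "records_written": int(written),
        # Everything after the seven fixed fields is the rwfilter command line.
        "command_line": list(args[7:]),
    }


def main(args):
    info = parse_logstats_args(args)
    elapsed = (info["end"] - info["start"]).total_seconds()
    # Report on standard error so that nothing is mixed into a data stream
    # that rwfilter may be writing to the standard output.
    print(
        "%s: %d files, %d records read, %d written in %.0f s"
        % (info["application"], info["files_read"], info["records_read"],
           info["records_written"], elapsed),
        file=sys.stderr,
    )


# Do nothing when the script is run without arguments.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```

To try such a handler, one would save it as an executable file and set SILK_LOGSTATS to its path; the path must contain a slash, or the value is silently ignored. Setting SILK_LOGSTATS_DEBUG to a non-empty value helps confirm that rwfilter is actually invoking it.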
- ${SILK_ADDRESS_TYPES}
- ${SILK_PATH}/share/silk/address_types.pmap
- ${SILK_PATH}/share/address_types.pmap
- /usr/local/share/silk/address_types.pmap
- /usr/local/share/address_types.pmap
- Possible locations for the address types mapping file required by the
--stype and --dtype switches.
- ${SILK_CONFIG_FILE}
- ROOT_DIRECTORY/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file, which are checked
when the --site-config-file switch is not provided. Here,
ROOT_DIRECTORY/ is the directory rwfilter is using as the
root of the data repository.
- ${SILK_COUNTRY_CODES}
- ${SILK_PATH}/share/silk/country_codes.pmap
- ${SILK_PATH}/share/country_codes.pmap
- /usr/local/share/silk/country_codes.pmap
- /usr/local/share/country_codes.pmap
- Possible locations for the country code mapping file required by the
--scc and --dcc switches.
- ${SILK_DATA_ROOTDIR}/
- /data/
- Locations for the root directory of the data repository when the
--data-rootdir switch is not specified.
- ${SILK_PATH}/lib64/silk/
- ${SILK_PATH}/lib64/
- ${SILK_PATH}/lib/silk/
- ${SILK_PATH}/lib/
- /usr/local/lib64/silk/
- /usr/local/lib64/
- /usr/local/lib/silk/
- /usr/local/lib/
- Directories that rwfilter checks when attempting to load a
plug-in.
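The location lists above for the mapping files can be sketched as a simple candidate search. This is an illustration only: the helper names candidate_paths and first_existing are hypothetical, and the sketch assumes that the locations are tried in the order listed and that the first existing file wins, which the text above does not state explicitly. It also omits the ROOT_DIRECTORY/silk.conf entry that applies only to the site configuration file.

```python
import os
import posixpath


def candidate_paths(env_var, filename, environ=None):
    """Return candidate locations for a support file, in the listed order.

    env_var is the variable naming the file directly (for example
    SILK_COUNTRY_CODES); filename is the default base name.
    """
    environ = os.environ if environ is None else environ
    candidates = []
    explicit = environ.get(env_var)
    if explicit:
        candidates.append(explicit)
    silk_path = environ.get("SILK_PATH")
    if silk_path:
        candidates.append(posixpath.join(silk_path, "share", "silk", filename))
        candidates.append(posixpath.join(silk_path, "share", filename))
    candidates.append(posixpath.join("/usr/local/share/silk", filename))
    candidates.append(posixpath.join("/usr/local/share", filename))
    return candidates


def first_existing(paths):
    """Return the first path that names an existing regular file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    for path in candidate_paths("SILK_COUNTRY_CODES", "country_codes.pmap"):
        print(path)
```

Printing the candidates in this way can help when diagnosing why --scc, --dcc, --stype, or --dtype cannot find a mapping file, though the authoritative answer is whatever rwfilter itself reports.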
rwfilter is the most commonly used application in the suite. It provides
access to the data files and performs all the basic queries.
rwfilter supports a variety of I/O options. In addition to
reading from the data store, rwfilter invocations can be chained together
with named pipes to write results to multiple files simultaneously. An
introduction to named pipes is outside the scope of this document.
Two often underused options are --dry-run and
--print-statistics. --dry-run performs a sanity check on the
arguments and is especially useful for verifying that a complicated
command line is acceptable. --print-statistics used without
--pass-destination or --fail-destination simply prints
aggregate statistics to the standard error on a single line; it can be
used to make a quick pass through the data to get aggregate counts before
going deeper into the phenomenon being investigated.
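The workflow described above (check the arguments first, then make a quick statistics pass) can be sketched as follows. This is a hedged illustration: the helper name statistics_pass, the dates, and the port are made up for the example, and the code only builds and prints the command lines rather than executing rwfilter.

```python
import shlex


def statistics_pass(selection, partitioning, dry_run=False):
    """Build an rwfilter argument list that only prints aggregate statistics.

    With dry_run=True, --dry-run is appended so that rwfilter checks the
    arguments instead of reading data.
    """
    argv = ["rwfilter", *selection, *partitioning, "--print-statistics"]
    if dry_run:
        argv.append("--dry-run")
    return argv


# Example values only; substitute the selection and partitioning
# switches appropriate for the local data repository.
selection = ["--type=all",
             "--start-date=2024/01/15:00",
             "--end-date=2024/01/15:23"]
partitioning = ["--dport=53"]

check = statistics_pass(selection, partitioning, dry_run=True)
count = statistics_pass(selection, partitioning)

print(shlex.join(check))   # step 1: verify the arguments
print(shlex.join(count))   # step 2: quick pass for aggregate counts
```

On a host where rwfilter is installed, either list could be passed to subprocess.run() to execute the command; the statistics are written to the standard error.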
--print-filename can be used as a progress meter; during
long jobs, it shows which file is currently being read by rwfilter.
--print-filename does not provide meaningful feedback with piped
input.
Filters are applied in the order given on the command line. It is
best to put the filters that eliminate the most records first.
The rwfilter command line is written into the header of the
output file(s). You may use the rwfileinfo(1) command
to see this information.
rwcut(1), rwfglob(1),
rwfileinfo(1), rwset(1),
rwtuc(1), rwsetbuild(1),
rwsiteinfo(1), rwpmapbuild(1),
addrtype(3), ccfilter(3),
flowrate(3), ipafilter(3),
pmapfilter(3), pysilk(3),
silkpython(3), silk-plugin(3),
silk.conf(5), sensor.conf(5),
silk(7), rwflowpack(8),
yaf(1), applabel(1),
zlib(3), dlopen(3),
tzset(3), environ(7),
Analysts' Handbook: Using SiLK for Network Traffic Analysis