rwaggbagbuild(1)    SiLK Tool Suite    rwaggbagbuild(1)
rwaggbagbuild - Create a binary aggregate bag from non-flow data
rwaggbagbuild [--fields=FIELDS]
[--constant-field=FIELD=VALUE [--constant-field=FIELD=VALUE...]]
[--column-separator=CHAR] [--no-titles]
[--bad-input-lines=FILE] [--verbose] [--stop-on-error]
[--note-add=TEXT] [--note-file-add=FILE]
[--invocation-strip] [--compression-method=COMP_METHOD]
[--output-path=PATH] [--site-config-file=FILENAME]
{[--xargs] | [--xargs=FILENAME] | [FILE [FILE...]]}
rwaggbagbuild --help
rwaggbagbuild --version
rwaggbagbuild builds a binary Aggregate Bag file by reading one or more
files containing textual input. To build an Aggregate Bag from SiLK Flow
records, use rwaggbag(1).
An Aggregate Bag is a binary file that maps a key to a
counter, where the key and the counter are both composed of one or more
fields. For example, an Aggregate Bag could contain the sum of the packet
count and the sum of the byte count for each unique source IP and source
port pair.
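As a rough mental model only (this is not the on-disk format or any SiLK API), an Aggregate Bag behaves like a mapping from a key tuple to a counter tuple, where the counters are summed when a key repeats. The Python sketch below mirrors the source IP and source port example from the preceding paragraph; the helper name add_entry and the sample values are made up for illustration.

```python
from collections import defaultdict

# Illustrative model only: key = (source IP, source port),
# counter = [sum of packets, sum of bytes].
aggbag = defaultdict(lambda: [0, 0])

def add_entry(bag, sip, sport, packets, nbytes):
    """Add one observation; counters for a repeated key are summed."""
    counter = bag[(sip, sport)]
    counter[0] += packets
    counter[1] += nbytes

# Made-up sample data, not taken from any real capture.
add_entry(aggbag, "10.0.0.1", 80, 10, 1500)
add_entry(aggbag, "10.0.0.1", 80, 5, 700)
add_entry(aggbag, "10.0.0.2", 443, 3, 200)

for key in sorted(aggbag):
    print(key, aggbag[key])
```

Both the key and the counter hold more than one field here, which is what distinguishes an Aggregate Bag from a mapping of a single key to a single count.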
rwaggbagbuild reads its input from the files named on the
command line or from the standard input when no file names are specified,
when --xargs is not present, and when the standard input is not a
terminal. To read the standard input in addition to the named files, use
"-" or
"stdin" as a file name. When the
--xargs switch is provided, rwaggbagbuild reads the names of
the files to process from the named text file or from the standard input if
no file name argument is provided to the switch. The input to --xargs
must contain one file name per line.
The new Aggregate Bag file is written to the location specified by
the --output-path switch. If it is not provided, output is sent to
the standard output when it is not connected to a terminal.
The Aggregate Bag file must have at least one field that it
considers a key field and at least one field that it considers a counter
field. See the description of the --fields switch.
In general (and as detailed below), each line of the text input
files becomes one entry in the Aggregate Bag file. It is also possible to
specify that each entry in the Aggregate Bag file contains additional
fields, each with a specific value. These fields are specified by the
--constant-field switch, whose argument is a field name, an equals
sign ("="), and a textual representation
of a value. The named field becomes one of the key or counter fields in the
Aggregate Bag file, and that field is given the specified value for each
entry that is read from an input file. See the --fields switch in the
"OPTIONS" section for the names of the fields and the acceptable
forms of the textual input for each field.
The remainder of this section details how rwaggbagbuild
processes each text input file to create an Aggregate Bag file.
When the --fields switch is specified, its argument
specifies the key and counter fields that the new Aggregate Bag file is to
contain. If --fields is not specified, the first line of the first
input file is expected to contain field names, and those names determine the
Aggregate Bag's key and counter. A field name of
"ignore" causes rwaggbagbuild to
ignore the values in that field when parsing the input.
The textual input is processed one line at a time. Comments begin
with a "#" character and continue to the
end of the line; they are stripped from each line. After removing the
comments, any line that is blank or contains only whitespace is ignored.
All other lines must contain valid input, which is a set of fields
separated by a delimiter. The default delimiter is the vertical bar
("|") and may be changed with the
--column-separator switch. Whitespace around a delimiter is allowed;
however, using space or tab as the separator causes each space or tab
character to be treated as a field delimiter. The newline character is not a
valid delimiter character since it is used to denote records, and
"#" is not a valid delimiter since it
begins a comment.
The first line of each input file may contain delimiter-separated
field names denoting in which order the fields appear in this input file. As
mentioned above, when the --fields switch is not given, the first
line of the first file determines the Aggregate Bag's key and counter. To
tell rwaggbagbuild to treat the first line of each file as field
values to be parsed, specify the --no-titles switch.
Every other line must contain delimiter-separated field values. A
delimiter may follow the final field on a line. rwaggbagbuild ignores
lines that contain either too few or too many fields.
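The line-handling rules above can be summarized in a short Python sketch. It is a simplified model of the documented behavior, not rwaggbagbuild's actual implementation: the function name parse_line is hypothetical, title-line detection is left out, and field values are returned as strings without validation.

```python
def parse_line(line, num_fields, sep="|"):
    """Return a list of field strings, or None if the line is skipped.

    Models the documented rules: strip '#' comments, skip blank lines,
    split on the separator, allow one trailing separator, and ignore
    lines with too few or too many fields.
    """
    text = line.split("#", 1)[0]      # drop the comment, if any
    if not text.strip():
        return None                   # blank or whitespace only
    text = text.rstrip("\r\n")
    fields = text.split(sep)
    # A delimiter may follow the final field on a line.
    if len(fields) == num_fields + 1 and not fields[-1].strip():
        fields.pop()
    if len(fields) != num_fields:
        return None                   # wrong field count: line is ignored
    # Whitespace around the separator is dropped unless the separator
    # itself is a space or a tab.
    if sep not in (" ", "\t"):
        fields = [f.strip() for f in fields]
    return fields

print(parse_line("10.245.15.175|   80|  127|  12862|\n", 4))
print(parse_line("# a comment-only line\n", 4))
```

Returning None for both skipped and malformed lines keeps the sketch short; the real tool distinguishes them when --verbose, --stop-on-error, or --bad-input-lines is in effect.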
See the description of the --fields switch in the
"OPTIONS" section for the names of the fields and the acceptable
forms of the textual input for each field.
Option names may be abbreviated if the abbreviation is unique or is an exact
match for an option. A parameter to an option may be specified as
--arg=param or --arg param, though the
first form is required for options that take optional parameters.
- --fields=FIELDS
- Specify the fields in the input files. FIELDS is a comma-separated
list of field names. Field names are case-insensitive, and a name may be
abbreviated to the shortest unique prefix. Other than the
"ignore" field, a field name may not be
specified more than once. The Aggregate Bag file must have at least one
key field and at least one counter field.
The names of the fields that are considered key fields, their
descriptions, and the format of the input that each expects are:
- ignore
- field that rwaggbagbuild is to skip
- sIPv4
- source IP address, IPv4 only; either the canonical dotted-quad format or
an integer from 0 to 4294967295 inclusive
- dIPv4
- destination IP address, IPv4 only; uses the same format as
"sIPv4"
- nhIPv4
- next hop IP address, IPv4 only; uses the same format as
"sIPv4"
- any-IPv4
- a generic IPv4 address; uses the same format as
"sIPv4"
- sIPv6
- source IP address, IPv6 only; the canonical hex-encoded format for IPv6
addresses
- dIPv6
- destination IP address, IPv6 only; uses the same format as
"sIPv6"
- nhIPv6
- next hop IP address, IPv6 only; uses the same format as
"sIPv6"
- any-IPv6
- a generic IPv6 address; uses the same format as
"sIPv6"
- sPort
- source port; an integer from 0 to 65535 inclusive
- dPort
- destination port; an integer from 0 to 65535 inclusive
- any-port
- a generic port; an integer from 0 to 65535 inclusive
- protocol
- IP protocol; an integer from 0 to 255 inclusive
- packets
- packet count; an integer from 1 to 4294967295 inclusive
- bytes
- byte count; an integer from 1 to 4294967295 inclusive
- flags
- bit-wise OR of TCP flags over all packets; a string containing
"F", "S", "R", "P", "A", "U", "E", "C" in upper- or lowercase
- initialFlags
- TCP flags on the first packet; uses the same form as
"flags"
- sessionFlags
- bit-wise OR of TCP flags on the second through final packet; uses the same
form as "flags"
- sTime
- starting time in seconds; uses the form
"YYYY/MM/DD[:hh[:mm[:ss[.sss]]]]" (any
milliseconds value is dropped). A "T"
may be used in place of ":" to separate
the day and hour fields. A floating point value between 536870912 and
2147483647 is also allowed and is treated as seconds since the UNIX
epoch.
- eTime
- ending time in seconds; uses the same format as
"sTime"
- any-time
- a generic time in seconds; uses the same format as
"sTime"
- duration
- duration of flow; a floating point value from 0.0 to 4294967.295
- sensor
- sensor name or ID at the collection point; a string as given in
silk.conf(5)
- class
- class at collection point; a string as given in silk.conf
- type
- type at collection point; a string as given in silk.conf
- input
- router SNMP ingress interface or vlanId; an integer from 0 to 65535
- output
- router SNMP egress interface or postVlanId; an integer from 0 to
65535
- any-snmp
- a generic SNMP value; an integer from 0 to 65535
- attribute
- flow attributes set by the flow generator:
- "S"
- all the packets in this flow record are exactly the same size
- "F"
- flow generator saw additional packets in this flow following a packet with
a FIN flag (excluding ACK packets)
- "T"
- flow generator prematurely created a record for a long-running connection
due to a timeout or a byte-count threshold
- "C"
- flow generator created a record as a continuation of a previous record for
a connection that exceeded a timeout or byte-count threshold
- application
- guess as to the content of the flow; as an integer from 0 to 65535
- icmpType
- ICMP type; an integer from 0 to 255 inclusive
- icmpCode
- ICMP code; an integer from 0 to 255 inclusive
- scc
- the country code of the source; accepts a two character string to use as
the country of the source IP. The code is not checked for validity
against the country_codes.pmap file. The code must be ASCII and it
may contain two letters, a letter followed by a number, or the string
"--". Since SiLK 3.19.0.
- dcc
- the country code of the destination. See
"scc". Since SiLK 3.19.0.
- any-cc
- a generic country code. See "scc".
Since SiLK 3.19.0.
- custom-key
- a generic key; an integer from 0 to 4294967295 inclusive
The names and descriptions of the fields that are considered
counter fields are listed next. For each, the type of input is an unsigned
64-bit number; that is, an integer from 0 to 18446744073709551615.
- records
- count of records that match the key
- sum-packets
- sum of packet counts
- sum-bytes
- sum of byte counts
- sum-duration
- sum of duration values
- custom-counter
- a generic counter
- --constant-field=FIELD=VALUE
- For each entry read from the input file(s), insert a field named
FIELD and set its value to VALUE. VALUE is a textual
representation of the field's value as described in the description of the
--fields switch above. When FIELD is a counter field and the
same key appears multiple times in the input, VALUE is added to the
counter multiple times. If a field named FIELD appears in an input
file, its value from that file is ignored. Specify the
--constant-field switch multiple times to insert multiple
fields.
- --column-separator=CHAR
- When reading textual input, use the character CHAR as the delimiter
between columns (fields) in the input. The default column separator is the
vertical bar ("|").
rwaggbagbuild normally ignores whitespace (space and tab) around
the column separator; however, using space or tab as the separator causes
each space or tab character to be treated as a field delimiter. The
newline character is not a valid delimiter character since it is used to
denote records, and "#" is not a valid
delimiter since it begins a comment.
- --bad-input-lines=FILEPATH
- When parsing textual input, copy any lines that cannot be parsed to
FILEPATH. The strings "stdout"
and "stderr" may be used for the
standard output and standard error, respectively. Each bad line is
prepended by the name of the source input file, a colon, the line number,
and a colon. On exit, rwaggbagbuild removes FILEPATH if all
input lines were successfully parsed.
- --verbose
- When a textual input line fails to parse, print a message to the standard
error describing the problem. When this switch is not specified, parsing
failures are not reported. rwaggbagbuild continues to process the
input after printing the message. To stop processing when a parsing error
occurs, use --stop-on-error.
- --stop-on-error
- When a textual input line fails to parse, print a message to the standard
error describing the problem and exit the program. When this occurs, the
output file contains any records successfully created prior to reading the
bad input line. The default behavior of rwaggbagbuild is to
silently ignore parsing errors. To report parsing errors and continue
processing the input, use --verbose.
- --no-titles
- Parse the first line of the input as field values. Normally when the
--fields switch is specified, rwaggbagbuild examines the
first line to determine if the line contains the names (titles) of fields
and skips the line if it does. rwaggbagbuild exits with an error
when --no-titles is given but --fields is not.
- --note-add=TEXT
- Add the specified TEXT to the header of the output file as an
annotation. This switch may be repeated to add multiple annotations to a
file. To view the annotations, use the rwfileinfo(1)
tool.
- --note-file-add=FILENAME
- Open FILENAME and add the contents of that file to the header of
the output file as an annotation. This switch may be repeated to add
multiple annotations. Currently the application makes no effort to ensure
that FILENAME contains text; be careful that you do not attempt to
add a SiLK data file as an annotation.
- --invocation-strip
- Do not record the command used to create the Aggregate Bag file in the
output. When this switch is not given, the invocation is written to the
file's header, and the invocation may be viewed with
rwfileinfo(1).
- --compression-method=COMP_METHOD
- Specify the compression library to use when writing output files. If this
switch is not given, the value in the SILK_COMPRESSION_METHOD environment
variable is used if the value names an available compression method. When
no compression method is specified, output to the standard output or to
named pipes is not compressed, and output to files is compressed using the
default chosen when SiLK was compiled. The valid values for
COMP_METHOD are determined by which external libraries were found
when SiLK was compiled. To see the available compression methods and the
default method, use the --help or --version switch. SiLK can
support the following COMP_METHOD values when the required
libraries are available.
- none
- Do not compress the output using an external library.
- zlib
- Use the zlib(3) library for compressing the output,
and always compress the output regardless of the destination. Using zlib
produces the smallest output files at the cost of speed.
- lzo1x
- Use the lzo1x algorithm from the LZO real time compression library
for compression, and always compress the output regardless of the
destination. This compression provides good compression with less memory
and CPU overhead.
- snappy
- Use the snappy library for compression, and always compress the
output regardless of the destination. This compression provides good
compression with less memory and CPU overhead.
- best
- Use lzo1x if available, otherwise use snappy if available, otherwise use
zlib if available. Only compress the output when writing to a file.
- --output-path=PATH
- Write the binary Aggregate Bag output to PATH, where PATH is
a filename, a named pipe, the keyword
"stderr" to write the output to the
standard error, or the keyword "stdout"
or "-" to write the output to the
standard output. If PATH names an existing file,
rwaggbagbuild exits with an error unless the SILK_CLOBBER
environment variable is set, in which case PATH is overwritten. If
this switch is not given, the output is written to the standard output.
Attempting to write the binary output to a terminal causes
rwaggbagbuild to exit with an error.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When
this switch is not provided, rwaggbagbuild searches for the site
configuration file in the locations specified in the "FILES"
section.
- --xargs
- --xargs=FILENAME
- Read the names of the input files from FILENAME or from the
standard input if FILENAME is not provided. The input is expected
to have one filename per line. rwaggbagbuild opens each named file
in turn and reads text from it as if the filenames had been listed on the
command line.
- --help
- Print the available options and exit.
- --version
- Print the version number and information about how SiLK was configured,
then exit the application.
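The time form accepted by the "sTime", "eTime", and "any-time" fields (see --fields above) can be illustrated with a small Python parser. This is a sketch under stated assumptions, not SiLK's parser: the function name parse_time is hypothetical, the timestamp is assumed to be UTC, omitted hour, minute, and second parts are assumed to be zero, and one- or two-digit month, day, and time parts are accepted.

```python
import re
from datetime import datetime, timezone

# YYYY/MM/DD[:hh[:mm[:ss[.sss]]]], with "T" allowed between day and hour.
_TIME_RE = re.compile(
    r"^(\d{4})/(\d{1,2})/(\d{1,2})"
    r"(?:[:T](\d{1,2})(?::(\d{1,2})(?::(\d{1,2})(?:\.\d+)?)?)?)?$"
)

def parse_time(text):
    """Return whole seconds since the UNIX epoch for a documented time form."""
    text = text.strip()
    m = _TIME_RE.match(text)
    if m:
        year, month, day = (int(g) for g in m.group(1, 2, 3))
        hour, minute, second = (int(g) if g else 0 for g in m.group(4, 5, 6))
        # Assumption: the timestamp is interpreted as UTC.
        dt = datetime(year, month, day, hour, minute, second,
                      tzinfo=timezone.utc)
        return int(dt.timestamp())
    # Otherwise expect epoch seconds within the documented range.
    value = float(text)
    if not 536870912 <= value <= 2147483647:
        raise ValueError(f"epoch seconds out of range: {text!r}")
    return int(value)  # any fractional part is dropped

print(parse_time("2009/02/13:23:31:30"))
print(parse_time("2009/02/13T23:31:30.250"))
```

The milliseconds are matched but never captured, which reflects the note that any milliseconds value is dropped.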
In the following examples, the dollar sign
("$") represents the shell prompt. The text
after the dollar sign represents the command line. Lines have been wrapped for
improved readability, and the backslash
("\") is used to indicate a wrapped line.
Assume the following textual data in the file rec.txt:
            dIP|dPort|packets|  bytes|
  10.245.15.175|   80|    127|  12862|
192.168.251.186|29222|    131| 351213|
 10.247.186.130|   80|    596|  38941|
192.168.239.224|29362|    600| 404478|
192.168.215.219|   80|    400|  32375|
  10.255.252.19|28925|    404|1052274|
192.168.255.249|   80|    112|   7412|
   10.208.7.238|29246|    109| 112977|
192.168.254.127|   80|    111|   9759|
  10.218.34.108|29700|    114| 461845|
To create an Aggregate Bag file from this data, provide the
--fields switch with the names used by the Aggregate Bag tools:
$ rwaggbagbuild --fields=dipv4,dport,sum-packets,sum-bytes \
--output-path=ab.aggbag rec.txt
Use the rwaggbagcat(1) tool to view it:
$ rwaggbagcat ab.aggbag
dIPv4|dPort| sum-packets| sum-bytes|
10.208.7.238|29246| 109| 112977|
10.218.34.108|29700| 114| 461845|
10.245.15.175| 80| 127| 12862|
10.247.186.130| 80| 596| 38941|
10.255.252.19|28925| 404| 1052274|
192.168.215.219| 80| 400| 32375|
192.168.239.224|29362| 600| 404478|
192.168.251.186|29222| 131| 351213|
192.168.254.127| 80| 111| 9759|
192.168.255.249| 80| 112| 7412|
To create an Aggregate Bag that counts the number of times each
destination port appears, ignore all fields except the
"dPort" field and use
--constant-field to add a new counter field:
$ rwaggbagbuild --fields=ignore,dport,ignore,ignore \
--constant-field=record=1 rec.txt \
| rwaggbagcat
dPort| records|
80| 5|
28925| 1|
29222| 1|
29246| 1|
29362| 1|
29700| 1|
Alternatively, use rwaggbagtool(1) to get the
same information from the ab.aggbag file created above:
$ rwaggbagtool --select-fields=dport \
--insert-field=record=1 ab.aggbag \
| rwaggbagcat
dPort| records|
80| 5|
28925| 1|
29222| 1|
29246| 1|
29362| 1|
29700| 1|
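The record counts in the two examples above follow from the rule that a constant counter value is added once for every input line that shares a key. The Python sketch below reproduces that arithmetic for the dPort column of rec.txt. It only models the counting and does not read or write Aggregate Bag files; the printed layout is illustrative and is not rwaggbagcat's exact formatting.

```python
from collections import Counter

# dPort column of rec.txt, in file order.
dports = [80, 29222, 80, 29362, 80, 28925, 80, 29246, 80, 29700]

records = Counter()
for port in dports:
    # Models --constant-field=records=1: the constant counter value 1
    # is added each time the same key (dPort) appears.
    records[port] += 1

for port in sorted(records):
    print(f"{port}|{records[port]}|")
```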
- SILK_CLOBBER
- The SiLK tools normally refuse to overwrite existing files. Setting
SILK_CLOBBER to a non-empty value removes this restriction.
- SILK_COMPRESSION_METHOD
- This environment variable is used as the value for
--compression-method when that switch is not provided.
- SILK_CONFIG_FILE
- This environment variable is used as the value for the
--site-config-file switch when that switch is not provided.
- SILK_DATA_ROOTDIR
- This environment variable specifies the root directory of the data repository.
As described in the "FILES" section, rwaggbagbuild may
use this environment variable when searching for the SiLK site
configuration file.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files, rwaggbagbuild may use this
environment variable. See the "FILES" section for details.
- ${SILK_CONFIG_FILE}
- ${SILK_DATA_ROOTDIR}/silk.conf
- /data/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file which are checked
when the --site-config-file switch is not provided.
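The list of locations above can be expressed as a small Python helper. This is a sketch, not SiLK's lookup code: it assumes the locations are tried in the order listed, that an unset or empty environment variable simply removes its entries, and that the first existing file wins. The function names are hypothetical.

```python
import os

def candidate_site_config_paths(env=None):
    """List the documented silk.conf locations, in the order given above."""
    env = os.environ if env is None else env
    paths = []
    if env.get("SILK_CONFIG_FILE"):
        paths.append(env["SILK_CONFIG_FILE"])
    if env.get("SILK_DATA_ROOTDIR"):
        paths.append(f"{env['SILK_DATA_ROOTDIR']}/silk.conf")
    paths.append("/data/silk.conf")
    if env.get("SILK_PATH"):
        root = env["SILK_PATH"]
        paths.append(f"{root}/share/silk/silk.conf")
        paths.append(f"{root}/share/silk.conf")
    paths.append("/usr/local/share/silk/silk.conf")
    paths.append("/usr/local/share/silk.conf")
    return paths

def find_site_config(env=None):
    """Return the first candidate that exists as a file, else None."""
    # Assumption: candidates are checked in listed order.
    for path in candidate_site_config_paths(env):
        if os.path.isfile(path):
            return path
    return None

print(candidate_site_config_paths({"SILK_PATH": "/opt/silk"}))
```

Passing an explicit dictionary instead of the process environment makes it easy to see how each variable changes the candidate list.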
rwaggbag(1), rwaggbagcat(1),
rwaggbagtool(1), rwfileinfo(1),
rwset(1), rwsetbuild(1),
rwsetcat(1), rwsettool(1),
ccfilter(3), silk.conf(5),
silk(7), zlib(3)
rwaggbagbuild and the other Aggregate Bag tools were introduced in SiLK
3.15.0.