rwfileinfo(1)    SiLK Tool Suite    rwfileinfo(1)
NAME
rwfileinfo - Print information about a SiLK file
SYNOPSIS
rwfileinfo [--fields=FIELDS] [--summary] [--no-titles]
      [--site-config-file=FILENAME]
      {--xargs | --xargs=FILENAME | FILE [FILE...]}
rwfileinfo --help
rwfileinfo --help-fields
rwfileinfo --version
DESCRIPTION
rwfileinfo prints information about a binary SiLK file that can be
determined by reading the file's header and by moving quickly over the data
blocks in the file.
rwfileinfo requires one or more filename arguments to be
given on the command line or the use of the --xargs switch. When the
--xargs switch is provided, rwfileinfo reads the names of the
files to process from the named text file or from the standard input if no
file name argument is provided to the switch. The input to --xargs
must contain one file name per line. rwfileinfo does not read a SiLK
file's content from the standard input by default, but it does when either
"-" or
"stdin" is given as a filename
argument.
When the --summary switch is given, rwfileinfo first
prints the information for each individual file and then prints the number
of files processed, the sum of the individual file sizes, and the sum of the
individual record counts.
Field Descriptions
By default, rwfileinfo prints the following information for each file
argument. Use the --fields switch to modify which pieces of information
are printed.
(rwfileinfo prints each field in the order in which support
for that field was added to SiLK. The field descriptions are presented here
in a more logical order.)
- file-size
- The size of the file on disk as reported by the operating system.
rwfileinfo prints 0 for the file-size when reading from the
standard input.
- version
- Every binary file written by SiLK has a version number field. Since SiLK
1.0.0, the version number field has been used to indicate the general
structure (or layout) of the file. The file structure adopted in SiLK
1.0.0 uses a version number of 16 and has a header section and a
data section. The header section begins with 16 bytes that specify
well-defined values, and those bytes are followed by one or more
variably-sized header entries. The specifics of the data section
depend on the content of the file.
- header-length
- The header-length field shows the number of octets required by the header
(i.e., the initial 16 bytes and the header entries). Since everything
after the header is data, the header-length is the starting offset of the
data section. The smallest header length is 24 bytes, but typically the
header is padded to be an integer multiple of the record-length. The
header-length that rwfileinfo prints for a file is determined
dynamically by reading the file's header.
- silk-version
- When a SiLK tool creates a binary file, the tool writes the current SiLK
release number (such as 3.9.0) into the file's header as a way to help
diagnose issues should a bug with a particular release of SiLK be
discovered in the future.
- byte-order
- Every SiLK file has a byte-order or endian field. SiLK uses the
machine's native representation of integers when writing data, and this
field shows what representation the file contains.
"BigEndian" is network byte order and
"littleEndian" is used by Intel chips.
The rwswapbytes(1) tool changes a file's integer
representation, and some tools have a --byte-order switch that
allows the user to specify the integer representation of output files. The
header-section of a file is always written in network byte order.
- compression
- SiLK tools may use the zlib library (<http://zlib.net/>), the LZO
library (<http://www.oberhumer.com/opensource/lzo/>), or the snappy
library (<http://google.github.io/snappy/>) to compress the data
section of a file. The compression field specifies which library (if any)
was used to compress the data section. If a file is compressed with a
library that was not included in an installation of SiLK, SiLK is unable
to read the data section of the file. Many SiLK tools accept the
--compression-method switch to choose a particular compression
method. (The compression field does not indicate whether the entire file
has been compressed with an external compression utility such as
gzip(1).)
- format
- Every binary file written by SiLK has two fields in the header that
specify exactly what the file contains: the format and the record-version.
In general, the format indicates the content type of the file and
the record-version indicates the evolution of that content.
The content of a file whose format is
"FT_IPSET",
"FT_RWBAG", or
"FT_PREFIXMAP" is self-evident (an
IPset, a Bag, or a prefix map).
There are many different file formats for writing SiLK Flow
records, but the SiLK analysis tools largely use a single Flow file
format. That format is
"FT_RWIPV6ROUTING" if SiLK has been
compiled with IPv6 support, or
"FT_RWGENERIC" otherwise. A file that
uses the "FT_RWGENERIC" format is only
capable of holding IPv4 addresses.
The other SiLK Flow file formats are created by
rwflowpack(8) as it writes flow records to the
repository. These formats often omit fields and use reduced bit-sizes
for fields to reduce the space required for an individual flow
record.
The record-version field indicates changes within the general
type specified by the format field. For example, SiLK incremented the
record-version of the formats that hold flow records when the resolution
of record timestamps was changed from seconds to milliseconds.
- record-version
- Together with the format field, specifies the contents of the file.
See the discussion of format for details.
- record-length
- Files created by SiLK 1.0.0 and later have a record length field. This
field contains the length of an individual record, and this value is
dependent on the format and record-version fields described above. Some
files (such as those containing IPsets or prefix maps) do not write
individual records to the output, and the record length is 1 for these
files.
- count-records
- The count-records field is generated dynamically by determining the length
the data section would require if it were completely uncompressed and
dividing it by the record-length. When the record-length is 1 (such as for
IPset files), the count-records field does not provide much information
beyond the length of the uncompressed data. For an uncompressed file,
adding header-length to the product of count-records and record-length is
equal to the file-size.
The fields given above are either present in the well-defined
header or are computed by reading the file.
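The relation between file-size, header-length, count-records, and record-length for an uncompressed file can be checked with simple arithmetic. The following is a minimal Python sketch, not part of SiLK; the function name expected_file_size is hypothetical, and the numbers are taken from the tcp-data.rw example shown later on this page (where the compression is none).

```python
def expected_file_size(header_length: int, count_records: int, record_length: int) -> int:
    """Return the on-disk size, in octets, of an uncompressed SiLK file.

    Follows the relation described above: everything after the header is
    data, and the data section holds count_records fixed-size records.
    """
    return header_length + count_records * record_length


# Values from the tcp-data.rw example: header-length 208, count-records 7,
# record-length 52, and a reported file-size of 572.
size = expected_file_size(header_length=208, count_records=7, record_length=52)
print(size)  # 572
```

For a compressed file the relation does not hold, since count-records is computed from the length the data section would have if it were uncompressed.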
The following fields are generated by reading the header entries
and determining if one or more header entries of the specified type are
present. The field is not printed in the output when the header entry is not
present in the file.
- command-lines
- Many of the SiLK tools write a header entry to the output file that
contains the command line invocation used to create that file, and some of
the SiLK tools also copy the command line history from their input files
to the output file. (The --invocation-strip switch on the tools can
be used to prevent copying and recording of the invocation.) The command
lines are stored in individual header entries and this field displays
those entries with the most recent invocation at the end of the list.
The command line history has a couple of issues:
- When multiple input files are used to create a single output, the entries
are stored as a list, and this makes it difficult to know which set of
command line entries is associated with which input file.
- When a SiLK tool creates multiple output files (e.g., when using both
--pass and --fail to rwfilter(1)), the
tool writes the same command line entry to each output file. Some context
in addition to the command line history may be needed to know which branch
of that tool a particular file represents.
- annotations
- Most of the SiLK tools that create binary output files provide the
--note-add and --note-file-add switches which allow an
arbitrary annotation to be added to the header of a file. Some tools also
copy the annotations from the source files to the destination files. The
annotations are stored in individual header entries and this field
displays those entries.
- ipset
- The IPset writing tools (rwset(1),
rwsetbuild(1), rwsettool(1),
rwaggbagtool(1), and
rwbagtool(1)) support the following output formats
for IPset data structures:
- 2
- May hold only IPv4 addresses and does not have an ipset header entry.
- 3
- May hold IPv4 or IPv6 addresses and is readable by SiLK 3.0 and
later. It contains a header entry that describes the IPset data structure,
and the entry specifies the number of nodes, the number of branches from
each node, the number of leaves, the size of the nodes and leaves, and
which node is the root of the tree.
- 4
- May hold IPv4 or IPv6 addresses and is readable by SiLK 3.7 and
later. The file's header entry specifies whether the file contains IPv4
addresses or IPv6 addresses.
- 5
- May hold only IPv6 addresses and is readable by SiLK 3.14 and
later. The header entry specifies that the file contains IPv6 data.
- bag
- Since SiLK 3.0.0, the tools that write binary Bag files
(rwbag(1), rwbagbuild(1), and
rwbagtool(1)) have written a header entry that
specifies the type and size of the key and of the counter in the
file.
- aggregate-bag
- The tools rwaggbag(1),
rwaggbagbuild(1), and
rwaggbagtool(1) write a header entry that contains
the field types that comprise the key and the counter.
- prefix-map
- When using rwpmapbuild(1) to create a prefix map
file, a string that specifies a mapname may be provided.
rwpmapbuild writes the mapname to a header entry in the prefix map
file. The mapname is used to generate command line switches or field names
when the --pmap-file switch is specified to several of the SiLK
tools (see pmapfilter(3) for details). When
displaying the mapname, rwfileinfo prefixes it with the string
"v1:" which denotes a version number for
the prefix-map header entry. (The version number is printed for
completeness.)
- packed-file-info
- When rwflowpack(8) creates a SiLK Flow file for the
repository, all the records in the file have the same starting hour, the
same sensor, and the same flowtype (class/type pair). rwflowpack
writes a header entry to the file that contains these values, and this
field displays those values. (To print the names for the sensor and
flowtype, the silk.conf(5) file must be
accessible.)
- probe-name
- When flowcap(8) creates a SiLK flow file, it adds a
header entry specifying the name of the probe from which the data was
collected.
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact
match for an option. A parameter to an option may be specified as
--arg=param or --arg param, though the first form
is required for options that take optional parameters.
- --fields=FIELDS
- Specify what information to print for each file argument on the command
line. FIELDS is a comma separated list of field-names,
field-integers, and ranges of field-integers; a range is specified by
separating the start and end of the range with a hyphen (-).
Field-names are case-insensitive and may be shortened to a unique prefix.
When the --fields option is not given, all fields are printed if
the file contains the necessary information. The fields are always printed
in the order they appear here regardless of the order they are specified
in FIELDS.
The possible field values are given next with a brief
description of each. For a full description of each field, see
"Field Descriptions" above.
- format,1
- The contents of the file as a name and the corresponding hexadecimal
ID.
- version,2
- An integer describing the layout or structure of the file.
- byte-order,3
- Either "BigEndian" or
"littleEndian" to indicate the
representation used to store integers in the file (network or non-network
byte order).
- compression,4
- The compression library (if any) used to compress the data-section of the
file, specified as a name and its decimal ID.
- header-length,5
- The octet length of the file's header; alternatively the offset where data
begins.
- record-length,6
- The octet length of a single record or the value 1 if the file's content
is not record-based.
- count-records,7
- The number of records in the file, computed by dividing the uncompressed
data length by the record-length.
- file-size,8
- The size of the file on disk as reported by the operating system.
- command-lines,9
- The command line invocation used to generate this file.
- record-version,10
- The version of the records contained in the file.
- silk-version,11
- The release of SiLK that wrote this file.
- packed-file-info,12
- For a repository Flow file generated by
rwflowpack(8), this prints the timestamp of the
starting hour, the flowtype, and the sensor of each flow record in the
file.
- probe,13
- For a Flow file generated by flowcap(8), the name of
the probe where the flow records were initially collected.
- annotations,14
- The notes (annotations) that users have added to the file's header.
- prefix-map,15
- For a prefix map file, the "mapname"
that was set when the file was created by
rwpmapbuild(1).
- ipset,16
- For an IPset file whose record-version is 3, a description of the tree
data structure. For an IPset file whose record-version is 4, the type of
IP addresses (IPv4 or IPv6).
- bag,17
- For a bag file, the type and size of the key and of the counter.
- aggregate-bag,18
- For an aggregate bag file, the field types that comprise the key and the
counter.
- --summary
- After the data for each individual file is printed, print a summary that
shows the number of files processed, the sum of the individual file sizes,
and the total number of records contained in those files.
- --no-titles
- Suppress printing of the file name and field names. The output contains
only the values, where each value is printed left-justified on a single
line.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When
this switch is not provided, rwfileinfo searches for the site
configuration file in the locations specified in the "FILES"
section.
- --xargs
- --xargs=FILENAME
- Read the names of the input files from FILENAME or from the
standard input if FILENAME is not provided. The input is expected
to have one filename per line. rwfileinfo opens each named file in
turn and prints its information as if the filenames had been listed on the
command line. Since SiLK 3.15.0.
- --help
- Print the available options and exit.
- --help-fields
- Print a description of each field and its alias, then exit.
- --version
- Print the version number and information about how SiLK was configured,
then exit the application.
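The FIELDS syntax accepted by --fields (names, unique prefixes, integers, and hyphenated integer ranges) can be illustrated with a small Python sketch. This is not rwfileinfo's actual parser; the helper names resolve_name and expand_fields are hypothetical, the field table is copied from the --fields description above, and the result is sorted by field number on the assumption that this matches the listing order used for printing.

```python
import re

# Field numbers and names as listed under --fields on this page.
FIELD_NAMES = {
    1: "format", 2: "version", 3: "byte-order", 4: "compression",
    5: "header-length", 6: "record-length", 7: "count-records",
    8: "file-size", 9: "command-lines", 10: "record-version",
    11: "silk-version", 12: "packed-file-info", 13: "probe",
    14: "annotations", 15: "prefix-map", 16: "ipset", 17: "bag",
    18: "aggregate-bag",
}


def resolve_name(token: str) -> int:
    """Map a case-insensitive field name or unique prefix to its number."""
    token = token.lower()
    exact = [i for i, name in FIELD_NAMES.items() if name == token]
    if exact:
        return exact[0]
    matches = [i for i, name in FIELD_NAMES.items() if name.startswith(token)]
    if len(matches) != 1:
        raise ValueError(f"unknown or ambiguous field name: {token!r}")
    return matches[0]


def expand_fields(spec: str) -> list:
    """Expand a FIELDS string into field names, ordered by field number."""
    selected = set()
    for token in spec.split(","):
        token = token.strip()
        rng = re.fullmatch(r"(\d+)-(\d+)", token)
        if rng:
            start, end = int(rng.group(1)), int(rng.group(2))
            if start > end:
                raise ValueError(f"invalid range: {token!r}")
            ids = range(start, end + 1)
        elif token.isdigit():
            ids = [int(token)]
        else:
            ids = [resolve_name(token)]
        for field_id in ids:
            if field_id not in FIELD_NAMES:
                raise ValueError(f"field number out of range: {field_id}")
            selected.add(field_id)
    return [FIELD_NAMES[i] for i in sorted(selected)]


print(expand_fields("count-records,1-3,FILE"))
# ['format', 'version', 'byte-order', 'count-records', 'file-size']
```

In this sketch a prefix such as "co" is rejected because it matches compression, count-records, and command-lines; how rwfileinfo itself reports such errors is not covered here.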
EXAMPLES
In the following examples, the dollar sign
("$") represents the shell prompt. The text
after the dollar sign represents the command line.
Get information about the file tcp-data.rw:
$ rwfileinfo tcp-data.rw
tcp-data.rw:
  format(id)          FT_RWGENERIC(0x16)
  version             16
  byte-order          littleEndian
  compression(id)     none(0)
  header-length       208
  record-length       52
  record-version      5
  silk-version        1.0.1
  count-records       7
  file-size           572
  command-lines
                   1  rwfilter --proto=6 --pass=tcp-data.rw ...
  annotations
                   1  This is some interesting TCP data
Return a single value which is the number of records in the file
tcp-data.rw:
$ rwfileinfo --no-titles --field=count-records tcp-data.rw
7
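When several values are needed in a script, the titled output can be collected into a dictionary instead of running rwfileinfo once per field. The following Python sketch is an illustration only: the function parse_rwfileinfo is hypothetical, and it assumes the output layout shown in the first example above (a file name followed by a colon, one "name value" pair per line, and numbered entries under command-lines and annotations). It also assumes file names contain no whitespace.

```python
SAMPLE = """\
tcp-data.rw:
  format(id)          FT_RWGENERIC(0x16)
  version             16
  byte-order          littleEndian
  compression(id)     none(0)
  header-length       208
  record-length       52
  record-version      5
  silk-version        1.0.1
  count-records       7
  file-size           572
  command-lines
                   1  rwfilter --proto=6 --pass=tcp-data.rw ...
  annotations
                   1  This is some interesting TCP data
"""


def parse_rwfileinfo(text: str) -> dict:
    """Parse titled rwfileinfo output into {filename: {field: value}}.

    Fields that carry numbered entries (command-lines, annotations)
    are stored as lists of strings; all other values stay strings.
    """
    files = {}
    current = None   # dictionary for the file being parsed
    list_key = None  # name of the multi-entry field being filled, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(None, 1)
        if len(parts) == 1 and line.endswith(":"):
            # A lone token ending in a colon starts a new file.
            current = files.setdefault(line[:-1], {})
            list_key = None
        elif current is None:
            continue  # ignore anything before the first file name
        elif len(parts) == 1:
            # A field name with no value starts a list of numbered entries.
            list_key = parts[0]
            current[list_key] = []
        elif parts[0].isdigit() and list_key is not None:
            current[list_key].append(parts[1])
        else:
            current[parts[0]] = parts[1]
            list_key = None
    return files


info = parse_rwfileinfo(SAMPLE)["tcp-data.rw"]
print(info["count-records"])  # 7
print(info["command-lines"])  # ['rwfilter --proto=6 --pass=tcp-data.rw ...']
```

Splitting on the first run of whitespace keeps the sketch independent of the exact column widths, which may differ between SiLK releases.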
ENVIRONMENT
- SILK_CONFIG_FILE
- This environment variable is used as the value for the
--site-config-file switch when that switch is not provided.
- SILK_DATA_ROOTDIR
- This environment variable specifies the root directory of the data repository.
As described in the "FILES" section, rwfileinfo may use
this environment variable when searching for the SiLK site configuration
file.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files, rwfileinfo may use this
environment variable. See the "FILES" section for details.
FILES
- ${SILK_CONFIG_FILE}
- ${SILK_DATA_ROOTDIR}/silk.conf
- /data/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file which are checked
when the --site-config-file switch is not provided.
SEE ALSO
rwfilter(1), rwaggbag(1),
rwaggbagbuild(1), rwaggbagtool(1),
rwbag(1), rwbagbuild(1),
rwbagtool(1), rwpmapbuild(1),
rwset(1), rwsetbuild(1),
rwsettool(1), rwswapbytes(1),
silk.conf(5), pmapfilter(3),
flowcap(8), rwflowpack(8),
silk(7), gzip(1)