rwrecgenerator(1) SiLK Tool Suite rwrecgenerator(1)
NAME
rwrecgenerator - Generate random SiLK Flow records
SYNOPSIS
rwrecgenerator { --silk-output-path=PATH | --text-output-path=PATH
| { --output-directory=DIR_PATH
--processing-directory=DIR_PATH }}
--log-destination=DESTINATION [--log-level=LEVEL]
[--log-sysfacility=NUMBER] [--seed=SEED]
[--start-time=START_DATETIME --end-time=END_DATETIME]
[--time-step=MILLISECONDS] [--events-per-step=COUNT]
[--num-subprocesses=COUNT] [--flush-timeout=MILLISEC]
[--file-cache-size=SIZE] [--compression-method=COMP_METHOD]
[--timestamp-format=FORMAT] [--epoch-time]
[--ip-format=FORMAT] [--integer-ips] [--zero-pad-ips]
[--integer-sensors] [--integer-tcp-flags] [--no-titles]
[--no-columns] [--column-separator=CHAR]
[--no-final-delimiter] [--delimited[=CHAR]]
[--site-config-file=FILENAME] [--sensor-prefix-map=FILE]
[--flowtype-in=CLASS/TYPE] [--flowtype-inweb=CLASS/TYPE]
[--flowtype-out=CLASS/TYPE] [--flowtype-outweb=CLASS/TYPE]
rwrecgenerator --help
rwrecgenerator --version
DESCRIPTION
rwrecgenerator uses pseudo-random numbers to generate events,
where each consists of one or more SiLK Flow records. These flow records can
be written as a single binary file, as text (in either a columnar or a
comma-separated value format) similar to the output from
rwcut(1), or as a directory of small binary files to
mimic the incremental files produced by
rwflowpack(8). The type of output to produce must be
specified using the appropriate switches. Currently only one type of output
may be produced in a single invocation.
rwrecgenerator works through a time window, where the
starting and ending times for the window may be specified on the command
line. When not specified, the window defaults to the previous hour. By
default, rwrecgenerator will generate one event at the start time and
one event at the end time. To modify the size of the steps
rwrecgenerator takes across the window, specify the
--time-step switch. The number of events to create at each step may
be specified with the --events-per-step switch.
The time window specifies when the events begin. Since most events
create multiple flow records with small time offsets between them (and some
events may create flow records across multiple hours), some flow records
will begin after the end of the time window.
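The stepping rules above can be illustrated with a short Python sketch. It is not part of rwrecgenerator. The function name event_times is hypothetical, and including the end time in the result is an assumption based on the default behavior described above.

```python
from datetime import datetime, timedelta, timezone


def event_times(start, end, step_ms=None):
    """Return the times at which events begin inside the window."""
    if end < start:
        raise ValueError("end time must not be less than start time")
    if step_ms is not None and step_ms < 0:
        raise ValueError("time step must not be negative")
    # Default step: the whole window, so events begin at start and at end.
    step = (end - start) if step_ms is None else timedelta(milliseconds=step_ms)
    if step == timedelta(0):
        # A step of 0 creates events at the start time only.
        return [start]
    times = []
    current = start
    while current <= end:  # assumption: the end time itself is included
        times.append(current)
        current += step
    return times


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
    end = start + timedelta(hours=1)
    # Step through a one-hour window in 15-minute (900,000 ms) steps.
    for t in event_times(start, end, step_ms=15 * 60 * 1000):
        print(t.isoformat())
```

Each returned time marks where events start. The flow records belonging to an event may begin later, as noted above.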
To generate a single SiLK flow file, specify its location with the
--silk-output-path switch. A value of
"-" will write the output to the standard
output unless the standard output is connected to a terminal.
To produce textual output, specify --text-output-path.
rwrecgenerator has numerous switches to control the appearance of the
text; however, currently rwrecgenerator produces a fixed set of
fields.
When creating incremental files, the --output-directory and
--processing-directory switches are required. rwrecgenerator
creates files in the processing directory, and moves the files to the output
directory when the flush timeout arrives. The default flush timeout is
30,000 milliseconds (30 seconds); the user may modify the value with the
--flush-timeout switch. Any files in the processing directory are
removed when rwrecgenerator starts.
The --num-subprocesses switch tells rwrecgenerator
to use multiple subprocesses when creating incremental files. When the
switch is specified, rwrecgenerator will split the time window into
multiple pieces and give each subprocess its own time window to create. The
initial rwrecgenerator process then waits for the subprocesses to
complete. When --num-subprocesses is specified, rwrecgenerator
will create subdirectories under the --processing-directory, where
each subprocess gets its own processing directory.
The --seed switch may be specified to provide a consistent
set of flow records across multiple invocations. (Note that the names of the
incremental files will differ across invocations since those names are
created with the mkstemp(3) function.)
Given the same seed for the pseudo-random number generator and
assuming --num-subprocesses is not specified, the output
from rwrecgenerator will contain the same data regardless of whether
the output is written to a single SiLK flow file, a text file, or a series
of incremental files.
When both --seed and --num-subprocesses are
specified, the incremental files will contain the same flow records across
invocations, but the flow records will not be consistent with those created
by --silk-output-path or --text-output-path.
rwrecgenerator must have access to a
silk.conf(5) site configuration file, either specified
by the --site-config-file switch on the command line or found in one
of the standard locations (see the "FILES" section).
The --flowtype-in, --flowtype-inweb,
--flowtype-out, and --flowtype-outweb switches may be used to
specify the flowtype (that is, the class/type pair)
that rwrecgenerator uses for its flow records. When these switches
are not specified, rwrecgenerator attempts to use the flowtypes
defined in the silk.conf file for the twoway site.
Specifically, it attempts to use "all/in", "all/inweb",
"all/out", and "all/outweb", respectively.
Use of the --sensor-prefix-map switch is recommended. The
argument should name a prefix map file that maps from an internal IP address
to a sensor number. If the switch is not provided, all flow records will use
the first sensor in the silk.conf file that is supported by the class
specified by the flowtypes. When using the --sensor-prefix-map switch, make
certain the sensors you choose are in the class specified in the
--flowtype-* switches.
When using the --sensor-prefix-map switch and creating
incremental files, it is recommended that you use the
--file-cache-size switch to increase the size of the stream cache to
be approximately 12 to 16 times the number of sensors. This will reduce the
amount of time spent closing and reopening the files.
The --log-destination switch is required. Specify
none to disable logging.
Currently, rwrecgenerator only supports generating IPv4
addresses. Addresses in 0.0.0.0/1 are considered internal, and addresses in
128.0.0.0/1 are considered external. All flow records are between an
internal and an external address. Whether the internal address is the
source or destination of the unidirectional flow record is determined
randomly.
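The address split described above can be sketched in Python with the standard ipaddress module. This is not how rwrecgenerator is implemented. The names side and random_endpoints are hypothetical, and the even chance for each direction is an assumption, since the text only says the direction is random.

```python
import ipaddress
import random

INTERNAL = ipaddress.ip_network("0.0.0.0/1")    # first half of IPv4 space
EXTERNAL = ipaddress.ip_network("128.0.0.0/1")  # second half of IPv4 space


def side(addr):
    """Classify an IPv4 address as 'internal' or 'external'."""
    ip = ipaddress.ip_address(addr)
    if ip.version != 4:
        raise ValueError("only IPv4 addresses are generated")
    return "internal" if ip in INTERNAL else "external"


def random_endpoints(rng):
    """Return (source, destination): one internal and one external address."""
    internal = ipaddress.IPv4Address(rng.getrandbits(31))
    external = ipaddress.IPv4Address((1 << 31) | rng.getrandbits(31))
    # Assumption: each direction is equally likely.
    if rng.random() < 0.5:
        return internal, external
    return external, internal


if __name__ == "__main__":
    src, dst = random_endpoints(random.Random(42))
    print(src, side(str(src)), "->", dst, side(str(dst)))
```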
The types of flow records that rwrecgenerator creates
are:
- HTTP traffic on port 80/tcp that consists of a query and a response. This
traffic will be about 30% of the total by flow count.
- HTTPS traffic on port 443/tcp that consists of a query and a response.
This traffic will be about 30% of the total by flow count.
- DNS traffic on port 53/udp that consists of a query and a response. This
traffic will be about 10% of the total by flow count.
- FTP traffic on port 21/tcp that consists of a query and a response. This
traffic will be about 4% of the total by flow count.
- ICMP traffic that consists of a single message. This traffic will be
about 4% of the total by flow count.
- IMAP traffic on port 143/tcp that consists of a query and a response. This
traffic will be about 4% of the total by flow count.
- POP3 traffic on port 110/tcp that consists of a query and a response. This
traffic will be about 4% of the total by flow count.
- SMTP traffic on port 25/tcp that consists of a query and a response. This
traffic will be about 4% of the total by flow count.
- TELNET traffic on port 23/tcp between two machines. This traffic may
involve multiple flow records that reach the active timeout of 1800
seconds. This traffic will be about 4% of the total by flow count.
- Traffic on IP Protocols 47, 50, or 58 that consists of a single record.
This traffic will be about 4% of the total by flow count.
- Scans of every port on one IP address. This traffic will be about 1% of
the total by flow count.
- Scans of a single port across a range of IP addresses. This traffic will
be about 1% of the total by flow count.
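The proportions listed above sum to 100%. The Python sketch below, which is not rwrecgenerator's implementation, uses them as weights for a seeded random choice to show how a fixed seed yields a repeatable sequence. The names TRAFFIC_MIX and pick_event_kinds are hypothetical. Treating the percentages as per-event weights is a simplification, because the percentages above are by flow count and events produce differing numbers of flow records.

```python
import random

# Approximate share of each traffic type, in percent, as listed above.
TRAFFIC_MIX = [
    ("HTTP", 30),
    ("HTTPS", 30),
    ("DNS", 10),
    ("FTP", 4),
    ("ICMP", 4),
    ("IMAP", 4),
    ("POP3", 4),
    ("SMTP", 4),
    ("TELNET", 4),
    ("IP protocol 47, 50, or 58", 4),
    ("Scan of every port on one address", 1),
    ("Scan of one port across addresses", 1),
]


def pick_event_kinds(count, seed):
    """Pick `count` traffic types using the listed shares as weights."""
    rng = random.Random(seed)  # a fixed seed gives a repeatable sequence
    names = [name for name, _ in TRAFFIC_MIX]
    weights = [weight for _, weight in TRAFFIC_MIX]
    return rng.choices(names, weights=weights, k=count)


if __name__ == "__main__":
    print(pick_event_kinds(5, seed=1))
```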
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact
match for an option. A parameter to an option may be specified as
--arg=param or --arg param, though the first form
is required for options that take optional parameters.
Exactly one of the following switches is required.
- --silk-output-path=PATH
- Tell rwrecgenerator to create a single binary file of SiLK flow
records at the specified location. If PATH is
"-", the records are written to the
standard output. rwrecgenerator does not support writing binary
data to a terminal.
- --output-directory=DIR_PATH
- Name the directory into which the incremental files are written once the
flush timeout is reached.
- --text-output-path=PATH
- Tell rwrecgenerator to convert the flow records it creates to text
and to print the result in a format similar to that created by
rwcut(1). The output is written to the specified
location. If PATH is "-", the
records are written to the standard output.
The --log-destination switch is required. Use a value of none to
disable logging.
- --log-destination=DESTINATION
- Specify the destination where logging messages are written. When
DESTINATION begins with a slash
"/", it is treated as a file system path
and all log messages are written to that file; there is no log rotation.
When DESTINATION does not begin with
"/", it must be one of the following
strings:
- "none"
- Messages are not written anywhere.
- "stdout"
- Messages are written to the standard output.
- "stderr"
- Messages are written to the standard error.
- "syslog"
- Messages are written using the syslog(3)
facility.
- "both"
- Messages are written to the syslog facility and to the standard error
(this option is not available on all platforms).
- --log-level=LEVEL
- Set the severity of messages that will be logged. The levels from most
severe to least are: "emerg", "alert", "crit", "err", "warning", "notice",
"info", "debug". The default is "info".
- --log-sysfacility=NUMBER
- Set the facility that syslog(3) uses for logging
messages. This switch takes a number as an argument. The default is a
value that corresponds to "LOG_USER" on
the system where rwrecgenerator is running. This switch produces an
error unless --log-destination=syslog is specified.
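The rule for interpreting DESTINATION can be summarized with a small Python sketch. It only restates the description above and is not taken from the SiLK sources. The name classify_log_destination is hypothetical.

```python
# Keywords accepted when DESTINATION does not begin with a slash.
LOG_KEYWORDS = {"none", "stdout", "stderr", "syslog", "both"}


def classify_log_destination(destination):
    """Return 'file' for a path, or the keyword itself when it is valid."""
    if destination.startswith("/"):
        # Treated as a file system path; there is no log rotation.
        return "file"
    if destination in LOG_KEYWORDS:
        return destination
    raise ValueError(f"invalid log destination: {destination!r}")


if __name__ == "__main__":
    for value in ("/var/log/rwrecgenerator.log", "none", "syslog"):
        print(value, "->", classify_log_destination(value))
```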
The following are general purpose switches. None are required.
- --seed=SEED
- Seed the pseudo-random number generator with the value SEED. When
not specified, rwrecgenerator creates its own seed. Specifying the
seed allows different invocations of rwrecgenerator to produce the
same output (assuming the same value is given for all switches and that
the time window is specified).
- --start-time=YYYY/MM/DD[:HH[:MM[:SS[.ssssss]]]]
- --start-time=EPOCH_SECONDS_PLUS_MILLISECONDS
- Specify the earliest date and time at which an event is started. The
specified time must be given to at least day precision. Any parts of the
date-time string that are not specified are set to 0. The switch also
accepts UNIX epoch seconds with optional fractional seconds. When not
specified, defaults to the beginning of the previous hour.
- --end-time=YYYY/MM/DD[:HH[:MM[:SS[.ssssss]]]]
- --end-time=EPOCH_SECONDS_PLUS_MILLISECONDS
- Specify the latest date and time at which an event is started. This time
does not specify the latest end-time for the flow records or even
the latest start-time, since many events simulate a query/response pair,
with the response following the query by a few milliseconds. The specified
time must be given to at least day precision, and it must not be less than
the start-time. Any parts of the date-time string that are not specified
are set to 0. The switch also accepts UNIX epoch seconds with optional
fractional seconds. When not specified, defaults to the end of the
previous hour.
- --time-step=MILLISECONDS
- Move forward MILLISECONDS milliseconds at each step as
rwrecgenerator moves through the time window. When not specified,
defaults to the difference between the start-time and end-time; that is,
rwrecgenerator will generate events at the start-time and then at
the end-time. A MILLISECONDS value of 0 indicates
rwrecgenerator should only create events at the start-time.
- --events-per-step=COUNT
- Create COUNT events at each time step. The default is 1.
- --help
- Print the available options and exit.
- --version
- Print the version number and information about how rwrecgenerator
was configured, then exit the application.
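The two argument forms accepted by --start-time and --end-time can be illustrated with the Python sketch below. It is not rwrecgenerator's parser. The name parse_time is hypothetical, and interpreting date-time strings as UTC is an assumption, since a SiLK build with local timezone support uses the TZ environment variable instead.

```python
import re
from datetime import datetime, timedelta, timezone

# YYYY/MM/DD[:HH[:MM[:SS[.ssssss]]]]
_DATE_RE = re.compile(
    r"^(\d{4})/(\d{1,2})/(\d{1,2})"
    r"(?::(\d{1,2})(?::(\d{1,2})(?::(\d{1,2})(?:\.(\d{1,6}))?)?)?)?$"
)
# UNIX epoch seconds with optional fractional seconds
_EPOCH_RE = re.compile(r"^(\d+)(?:\.(\d+))?$")


def _fraction_to_microseconds(digits):
    """Convert the digits after the decimal point to microseconds."""
    return int(digits.ljust(6, "0")[:6]) if digits else 0


def parse_time(text):
    """Parse a date-time string or epoch value into a UTC datetime."""
    match = _DATE_RE.match(text)
    if match:
        # Parts that are not specified are set to 0.
        year, month, day, hour, minute, second = (
            int(g) if g else 0 for g in match.groups()[:6]
        )
        micro = _fraction_to_microseconds(match.group(7))
        return datetime(year, month, day, hour, minute, second, micro,
                        tzinfo=timezone.utc)  # assumption: UTC
    match = _EPOCH_RE.match(text)
    if match:
        whole = datetime.fromtimestamp(int(match.group(1)), tz=timezone.utc)
        micro = _fraction_to_microseconds(match.group(2))
        return whole + timedelta(microseconds=micro)
    raise ValueError(f"unrecognized time: {text!r}")


if __name__ == "__main__":
    print(parse_time("2024/03/05:14").isoformat())
```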
The following switches are used when creating incremental files.
- --processing-directory=DIR_PATH
- Name the directory under which the incremental files are initially created. Any
files in this directory are removed when rwrecgenerator is started.
When the flush timeout is reached, the files are closed and moved from
this directory to the output-directory. If --num-subprocesses is
specified, subdirectories are created under DIR_PATH, and each
subprocess is given its own subdirectory.
- --num-subprocesses=COUNT
- Tell rwrecgenerator to create COUNT subprocesses to generate
incremental files. This switch is ignored when incremental files are not
being created. When this switch is specified, rwrecgenerator
creates subdirectories below the processing directory. The default value
for COUNT is 0.
- --flush-timeout=MILLISECONDS
- Set the timeout for flushing any in-memory records to disk to
MILLISECONDS milliseconds. At this time, the incremental files are
closed and the files are moved from the processing directory to the output
directory. The timeout uses the internal time as rwrecgenerator
moves through the time window. If not specified, the default is 30,000
milliseconds (30 seconds). This switch is ignored when incremental files
are not being created.
- --file-cache-size=SIZE
- Set the maximum number of data files to have open for writing at any one
time to SIZE. If not specified, the default is 32 files.
- --compression-method=COMP_METHOD
- Specify the compression library to use when writing binary output files.
If this switch is not given, the value in the SILK_COMPRESSION_METHOD
environment variable is used if the value names an available compression
method. When no compression method is specified, binary output is
compressed using the default chosen when SiLK was compiled. The valid
values for COMP_METHOD are determined by which external libraries
were found when SiLK was compiled. To see the available compression
methods and the default method, use the --help or --version
switch. SiLK can support the following COMP_METHOD values when the
required libraries are available.
- none
- Do not compress the SiLK Flow records using an external library.
- zlib
- Use the zlib(3) library for compressing the flow
records.
- lzo1x
- Use the lzo1x algorithm from the LZO real-time compression library
for compressing the flow records.
- snappy
- Use the snappy library for compressing the flow records.
Since SiLK 3.13.0.
- best
- Use lzo1x if available, otherwise use snappy if available, otherwise use
zlib if available.
The following switches can be used when creating textual output.
- --timestamp-format=FORMAT
- When producing textual output, specify the format, timezone, and/or
modifier to use when printing timestamps. When this switch is not
specified, the SILK_TIMESTAMP_FORMAT environment variable is checked for a
format, timezone, and modifier. If it is empty or contains invalid values,
timestamps are printed in the default format, and the timezone is UTC
unless SiLK was compiled with local timezone support. FORMAT is a
comma-separated list of a format, a timezone, and/or a modifier. The
format is one of:
- default
- Print the timestamps as
YYYY/MM/DDThh:mm:ss.sss.
- iso
- Print the timestamps as YYYY-MM-DD
hh:mm:ss.sss.
- m/d/y
- Print the timestamps as MM/DD/YYYY
hh:mm:ss.sss.
- epoch
- Print the timestamps as the number of seconds since 00:00:00 UTC on
1970-01-01.
When a timezone is specified, it is used regardless of the default
timezone support compiled into SiLK. The timezone is one of:
- utc
- Use Coordinated Universal Time to print timestamps.
- local
- Use the TZ environment variable or the local timezone.
One modifier is available:
- no-msec
- Truncate the milliseconds value on the timestamps and on the duration
field. When milliseconds are truncated, the sum of the printed start time
and duration may not equal the printed end time.
- --epoch-time
- When producing textual output, print timestamps as epoch time (number of
seconds since midnight GMT on 1970-01-01). This switch is equivalent to
--timestamp-format=epoch; it is deprecated as of SiLK 3.8.1 and will be
removed in the SiLK 4.0 release.
- --ip-format=FORMAT
- When producing textual output, specify how IP addresses are printed, where
FORMAT is a comma-separated list of the arguments described below.
When this switch is not specified, the SILK_IP_FORMAT environment variable
is checked for a value and that format is used if it is valid. The default
FORMAT is "canonical". Since
SiLK 3.8.1.
- canonical
- Print IP addresses in the canonical format. For an IPv4 record, use
dot-separated decimal (192.0.2.1). For an IPv6 record, use either
colon-separated hexadecimal ("2001:db8::1") or a mixed IPv4-IPv6
representation for IPv4-mapped IPv6 addresses (the ::ffff:0:0/96 netblock,
e.g., "::ffff:192.0.2.1") and
IPv4-compatible IPv6 addresses (the ::/96 netblock other than ::/127,
e.g., "::192.0.2.1").
- no-mixed
- Print IP addresses in the canonical format
(192.0.2.1 or
"2001:db8::1") but do not use the mixed
IPv4-IPv6 representations. For example, use
"::ffff:c000:201" instead of
"::ffff:192.0.2.1". Since SiLK
3.17.0.
- decimal
- Print IP addresses as integers in decimal format. For example, print
192.0.2.1 and
"2001:db8::1" as
3221225985 and
42540766411282592856903984951653826561,
respectively.
- hexadecimal
- Print IP addresses as integers in hexadecimal format. For example, print
192.0.2.1 and "2001:db8::1" as "c0000201" and
"20010db8000000000000000000000001", respectively.
- zero-padded
- Make all IP address strings contain the same number of characters by
padding numbers with leading zeros. For example, print
192.0.2.1 and
"2001:db8::1" as
192.000.002.001 and
"2001:0db8:0000:0000:0000:0000:0000:0001",
respectively. For IPv6 addresses, this setting implies
"no-mixed", so that
"::ffff:192.0.2.1" is printed as
"0000:0000:0000:0000:0000:ffff:c000:0201".
As of SiLK 3.17.0, may be combined with any of the above, including
"decimal" and
"hexadecimal".
The following arguments modify certain IP addresses prior to
printing. These arguments may be combined with the above formats.
- map-v4
- Change IPv4 addresses to IPv4-mapped IPv6 addresses (addresses in the
::ffff:0:0/96 netblock) prior to formatting. Since SiLK
3.17.0.
- unmap-v6
- Change any IPv4-mapped IPv6 addresses (addresses in the ::ffff:0:0/96
netblock) to IPv4 addresses prior to formatting. Since SiLK
3.17.0.
The following argument is also available:
- force-ipv6
- Set FORMAT to "map-v4,no-mixed".
- --integer-ips
- When producing textual output, print IP addresses as integers. This switch
is equivalent to --ip-format=decimal; it is deprecated as of SiLK
3.8.1 and will be removed in the SiLK 4.0 release.
- --zero-pad-ips
- When producing textual output, print IP addresses as fully-expanded,
zero-padded values in their canonical form. This switch is equivalent to
--ip-format=zero-padded; it is deprecated as of SiLK 3.8.1 and will be
removed in the SiLK 4.0 release.
- --integer-sensors
- When producing textual output, print the integer ID of the sensor rather
than its name.
- --integer-tcp-flags
- When producing textual output, print the TCP flag fields (flags,
initialFlags, sessionFlags) as an integer value. Typically, the characters
"F,S,R,P,A,U,E,C" are used to represent
the TCP flags.
- --no-titles
- When producing textual output, turn off column titles. By default, titles
are printed.
- --no-columns
- When producing textual output, disable fixed-width columnar output.
- --column-separator=C
- When producing textual output, use the specified character between columns and
after the final column. When this switch is not specified, the default of
'|' is used.
- --no-final-delimiter
- When producing textual output, do not print the column separator after the
final column. Normally a delimiter is printed.
- --delimited
- --delimited=C
- When producing textual output, run as if --no-columns
--no-final-delimiter --column-sep=C had been
specified. That is, disable fixed-width columnar output; if character
C is provided, it is used as the delimiter between columns instead
of the default '|'.
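The decimal, hexadecimal, and zero-padded arguments to --ip-format described above can be reproduced with Python's standard ipaddress module. The sketch below only illustrates the conversions and is not SiLK code. The name format_ip is hypothetical, and the canonical branch uses Python's own string form, which may differ from SiLK's for IPv4-mapped IPv6 addresses.

```python
import ipaddress


def format_ip(addr, style="canonical"):
    """Format an IP address in one of the styles described above."""
    ip = ipaddress.ip_address(addr)
    if style == "canonical":
        return str(ip)  # Python's default form; an approximation
    if style == "decimal":
        return str(int(ip))
    if style == "hexadecimal":
        return format(int(ip), "x")
    if style == "zero-padded":
        if ip.version == 4:
            # Pad each octet to three digits.
            return ".".join(f"{int(octet):03d}" for octet in str(ip).split("."))
        # Fully expanded IPv6 form with leading zeros in every group.
        return ip.exploded
    raise ValueError(f"unknown style: {style!r}")


if __name__ == "__main__":
    for style in ("canonical", "decimal", "hexadecimal", "zero-padded"):
        print(style, format_ip("192.0.2.1", style), format_ip("2001:db8::1", style))
```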
The following switches control the class/type and sensor that
rwrecgenerator assigns to every flow record.
- --sensor-prefix-map=FILE
- Load a prefix map from FILE and use it to map from the internal IP
addresses to sensor numbers. If the switch is not provided, all flow
records will use the first sensor in the silk.conf file that is
supported by the class named in the flowtype. The sensor IDs specified in
FILE should agree with the class specified in the
--flowtype-* switches.
- --flowtype-in=CLASS/TYPE
- Set the class/type pair for flow records where the source IP is external,
the destination IP is internal, and the flow record is not considered to
represent a web record to CLASS/TYPE. Web records are
those that appear on ports 80/tcp, 443/tcp, and 8080/tcp. When not
specified, rwrecgenerator attempts to find the flowtype
"all/in" in the silk.conf file.
- --flowtype-inweb=CLASS/TYPE
- Set the class/type pair for flow records representing web records where
the source IP is external and the destination IP is internal to
CLASS/TYPE. When not specified and the --flowtype-in
switch is given, that CLASS/TYPE pair will be used. When
neither this switch nor --flowtype-in is given,
rwrecgenerator attempts to find the flowtype "all/inweb"
in the silk.conf file.
- --flowtype-out=CLASS/TYPE
- Set the class/type pair for flow records where the source IP is internal,
the destination IP is external, and the flow record is not considered to
represent a web record to CLASS/TYPE. When not
specified, rwrecgenerator attempts to find the flowtype
"all/out" in the silk.conf file.
- --flowtype-outweb=CLASS/TYPE
- Set the class/type pair for flow records representing web records where
the source IP is internal and the destination IP is external to
CLASS/TYPE. When not specified and the --flowtype-out
switch is given, that CLASS/TYPE pair will be used. When
neither this switch nor --flowtype-out is given,
rwrecgenerator attempts to find the flowtype "all/outweb"
in the silk.conf file.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When
this switch is not provided, rwrecgenerator searches for the site
configuration file in the locations specified in the "FILES"
section.
ENVIRONMENT
- SILK_IP_FORMAT
- This environment variable is used as the value for --ip-format when
that switch is not provided. Since SiLK 3.11.0.
- SILK_TIMESTAMP_FORMAT
- This environment variable is used as the value for
--timestamp-format when that switch is not provided. Since
SiLK 3.11.0.
- SILK_COMPRESSION_METHOD
- This environment variable is used as the value for
--compression-method when that switch is not provided. Since
SiLK 3.13.0.
- SILK_CONFIG_FILE
- This environment variable is used as the value for the
--site-config-file switch when that switch is not provided.
- SILK_DATA_ROOTDIR
- This environment variable specifies the root directory of the data repository.
As described in the "FILES" section, rwrecgenerator may
use this environment variable when searching for the SiLK site
configuration file.
- SILK_CLOBBER
- The SiLK tools normally refuse to overwrite existing files. Setting
SILK_CLOBBER to a non-empty value removes this restriction.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files, rwrecgenerator may use this
environment variable. See the "FILES" section for details.
- TZ
- When the argument to the --timestamp-format switch includes
"local" or when a SiLK installation is
built to use the local timezone, the value of the TZ environment variable
determines the timezone in which rwrecgenerator displays
timestamps. (If both of those are false, the TZ environment variable is
ignored.) If the TZ environment variable is not set, the machine's default
timezone is used. Setting TZ to the empty string or 0 causes timestamps to
be displayed in UTC. For system information on the TZ variable, see
tzset(3) or environ(7). (To
determine if SiLK was built with support for the local timezone, check the
"Timezone support" value in the output
of rwrecgenerator --version.) The TZ environment variable is also
used when rwrecgenerator parses the timestamp specified in the
--start-time or --end-time switches if SiLK is built with
local timezone support.
FILES
- ${SILK_CONFIG_FILE}
- ${SILK_DATA_ROOTDIR}/silk.conf
- /data/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file which are checked
when the --site-config-file switch is not provided.
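The locations above can be assembled from the environment as in the following Python sketch. It is not SiLK code. The name silk_conf_candidates is hypothetical, and it assumes the locations are tried in the order listed and that entries depending on an unset environment variable are skipped.

```python
import os


def silk_conf_candidates(env=None):
    """Return the possible silk.conf locations in the order listed above."""
    env = os.environ if env is None else env
    candidates = []
    if env.get("SILK_CONFIG_FILE"):
        candidates.append(env["SILK_CONFIG_FILE"])
    if env.get("SILK_DATA_ROOTDIR"):
        candidates.append(f"{env['SILK_DATA_ROOTDIR']}/silk.conf")
    candidates.append("/data/silk.conf")
    if env.get("SILK_PATH"):
        candidates.append(f"{env['SILK_PATH']}/share/silk/silk.conf")
        candidates.append(f"{env['SILK_PATH']}/share/silk.conf")
    candidates.append("/usr/local/share/silk/silk.conf")
    candidates.append("/usr/local/share/silk.conf")
    return candidates


def find_silk_conf(env=None):
    """Return the first candidate that exists as a file, or None."""
    for path in silk_conf_candidates(env):
        if os.path.isfile(path):
            return path
    return None


if __name__ == "__main__":
    for path in silk_conf_candidates():
        print(path)
```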
SEE ALSO
silk(7), rwcut(1), rwflowpack(8), silk.conf(5), syslog(3), zlib(3),
tzset(3), environ(7)