rwtuc(1)    SiLK Tool Suite    rwtuc(1)
NAME
rwtuc - Text Utility Converter - rwcut output to SiLK flows
SYNOPSIS
rwtuc [--fields=FIELDS] [--column-separator=CHAR]
[--output-path=PATH] [--bad-input-lines=FILEPATH]
[--verbose] [--stop-on-error] [--no-titles] [--note-add=TEXT]
[--note-file-add=FILE] [--compression-method=COMP_METHOD]
[--site-config-file=FILENAME] [--saddress=IPADDR]
[--daddress=IPADDR] [--sport=NUM] [--dport=NUM]
[--protocol=NUM] [--packets=NUM] [--bytes=NUM]
[--flags-all=TCPFLAGS] [--stime=TIME] [--duration=NUM]
[--etime=TIME] [--sensor=SID] [--input-index=NUM]
[--output-index=NUM] [--next-hop-ip=IPADDR]
[--flags-initial=TCPFLAGS] [--flags-session=TCPFLAGS]
[--attributes=ATTR] [--application=NUM] [--class=NAME]
[--type=NAME] [--stime+msec=TIME] [--etime+msec=TIME]
[--duration+msec=NUM] [--icmp-type=NUM] [--icmp-code=NUM]
{[--xargs] | [--xargs=FILENAME] | [FILE [FILE...]]}
rwtuc --help
rwtuc --version
DESCRIPTION
rwtuc reads text files that have a format similar to that produced by
rwcut(1) and attempts to create a SiLK Flow record for
each line of input.
The fields which make up a single record should be separated by
the pipe character ('|'); use the --column-separator switch to
change this delimiter. Note that the space character does not work as a
delimiter since several fields (e.g., time, TCP-flags) may contain embedded
spaces.
The fields to be read from each line may be specified with the
--fields switch; if the switch is not provided, rwtuc treats
the first line as a title and attempts to determine the fields from the
title strings.
When --fields is specified, rwtuc still checks
whether the first line contains title strings, and rwtuc skips the
line if it determines it does. Specify the --no-titles switch to
force rwtuc to treat the first line as field values to be parsed.
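The title handling described above can be tried with a small hand-written input file. The following is a minimal sketch, not an authoritative recipe: the addresses, ports, counts, and file names are invented for illustration, and the rwtuc invocations run only when the SiLK tools are installed.

```shell
# Hypothetical sample data: addresses, ports, and counts are invented.
workdir=$(mktemp -d)
input="$workdir/flows.txt"

# The first line carries titles, so rwtuc can infer the fields from it.
cat > "$input" <<'EOF'
sIP|dIP|sPort|dPort|protocol|packets|bytes
192.0.2.1|198.51.100.7|49152|443|6|10|5120
192.0.2.2|198.51.100.7|49153|53|17|1|76
EOF

# Count the data lines, i.e. everything after the title line.
records=$(tail -n +2 "$input" | wc -l | tr -d ' ')
echo "data lines: $records"

# Run the conversion only when the SiLK tools are installed.
if command -v rwtuc >/dev/null 2>&1; then
    # No --fields: the title line determines the columns.
    rwtuc --output-path="$workdir/titled.rw" "$input"

    # Title line stripped: name the fields and turn off title detection.
    tail -n +2 "$input" \
        | rwtuc --fields=sip,dip,sport,dport,protocol,packets,bytes \
                --no-titles --output-path="$workdir/untitled.rw"
fi
```

Both invocations are expected to yield the same two records; the second form is useful when the text comes from a program that does not print a title line.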
Command line switches exist which force a field to have a fixed
value. These switches cause rwtuc to override the value read from the
input file (if any) for those fields. See the "Fixed Values"
section below for details.
rwtuc reads the textual input from the files named on the
command line or from the standard input when no file names are specified,
when --xargs is not present, and when the standard input is not a
terminal. To read the standard input in addition to the named files or to
force rwtuc to read input from a terminal, use "-" or "stdin" as a file
name. When the
--xargs switch is provided, rwtuc reads the names of the files
to process from the named text file or from the standard input if no file
name argument is provided to the switch. The input to --xargs must
contain one file name per line.
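The input sources described above can be sketched as follows. The file names and addresses are hypothetical, the input files carry no title line (hence --no-titles), and the rwtuc commands run only when SiLK is installed.

```shell
# Hypothetical sample data in two title-less text files.
workdir=$(mktemp -d)
printf '192.0.2.1|198.51.100.7|6\n'  > "$workdir/a.txt"
printf '192.0.2.2|198.51.100.8|17\n' > "$workdir/b.txt"

# --xargs expects one file name per line.
printf '%s\n' "$workdir/a.txt" "$workdir/b.txt" > "$workdir/list.txt"
listed=$(wc -l < "$workdir/list.txt" | tr -d ' ')
echo "files listed: $listed"

if command -v rwtuc >/dev/null 2>&1; then
    # File names read from a named text file.
    rwtuc --fields=sip,dip,protocol --no-titles \
          --xargs="$workdir/list.txt" \
          --output-path="$workdir/from-file.rw"

    # File names read from the standard input.
    rwtuc --fields=sip,dip,protocol --no-titles \
          --xargs --output-path="$workdir/from-stdin.rw" \
          < "$workdir/list.txt"

    # A named file plus the standard input, requested with "-".
    rwtuc --fields=sip,dip,protocol --no-titles \
          --output-path="$workdir/mixed.rw" \
          "$workdir/a.txt" - < "$workdir/b.txt"
fi
```

Each invocation writes to a different output file because rwtuc refuses to overwrite an existing file unless SILK_CLOBBER is set.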
When the --output-path switch is not provided, output is
sent to the standard output when it is not connected to a terminal.
By default, lines that cannot be parsed are silently ignored
(unless rwtuc is attempting to determine the fields from the title
line). When the --verbose switch is specified, problems parsing an
input line are reported to the standard error, and rwtuc continues to
process the input. The --stop-on-error switch is similar to the
--verbose switch, except processing stops after the first error.
Input lines that cause parse errors may be copied to another output stream
with the --bad-input-lines switch. Each bad line has the source file
name and line number prepended to it, separated from each other and the
source line by colons (':').
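Because each rejected line is written as NAME:LINENO:TEXT, the bad-lines file is easy to post-process. The sketch below uses a hypothetical helper, split_bad_line, and invented sample data; it assumes the source file name itself contains no colon.

```shell
# Split one record of the form NAME:LINENO:TEXT into three variables.
# Assumption: NAME contains no colon. TEXT may contain colons (e.g. times).
split_bad_line() {
    bad_file=${1%%:*}        # everything before the first colon
    rest=${1#*:}             # drop NAME and the first colon
    bad_lineno=${rest%%:*}   # everything before the next colon
    bad_text=${rest#*:}      # the original input line
}

# A hand-written record in the documented format.
split_bad_line 'flows.txt:3:192.0.2.9|not-a-port|6'
echo "$bad_file line $bad_lineno: $bad_text"

if command -v rwtuc >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    printf '192.0.2.1|80|6\n192.0.2.9|not-a-port|6\n' > "$workdir/flows.txt"
    rwtuc --fields=sip,dport,protocol --no-titles --verbose \
          --bad-input-lines="$workdir/bad.txt" \
          --output-path="$workdir/flows.rw" "$workdir/flows.txt"
    # rwtuc removes the file when every line parses, so test for it first.
    if [ -f "$workdir/bad.txt" ]; then
        while IFS= read -r line; do
            split_bad_line "$line"
            echo "$bad_file line $bad_lineno: $bad_text"
        done < "$workdir/bad.txt"
    fi
fi
```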
Field Constraints
Due to the way SiLK Flow records are stored, certain field combinations cannot
be supported, certain fields must appear together, and some fields may only be
used on certain occasions:
- Only two of the three time-related values (start time, duration, end time)
may be specified. When all three are specified, the end time is ignored.
This affects the "sTime,9",
"duration,10", and
"eTime,11" fields and the
--stime, --duration, and --etime switches.
- Both ICMP type and ICMP code must be present when one is present. These
may be set by a combination of the
"iType" and
"iCode" fields and the
--icmp-type and --icmp-code switches. These values are
ignored unless either the protocol is ICMP (1) or the record contains IPv6
addresses and the protocol is ICMPv6 (58). The ICMP type and code are
encoded in the destination port field
("dPort,4" or --dport), and they
overwrite the port value for ICMP and ICMPv6 flow records.
- Both initial TCP flags and session TCP flags must be present when one is
present. These may be set by a combination of the
"initialFlags,26" and
"sessionFlags,27" fields and the
--flags-initial and --flags-session switches. These fields
are set to 0 for non-TCP flow records. When either field has a non-zero
value, any value in the (ALL) TCP flags field
("flags,8" or --flags-all) is
overwritten for TCP flow records.
- If the silk.conf(5) file defines more than one class,
both class and type must be present for the values to have any effect on
the SiLK flow record. These may be set by a combination of the
"class" and
"type" fields and the --class and
--type switches. If silk.conf defines a single class, that
class is used by default. The class and type must map to a valid pair; use
rwsiteinfo --fields=class,type to see the list of valid
class/type pairs for your site (cf.
rwsiteinfo(1)).
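To illustrate how an ICMP type and code can share the destination port field, the sketch below packs the type into the high byte and the code into the low byte. This layout is an assumption based on customary SiLK behavior and is not stated on this page; icmp_to_dport is a hypothetical helper, and the addresses are invented.

```shell
# Pack an ICMP type and code into one port-sized value.
# Assumed layout: type in the high byte, code in the low byte.
icmp_to_dport() {
    echo $(( ($1 << 8) | $2 ))
}

icmp_to_dport 8 0   # echo request
icmp_to_dport 3 1   # destination unreachable, host unreachable

# With SiLK installed, set the type and code as fixed values and
# compare the dPort column that rwcut reports.
if command -v rwtuc >/dev/null 2>&1 && command -v rwcut >/dev/null 2>&1; then
    echo '192.0.2.1|198.51.100.7' \
        | rwtuc --fields=sip,dip --no-titles --protocol=1 \
                --icmp-type=8 --icmp-code=0 \
        | rwcut --fields=sip,dip,protocol,dport,itype,icode
fi
```

Both --icmp-type and --icmp-code are given because the constraint above requires the two values to appear together.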
OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact
match for an option. A parameter to an option may be specified as
--arg=param or --arg param, though the first form
is required for options that take optional parameters.
- --fields=FIELDS
- FIELDS contains the list of fields (columns) to parse.
FIELDS is a comma separated list of field-names, field-integers,
and ranges of field-integers; a range is specified by separating the start
and end of the range with a hyphen (-). Field-names are case
insensitive. A field name may not be specified more than once. (As of SiLK
3.15.0, "ignore" may appear multiple
times, allowing multiple input fields to be ignored.)
A field is ignored when its name corresponds to a fixed value
switch (e.g. --protocol) given on the command line (see
"Fixed Values").
The field names and their descriptions are:
- ignore
- a field that rwtuc is to skip
- sIP,1
- source IP address in the canonical form: dotted-quad for IPv4 or
hex-encoded for IPv6 (when SiLK has been compiled with IPv6 support).
Integers from 0 to 4294967295 are treated as IPv4 addresses.
- dIP,2
- destination IP address in the same format as
"sIP,1"
- sPort,3
- source port as an integer from 0 to 65535 inclusive
- dPort,4
- destination port as an integer from 0 to 65535 inclusive (cf. "Field
Constraints")
- protocol,5
- IP protocol as an integer from 0 to 255 inclusive
- packets,pkts,6
- packet count as an integer from 1 to 4294967295 inclusive
- bytes,7
- byte count as an integer from 1 to 4294967295 inclusive
- flags,8
- bit-wise OR of TCP flags over all packets in the flow; the string may
contain "F",
"S",
"R",
"P",
"A",
"U",
"E",
"C" in upper- or lowercase (cf.
"Field Constraints")
- sTime,9
- starting time of the flow, in the form
"YYYY/MM/DD[:hh[:mm[:ss[.sss]]]]". The
letter "T" may be used in place of
":" to separate the day and hour fields.
A floating point value between 536870912 and 4294967295 is also allowed
and is treated as seconds since the UNIX epoch.
- duration,10
- duration of flow as a floating point value from 0.0 to 4294967.295
- eTime,11
- end time of flow in the same form as
"sTime,9" (cf. "Field
Constraints")
- sensor,12
- router sensor name or ID as given in silk.conf (cf.
silk.conf(5))
- class
- class of router at collection point as given in silk.conf (cf.
"Field Constraints")
- type
- type of router at collection point as given in silk.conf (cf.
"Field Constraints")
- in,13
- router SNMP input interface or vlanId; an integer from 0 to 65535
- out,14
- router SNMP output interface or postVlanId; an integer from 0 to
65535
- nhIP,15
- router next hop IP address in the same format as
"sIP,1"
- initialFlags,26
- TCP flags on first packet in the flow; same form as the
"flags,8" field (cf. "Field
Constraints")
- sessionFlags,27
- bit-wise OR of TCP flags on the second through final packet in the flow;
same form as the "flags,8" field (cf.
"Field Constraints")
- attribute,28
- flow attributes set by the flow generator:
- "S"
- all the packets in this flow record are exactly the same size
- "F"
- flow generator saw additional packets in this flow following a packet with
a FIN flag (excluding ACK packets)
- "T"
- flow generator prematurely created a record for a long-running connection
due to a timeout. (When the flow generator yaf(1) is
run with the --silk switch, it prematurely creates a flow and marks
it with "T" if the byte count of the
flow cannot be stored in a 32-bit value.)
- "C"
- flow generator created this flow as a continuation of a long-running
connection, where the previous flow for this connection met a timeout (or
a byte threshold in the case of yaf).
Consider a long-running ssh session that exceeds the flow
generator's active timeout. (This is the active timeout since the
flow generator creates a flow for a connection that still has activity). The
flow generator will create multiple flow records for this ssh session, each
spanning some portion of the total session. The first flow record will be
marked with a "T" indicating that it hit
the timeout. The second through next-to-last records will be marked with
"TC" indicating that this flow both timed
out and is a continuation of a flow that timed out. The final flow will be
marked with a "C", indicating that it was
created as a continuation of an active flow.
- application,29
- guess as to the content of the flow, as an integer from 0 to 65535. Some
software that generates flow records from packet data, such as yaf,
will inspect the contents of the packets that make up a flow and use
traffic signatures to label the content of the flow. SiLK calls this label
the application; yaf refers to it as the appLabel.
The application is the port number that is traditionally used for that
type of traffic (see the /etc/services file on most UNIX systems).
For example, traffic that the flow generator recognizes as FTP will have a
value of 21, even if that traffic is being routed through the standard
HTTP/web port (80).
- iType
- ICMP type as an integer from 0 to 255 inclusive (cf. "Field
Constraints")
- iCode
- ICMP code as an integer from 0 to 255 inclusive (cf. "Field
Constraints")
- --column-separator=CHAR
- Use the character CHAR as the delimiter between columns (fields) in
the input. The default column separator is the vertical pipe
('|'). rwtuc normally ignores
whitespace (space and tab) around the column separator; however, using
space or tab as the separator causes each space or tab character to
be treated as a field delimiter. The newline character is not a valid
delimiter character since it is used to denote records.
- --output-path=PATH
- Write the binary SiLK Flow records to PATH, where PATH is a
filename, a named pipe, the keyword
"stderr" to write the output to the
standard error, or the keyword "stdout"
or "-" to write the output to the
standard output. If PATH names an existing file, rwtuc exits
with an error unless the SILK_CLOBBER environment variable is set, in
which case PATH is overwritten. When PATH ends in
".gz", the output is compressed using
the library associated with gzip(1). If this switch
is not given, the output is written to the standard output. Attempting to
write the binary output to a terminal causes rwtuc to exit with an
error.
- --bad-input-lines=FILEPATH
- Copy any lines which could not be parsed to FILEPATH. The strings
"stdout" and
"stderr" may be used for the standard
output and standard error, respectively. Each bad line is prepended by the
source input file, a colon, the line number, and a colon. On exit,
rwtuc removes FILEPATH if all input lines were successfully
parsed.
- --verbose
- When an input line fails to parse, print a message to the standard error
describing the problem. When this switch is not specified, parsing
failures are not reported. rwtuc continues to process the input
after printing the message. To stop processing when a parsing error
occurs, use --stop-on-error.
- --stop-on-error
- When an input line fails to parse, print a message to the standard error
describing the problem and exit the program. When this occurs, the output
file contains any records successfully created prior to reading the bad
input line. The default behavior of rwtuc is to silently ignore
parsing errors. To report parsing errors and continue processing the
input, use --verbose.
- --no-titles
- Parse the first line of the input as field values. Normally when the
--fields switch is specified, rwtuc examines the first line
to determine if the line contains the names (titles) of fields and skips
the line if it does. rwtuc exits with an error when
--no-titles is given but --fields is not.
- --note-add=TEXT
- Add the specified TEXT to the header of the output file as an
annotation. This switch may be repeated to add multiple annotations to a
file. To view the annotations, use the rwfileinfo(1)
tool.
- --note-file-add=FILENAME
- Open FILENAME and add the contents of that file to the header of
the output file as an annotation. This switch may be repeated to add
multiple annotations. Currently the application makes no effort to ensure
that FILENAME contains text; be careful that you do not attempt to
add a SiLK data file as an annotation.
- --compression-method=COMP_METHOD
- Specify the compression library to use when writing output files. If this
switch is not given, the value in the SILK_COMPRESSION_METHOD environment
variable is used if the value names an available compression method. When
no compression method is specified, output to the standard output or to
named pipes is not compressed, and output to files is compressed using the
default chosen when SiLK was compiled. The valid values for
COMP_METHOD are determined by which external libraries were found
when SiLK was compiled. To see the available compression methods and the
default method, use the --help or --version switch. SiLK can
support the following COMP_METHOD values when the required
libraries are available.
- none
- Do not compress the output using an external library.
- zlib
- Use the zlib(3) library for compressing the output,
and always compress the output regardless of the destination. Using zlib
produces the smallest output files at the cost of speed.
- lzo1x
- Use the lzo1x algorithm from the LZO real time compression library
for compression, and always compress the output regardless of the
destination. This compression provides good compression with less memory
and CPU overhead.
- snappy
- Use the snappy library for compression, and always compress the
output regardless of the destination. This compression provides good
compression with less memory and CPU overhead. Since SiLK
3.13.0.
- best
- Use lzo1x if available, otherwise use snappy if available, otherwise use
zlib if available. Only compress the output when writing to a file.
- --site-config-file=FILENAME
- Read the SiLK site configuration from the named file FILENAME. When
this switch is not provided, rwtuc searches for the site
configuration file in the locations specified in the "FILES"
section.
- --xargs
- --xargs=FILENAME
- Read the names of the input files from FILENAME or from the
standard input if FILENAME is not provided. The input is expected
to have one filename per line. rwtuc opens each named file in turn
and reads text from it as if the filenames had been listed on the command
line. Since SiLK 3.15.0.
- --help
- Print the available options and exit.
- --version
- Print the version number and information about how SiLK was configured,
then exit the application.
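Several of the switches above can be combined when importing comma-separated text. The following is a sketch under stated assumptions: the CSV content and file names are invented, and the SiLK commands run only when the tools are installed.

```shell
# Hypothetical comma-separated input with a title line.
workdir=$(mktemp -d)
csv="$workdir/flows.csv"
printf '%s\n' \
    'sIP,dIP,protocol,packets,bytes' \
    '192.0.2.1,198.51.100.7,6,10,5120' > "$csv"

# Count the columns named on the title line.
columns=$(head -n 1 "$csv" | tr ',' '\n' | wc -l | tr -d ' ')
echo "columns: $columns"

if command -v rwtuc >/dev/null 2>&1; then
    # Read comma-delimited text, annotate the output, and skip compression.
    rwtuc --column-separator=, \
          --note-add='converted from flows.csv' \
          --compression-method=none \
          --output-path="$workdir/flows.rw" "$csv"

    # rwfileinfo prints the file header, which includes the annotation.
    if command -v rwfileinfo >/dev/null 2>&1; then
        rwfileinfo "$workdir/flows.rw"
    fi
fi
```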
Fixed Values
The following switches may be used to set fields to fixed values. A value
specified using one of these switches overrides the field when it appears in the
input, causing that column of input to be completely ignored.
- --saddress=IPADDR
- Set the source address field to IPADDR for all records.
IPADDR may be in canonical notation or an unsigned integer.
- --daddress=IPADDR
- Set the destination address field to IPADDR for all records.
IPADDR may be in canonical notation or an unsigned integer.
- --sport=NUM
- Set the source port field to NUM for all records; a value between 0
and 65535.
- --dport=NUM
- Set the destination port field to NUM for all records; a value
between 0 and 65535. (cf. "Field Constraints")
- --protocol=NUM
- Set the protocol field to NUM for all records; a value between 0
and 255.
- --packets=NUM
- Set the packets field to NUM for all records; the value must be
non-zero.
- --bytes=NUM
- Set the bytes field to NUM for all records; the value must be
non-zero.
- --flags-all=TCPFLAGS
- Set the TCP flags field to TCPFLAGS for all records. (cf.
"Field Constraints")
- --stime=TIME
- Set the start time field to TIME for all records.
- --duration=NUM
- Set the duration field to NUM for all records.
- --etime=TIME
- Set the end time field to TIME for all records. (cf. "Field
Constraints")
- --sensor=SID
- Set the sensor field to SID for all records. This may either be a
sensor name or sensor ID.
- --input-index=NUM
- Set the SNMP input index field to NUM for all records; a value
between 0 and 65535.
- --output-index=NUM
- Set the SNMP output index field to NUM for all records; a value
between 0 and 65535.
- --next-hop-ip=IPADDR
- Set the next-hop-ip field to IPADDR for all records. IPADDR
may be in canonical notation or an unsigned integer.
- --flags-initial=TCPFLAGS
- Set the initial TCP flags field to TCPFLAGS for all records. (cf.
"Field Constraints")
- --flags-session=TCPFLAGS
- Set the session TCP flags field to TCPFLAGS for all records. (cf.
"Field Constraints")
- --attributes=ATTR
- Set the attributes field to ATTR for all records.
- --application=NUM
- Set the application field to NUM for all records; a value between 0
and 65535.
- --class=NAME
- Set the class field to NAME for all records. (cf. "Field
Constraints")
- --type=NAME
- Set the type field to NAME for all records. (cf. "Field
Constraints")
- --icmp-type=NUM
- Set the ICMP type field to NUM for all ICMP or ICMPv6 flow records;
a value between 0 and 255. (cf. "Field Constraints")
- --icmp-code=NUM
- Set the ICMP code field to NUM for all ICMP or ICMPv6 flow records;
a value between 0 and 255. (cf. "Field Constraints")
- --stime+msec=TIME
- An alias for --stime. This switch is deprecated as of SiLK 3.6.0,
and it will be removed in the SiLK 4.0 release.
- --etime+msec=TIME
- An alias for --etime. This switch is deprecated as of SiLK 3.6.0,
and it will be removed in the SiLK 4.0 release.
- --duration+msec=NUM
- An alias for --duration. This switch is deprecated as of SiLK 3.6.0,
and it will be removed in the SiLK 4.0 release.
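A fixed-value switch both overrides a column that is present in the input and supplies values for fields the input lacks. In the sketch below the sample data is invented: the input claims protocol 6, while --protocol=17 is expected to replace it in every record, and --packets and --bytes fill in fields that the text does not contain.

```shell
# Hypothetical input whose protocol column says 6 (TCP).
workdir=$(mktemp -d)
input="$workdir/dns.txt"
printf '%s\n' \
    '192.0.2.1|198.51.100.53|49152|53|6' \
    '192.0.2.2|198.51.100.53|49153|53|6' > "$input"

# Distinct protocol values present in the text input.
input_protocols=$(cut -d'|' -f5 "$input" | sort -u)
echo "protocols in input: $input_protocols"

if command -v rwtuc >/dev/null 2>&1 && command -v rwcut >/dev/null 2>&1; then
    # The protocol column is ignored in favor of the fixed value 17 (UDP).
    rwtuc --fields=sip,dip,sport,dport,protocol --no-titles \
          --protocol=17 --packets=1 --bytes=64 "$input" \
        | rwcut --fields=sip,dip,sport,dport,protocol,packets,bytes
fi
```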
EXAMPLES
In the following examples, the dollar sign
("$") represents the shell prompt. The text
after the dollar sign represents the command line. Lines have been wrapped for
improved readability, and the backslash
("\") is used to indicate a wrapped line.
Using rwtuc to parse the output of
rwcut(1) should produce the same output:
$ rwcut data.rw > cut.txt
$ md5 < cut.txt
7e3d693cd2cba2510803935274e1debd
$ rwtuc < cut.txt | rwcut | md5
7e3d693cd2cba2510803935274e1debd
To swap the source IP and port with the destination IP and port in
flows.rw and save the result in reverse.rw:
$ rwcut --fields=dip,dport,sip,sport,5-15,20-29 flows.rw \
| rwtuc --fields=1-15,20-29 --output-path=reverse.rw
rwtuc may be used to obfuscate the flow data in
myflows.rw to produce obflows.rw. Pipe the output from
rwcut into a script that manipulates the IP addresses, then pipe that
into rwtuc. Using the sed(1) script in
priv.sed, the invocation is:
$ rwcut --fields=1-10,13-15,26-29 myflows.rw \
| sed -f priv.sed \
| rwtuc --sensor=1 > obflows.rw
If the first line of input appears to contain titles, rwtuc
ignores it. In the first invocation below, rwtuc treats
"SP" as an abbreviation for
"sPort" and ignores the line. Use the
--no-titles switch to force rwtuc to parse the line:
$ echo 'SP' | rwtuc --fields=flags | rwcut --fields=flags
flags|
$
$ echo 'SP' | rwtuc --fields=flags --no-titles | rwcut --fields=flags
flags|
S P |
$
By default, rwtuc silently ignores lines that it cannot
parse. Use the --verbose flag to see error messages:
$ echo sport | rwtuc --fields=flags --no-titles --verbose >/dev/null
rwtuc: stdin:1: Invalid flags 'sport': Unexpected character 'o'
ENVIRONMENT
- SILK_CLOBBER
- The SiLK tools normally refuse to overwrite existing files. Setting
SILK_CLOBBER to a non-empty value removes this restriction.
- SILK_COMPRESSION_METHOD
- This environment variable is used as the value for
--compression-method when that switch is not provided. Since
SiLK 3.13.0.
- SILK_CONFIG_FILE
- This environment variable is used as the value for the
--site-config-file when that switch is not provided.
- SILK_DATA_ROOTDIR
- This environment variable specifies the root directory of the data repository.
As described in the "FILES" section, rwtuc may use this
environment variable when searching for the SiLK site configuration
file.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files, rwtuc may use this environment
variable. See the "FILES" section for details.
- TZ
- When a SiLK installation is built to use the local timezone (to determine
if this is the case, check the "Timezone
support" value in the output from rwtuc --version), the
value of the TZ environment variable determines the timezone in which
rwtuc parses timestamps. If the TZ environment variable is not set,
the default timezone is used. Setting TZ to 0 or the empty string causes
timestamps to be parsed as UTC. The value of the TZ environment variable
is ignored when the SiLK installation uses UTC. For system information on
the TZ variable, see tzset(3) or
environ(7).
FILES
- ${SILK_CONFIG_FILE}
- ${SILK_DATA_ROOTDIR}/silk.conf
- /data/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file which are checked
when the --site-config-file switch is not provided.
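The list of locations can be approximated with a short shell function, which may help when checking which silk.conf a given environment would select. This is a sketch only: find_silk_conf is a hypothetical helper, and it assumes the locations are tried in the order listed and that the first existing file wins. How rwtuc itself treats a variable that is set but names a missing file is not stated here.

```shell
# Print the first existing candidate from the documented list.
# Assumption: candidates are tried in the listed order.
find_silk_conf() {
    for candidate in \
        "${SILK_CONFIG_FILE:-}" \
        "${SILK_DATA_ROOTDIR:+$SILK_DATA_ROOTDIR/silk.conf}" \
        /data/silk.conf \
        "${SILK_PATH:+$SILK_PATH/share/silk/silk.conf}" \
        "${SILK_PATH:+$SILK_PATH/share/silk.conf}" \
        /usr/local/share/silk/silk.conf \
        /usr/local/share/silk.conf
    do
        # Unset variables expand to an empty candidate, which is skipped.
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

find_silk_conf || echo "no silk.conf found in the documented locations"
```

To bypass the search entirely, pass the file explicitly with --site-config-file.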
SEE ALSO
rwcut(1), rwfileinfo(1),
rwsiteinfo(1), silk.conf(5),
silk(7), yaf(1),
gzip(1), sed(1),
zlib(3), tzset(3),
environ(7)