rwbagbuild(1) SiLK Tool Suite rwbagbuild(1)

NAME

rwbagbuild - Create a binary Bag from non-flow data

SYNOPSIS

  rwbagbuild { --set-input=SETFILE | --bag-input=TEXTFILE }
        [--delimiter=C] [--proto-port-delimiter=C]
        [--default-count=DEFAULTCOUNT]
        [--key-type=FIELD_TYPE] [--counter-type=FIELD_TYPE]
        [{ --pmap-file=PATH | --pmap-file=MAPNAME:PATH }]
        [--note-add=TEXT] [--note-file-add=FILE]
        [--invocation-strip] [--compression-method=COMP_METHOD]
        [--output-path=PATH]

  rwbagbuild --help

  rwbagbuild --version

DESCRIPTION

rwbagbuild builds a binary Bag file from an IPset file or from textual input. A Bag is a set of keys where each key is associated with a counter. Usually the key is some aspect of a flow record (an IP address, a port, the protocol, et cetera), and the counter is a volume (such as the number of flow records or the sum of bytes or packets) for the flow records that match that key.

Either --set-input or --bag-input must be provided to specify the type and the location of the input file. To read from the standard input, specify "stdin" or "-" as the argument to the switch.

When creating a Bag from an IPset, the value associated with each IP address is the value specified by the --default-count switch or 1 if the switch is not provided.

If the --key-type is "sip-country", "dip-country", or "any-country", each IP address is mapped to its country code using the country code mapping file (see "FILES") and that value is stored in the Bag file.

If the --key-type is "sip-pmap", "dip-pmap", or "any-ip-pmap", each IP address is mapped to a value found in the prefix map file specified in --pmap-file and that value is stored in the Bag file.

The textual input read from the argument to the --bag-input switch is processed a line at a time. Comments begin with a "#" character and continue to the end of the line; they are stripped from each line. Any line that is blank or contains only whitespace is ignored. All other lines must contain a valid key or key-counter pair; whitespace around the key and counter is ignored.

A line may contain only a key or it may contain a key and counter separated by a delimiter. Use --delimiter to specify the delimiter; the accepted formats of the key are described below. A line that does not contain the delimiter character must contain only a key. A line may also contain a key followed by a delimiter with no additional text. In both cases, the count is set to 1. Otherwise, the line must contain a key before the delimiter and an integer counter after the delimiter. These lines may have a delimiter after the counter; this delimiter and any text following it are ignored.

The --default-count switch overrides any counter value present on the line, and any text appearing after the delimiter that follows the key is ignored.

For each key-count pair, the key is inserted into the Bag with its count or, if the key is already present in the Bag, its total count is incremented by the count from this line. When using the --default-count switch, the count for a key that appears in the input N times is the product of N and DEFAULTCOUNT.

rwbagbuild prints an error and exits when a key or counter cannot be parsed.
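
The line-handling rules above can be modeled in a few lines of Python. This is only an illustrative sketch, not rwbagbuild's implementation: the function name parse_bag_text is hypothetical, keys are kept as plain strings instead of being parsed as addresses or integers, and a non-whitespace delimiter is assumed. Keys whose counter is 0 are skipped, which matches the behavior shown in the examples later in this page.

```python
def parse_bag_text(lines, delimiter="|", default_count=None):
    """Model of the textual-input rules; returns a dict of key -> count."""
    bag = {}
    for lineno, raw in enumerate(lines, start=1):
        # Strip the comment, then the whitespace around what remains.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue  # blank and comment-only lines are ignored
        fields = line.split(delimiter)
        key = fields[0].strip()
        if not key:
            raise ValueError(f"line {lineno}: missing key")
        if default_count is not None:
            # Everything after the key's delimiter is ignored.
            count = default_count
        elif len(fields) == 1 or fields[1].strip() == "":
            count = 1  # key alone, or key followed by a bare delimiter
        else:
            # int() raises ValueError for text such as "13 14".
            count = int(fields[1].strip())
            if count < 0:
                raise ValueError(f"line {lineno}: negative count")
        if count == 0:
            continue  # a zero counter does not insert the key
        bag[key] = bag.get(key, 0) + count
    return bag


print(parse_bag_text(["2|3", "2|4", "7"]))  # {'2': 7, '7': 1}
```

With default_count=3, a key that appears on three lines ends up with a count of 9, the product of N and DEFAULTCOUNT described above.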

Format of the Key

The key is a 32-bit integer, an IP address, a CIDR block, a SiLK IPWildcard, or a pair of numbers when the key-type is a protocol-port prefix map file.

For key-types that use fewer than 32 bits, rwbagbuild does not verify the validity of the key. For example, it is possible to have 257 as a key in a Bag whose key-type is protocol.

rwbagbuild parses specific key-types as follows:

sIPv4, dIPv4, nhIPv4, any-IPv4
key is an IPv4 address or a 32-bit value; the key-type is set to the corresponding IPv6 type when an IPv6 address is present. A CIDR block or SiLK IPWildcard representing multiple addresses adds multiple entries to the Bag.
sIPv6, dIPv6, nhIPv6, any-IPv6
key is an IPv6 address. An IPv4 address is mapped into the ::ffff:0:0/96 netblock. All keys must be IP addresses.
flags, initialFlags, sessionFlags
key is the numeric value of the flags; for example, 17 = FIN|ACK
sTime, eTime, any-time
key is seconds since the UNIX epoch
duration
key represents seconds
sensor
key is the numeric sensor ID
sip-country, dip-country, any-country
key is an IP address; the country_codes.pmap prefix map file is used to map the IP to a country code that is stored in the Bag
sip-pmap, dip-pmap, any-ip-pmap
key is an IP address; the prefix map file specified by --pmap-file is used to map the IP to a value that is stored in the Bag
sport-pmap, dport-pmap, any-port-pmap
key consists of two numbers separated by a delimiter: a protocol (8-bit number) and a port (16-bit number). Those values are looked up in the prefix map file specified by --pmap-file and the result is stored in the Bag. The delimiter separating the protocol and port may be set by --proto-port-delimiter. If not explicitly set, it is the same as the delimiter specified to --delimiter. The default delimiter is "|".
attributes
these bits of the key are relevant, though any 32-bit value is accepted: 0x08=F, 0x10=S, 0x20=T, 0x40=C
class, type
key is treated as a number
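
Because rwbagbuild expects numeric keys for flags, attributes, and times, textual values must be converted before they are written to the input file. The following is a minimal sketch of such a conversion. The helper names are hypothetical, the TCP flag values are the standard TCP header bits, and the attribute bits are the ones listed above.

```python
from datetime import datetime, timezone

# Standard TCP header flag bits.
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08,
             "ACK": 0x10, "URG": 0x20, "ECE": 0x40, "CWR": 0x80}

# Attribute bits as listed for the "attributes" key-type.
ATTRIBUTE_BITS = {"F": 0x08, "S": 0x10, "T": 0x20, "C": 0x40}


def bits_key(names, table):
    """Bitwise OR of the named bits, suitable as a numeric key."""
    key = 0
    for name in names:
        key |= table[name.upper()]
    return key


def time_key(moment):
    """Whole seconds since the UNIX epoch for a timezone-aware datetime."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return int(moment.timestamp())


print(bits_key(["FIN", "ACK"], TCP_FLAGS))                   # 17
print(bits_key(["F", "C"], ATTRIBUTE_BITS))                  # 72
print(time_key(datetime(2022, 4, 12, tzinfo=timezone.utc)))  # 1649721600
```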

An IP address or integer key must be expressed in one of the following formats. rwbagbuild complains if the key field contains a mixture of IPv6 addresses and integer values.

  • Dotted decimal---all 4 octets are required:

     10.1.2.4
        
  • An unsigned 32-bit integer:

     167838212
        
  • An IPv6 address in canonical format (when SiLK has been compiled with IPv6 support):

     2001:db8:a:1::2:4
     ::ffff:10.1.2.4
        
  • Any of the above with a CIDR designation---for dotted decimal all four octets are still required:

     10.1.2.4/31
     167838212/31
     2001:db8:a:1::2:4/127
     ::ffff:10.1.2.4/31
        
  • SiLK IP wildcard notation. A SiLK IP Wildcard can represent multiple IPv4 or IPv6 addresses. An IP Wildcard contains an IP in its canonical format, except each part of the IP (where part is an octet for IPv4 or a hexadectet for IPv6) may be a single value, a range, a comma separated list of values and ranges, or the letter "x" to signify all values for that part of the IP (that is, "0-255" for IPv4). You may not specify a CIDR suffix when using the IP Wildcard notation.

     10.x.1-2.4,5
     2001:db8:a:x::1-2:4,5
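
To see which addresses a key in one of these formats stands for, the formats can be expanded with Python's standard ipaddress module. This is a sketch for checking input files, not a description of how rwbagbuild parses keys. The function names are hypothetical, only IPv4 wildcards are handled, and a CIDR key whose host bits are set is masked to its network, which is an assumption about the intended meaning.

```python
import ipaddress
from itertools import product


def expand_cidr(text):
    """IPv4 addresses covered by a dotted-decimal or integer key
    with an optional CIDR suffix."""
    addr, _, prefix = text.partition("/")
    base = ipaddress.IPv4Address(int(addr) if addr.isdigit() else addr)
    network = ipaddress.ip_network(f"{base}/{prefix or 32}", strict=False)
    return [str(ip) for ip in network]


def expand_part(part):
    """Values for one octet: a number, a range, a list, or "x"."""
    if part.lower() == "x":
        return list(range(256))
    values = []
    for piece in part.split(","):
        low, _, high = piece.partition("-")
        values.extend(range(int(low), int(high or low) + 1))
    return values


def expand_ipv4_wildcard(text):
    """IPv4 addresses matched by a SiLK-style IP wildcard."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("an IPv4 wildcard needs four parts")
    octet_choices = [expand_part(p) for p in parts]
    return [".".join(map(str, octets)) for octets in product(*octet_choices)]


def ipv4_mapped(text):
    """The address in ::ffff:0:0/96 that corresponds to an IPv4 address."""
    return ipaddress.IPv6Address((0xFFFF << 32) | int(ipaddress.IPv4Address(text)))


print(int(ipaddress.IPv4Address("10.1.2.4")))      # 167838212
print(expand_cidr("167838212/31"))                 # ['10.1.2.4', '10.1.2.5']
print(len(expand_ipv4_wildcard("10.x.1-2.4,5")))   # 1024
print(ipv4_mapped("10.1.2.4") == ipaddress.IPv6Address("::ffff:10.1.2.4"))  # True
```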
        

OPTIONS

Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.

The following two switches control the type of input; one and only one must be provided:

--set-input=SETFILE
Create a Bag from an IPset. SETFILE is a filename, a named pipe, or the keyword "stdin" or "-" to read the IPset from the standard input. Counts have a volume of 1 when the --default-count switch is not specified. (IPsets are typically created by rwset(1) or rwsetbuild(1).)
--bag-input=TEXTFILE
Create a Bag from a delimited text file. TEXTFILE is a filename, a named pipe, or the keyword "stdin" or "-" to read the text from the standard input. See the "DESCRIPTION" section for the syntax of the TEXTFILE.
--delimiter=C
Expect the character C between each key-counter pair in the TEXTFILE read by the --bag-input switch. The default delimiter is the vertical pipe ("|"). The delimiter is ignored if the --set-input switch is specified. When the delimiter is a whitespace character, any amount of whitespace may surround and separate the key and counter. Since "#" is used to denote comments and newline is used to denote records, neither is a valid delimiter character.
--proto-port-delimiter=C
Expect the character C between the protocol and port that comprise a key when the --key-type is "sport-pmap", "dport-pmap", or "any-port-pmap". Unless this switch is specified, rwbagbuild expects the key-counter delimiter to appear between the protocol and port.
--default-count=DEFAULTCOUNT
Override the counts of all values in the input text or IPset with the value of DEFAULTCOUNT. DEFAULTCOUNT must be a positive integer.
--key-type=FIELD_TYPE
Write an entry into the header of the Bag file that specifies the key contains FIELD_TYPE values. When this switch is not specified, the key type of the Bag is set to "custom". The FIELD_TYPE is case insensitive. The supported FIELD_TYPEs are:
sIPv4
source IP address, IPv4 only
dIPv4
destination IP address, IPv4 only
sPort
source port
dPort
destination port
protocol
IP protocol
packets
packets, see also "sum-packets"
bytes
bytes, see also "sum-bytes"
flags
an unsigned bitwise OR of TCP flags
sTime
starting time of the flow record, seconds resolution
duration
duration of the flow record, seconds resolution
eTime
ending time of the flow record, seconds resolution
sensor
sensor ID
input
SNMP input
output
SNMP output
nhIPv4
next hop IP address, IPv4 only
initialFlags
TCP flags on first packet in the flow
sessionFlags
bitwise OR of TCP flags on all packets in the flow except the first
attributes
flow attributes set by the flow generator
application
guess as to the content of the flow, as set by the flow generator
class
class of the sensor
type
type of the sensor
icmpTypeCode
an encoded version of the ICMP type and code, where the type is in the upper byte and the code is in the lower byte
sIPv6
source IP, IPv6
dIPv6
destination IP, IPv6
nhIPv6
next hop IP, IPv6
records
count of flows
sum-packets
sum of packet counts
sum-bytes
sum of byte counts
sum-duration
sum of duration values
any-IPv4
a generic IPv4 address
any-IPv6
a generic IPv6 address
any-port
a generic port
any-snmp
a generic SNMP value
any-time
a generic time value, in seconds resolution
sip-country
the country code of the source IP address. For textual input, the key column must contain an IP address or an integer. rwbagbuild maps the IP address to a country code and stores the country code in the bag. Uses the mapping file specified by the SILK_COUNTRY_CODES environment variable or the country_codes.pmap mapping file, as described in "FILES". (See also ccfilter(3).) Since SiLK 3.12.0.
dip-country
the country code of the destination IP. See "sip-country". Since SiLK 3.12.0.
any-country
the country code of any IP address. See "sip-country". Since SiLK 3.12.0.
sip-pmap
a prefix map value found from a source IP address. Maps each IP address in the key column to a value from a prefix map file and stores the value in the bag. The type of the prefix map must be IPv4-address or IPv6-address. Use the --pmap-file switch to specify the path to the file. Since SiLK 3.12.0.
dip-pmap
a prefix map value found from a destination IP address. See "sip-pmap". Since SiLK 3.12.0.
any-ip-pmap
a prefix map value found from any IP address. See "sip-pmap". Since SiLK 3.12.0.
sport-pmap
a prefix map value found from a protocol/source-port pair. Each key must contain two values, a protocol and a port. Maps each protocol/port pair to a value from a prefix map file and stores the value in the bag. The type of the prefix map must be proto-port. Use the --pmap-file switch to specify the path to the file. Since SiLK 3.12.0.
dport-pmap
a prefix map value found from a protocol/destination-port pair. See "sport-pmap". Since SiLK 3.12.0.
any-port-pmap
a prefix map value found from a protocol/port pair. See "sport-pmap". Since SiLK 3.12.0.
custom
a number
--counter-type=FIELD_TYPE
Write an entry into the header of the Bag file that specifies the counter contains FIELD_TYPE values. When this switch is not specified, the counter type of the Bag is set to "custom". Although the supported FIELD_TYPEs are the same as those for the key, the value is always treated as a number that can be summed. rwbagbuild does not use the country code or prefix map when parsing the value field.
--pmap-file=PATH
--pmap-file=MAPNAME:PATH
When the key-type is one of "sip-pmap", "dip-pmap", "any-ip-pmap", "sport-pmap", "dport-pmap", or "any-port-pmap", use the prefix map file located at PATH to map the key to a string. Specify PATH as "-" or "stdin" to read from the standard input. A map-name may be included in the argument to the switch, but rwbagbuild currently does not use the map-name. To create a prefix map file, use rwpmapbuild(1). Since SiLK 3.12.0.
--note-add=TEXT
Add the specified TEXT to the header of the output file as an annotation. This switch may be repeated to add multiple annotations to a file. To view the annotations, use the rwfileinfo(1) tool.
--note-file-add=FILENAME
Open FILENAME and add the contents of that file to the header of the output file as an annotation. This switch may be repeated to add multiple annotations. Currently the application makes no effort to ensure that FILENAME contains text; be careful that you do not attempt to add a SiLK data file as an annotation.
--invocation-strip
Do not record the command used to create the Bag file in the output. When this switch is not given, the invocation is written to the file's header, and the invocation may be viewed with rwfileinfo(1). Since SiLK 3.12.0.
--compression-method=COMP_METHOD
Specify the compression library to use when writing output files. If this switch is not given, the value in the SILK_COMPRESSION_METHOD environment variable is used if the value names an available compression method. When no compression method is specified, output to the standard output or to named pipes is not compressed, and output to files is compressed using the default chosen when SiLK was compiled. The valid values for COMP_METHOD are determined by which external libraries were found when SiLK was compiled. To see the available compression methods and the default method, use the --help or --version switch. SiLK can support the following COMP_METHOD values when the required libraries are available.
none
Do not compress the output using an external library.
zlib
Use the zlib(3) library for compressing the output, and always compress the output regardless of the destination. Using zlib produces the smallest output files at the cost of speed.
lzo1x
Use the lzo1x algorithm from the LZO real time compression library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead.
snappy
Use the snappy library for compression, and always compress the output regardless of the destination. This compression provides good compression with less memory and CPU overhead. Since SiLK 3.13.0.
best
Use lzo1x if available, otherwise use snappy if available, otherwise use zlib if available. Only compress the output when writing to a file.
--output-path=PATH
Write the binary Bag output to PATH, where PATH is a filename, a named pipe, the keyword "stderr" to write the output to the standard error, or the keyword "stdout" or "-" to write the output to the standard output. If PATH names an existing file, rwbagbuild exits with an error unless the SILK_CLOBBER environment variable is set, in which case PATH is overwritten. If this switch is not given, the output is written to the standard output. Attempting to write the binary output to a terminal causes rwbagbuild to exit with an error.
--help
Print the available options and exit.
--version
Print the version number and information about how SiLK was configured, then exit the application.
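
The icmpTypeCode key-type listed under --key-type packs two values into one number. As a minimal sketch, assuming "upper byte" and "lower byte" refer to a 16-bit value, the key for textual input can be computed and decoded as follows. The function names are hypothetical.

```python
def icmp_type_code_key(icmp_type, icmp_code):
    """Pack an ICMP type (upper byte) and code (lower byte) into one key."""
    if not (0 <= icmp_type <= 255 and 0 <= icmp_code <= 255):
        raise ValueError("type and code must each fit in one byte")
    return (icmp_type << 8) | icmp_code


def split_icmp_type_code(key):
    """Recover the (type, code) pair from a packed key."""
    return key >> 8, key & 0xFF


print(icmp_type_code_key(3, 1))    # 769
print(split_icmp_type_code(2048))  # (8, 0)
```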

EXAMPLES

In the following examples, the dollar sign ("$") represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash ("\") is used to indicate a wrapped line.

Assume the file mybag.txt contains the following lines, where each line contains an IP address, a comma as a delimiter, a count, and ends with a newline.

 192.168.0.1,5
 192.168.0.2,500
 192.168.0.3,3
 192.168.0.4,14
 192.168.0.5,5

To build a bag with it:

 $ rwbagbuild --bag-input=mybag.txt --delimiter=, > mybag.bag

Use rwbagcat(1) to view its contents:

 $ rwbagcat mybag.bag
     192.168.0.1|                   5|
     192.168.0.2|                 500|
     192.168.0.3|                   3|
     192.168.0.4|                  14|
     192.168.0.5|                   5|

To create a Bag of protocol data from the text file myproto.txt:

   1|      4|
   6|    138|
  17|    131|

use

 $ rwbagbuild --key-type=proto --bag-input=myproto.txt > myproto.bag
 $ rwbagcat myproto.bag
          1|                   4|
          6|                 138|
         17|                 131|

When the --key-type switch is specified, rwbagcat knows the keys should be printed as integers, and rwfileinfo(1) shows the type of the key:

 $ rwfileinfo --fields=bag myproto.bag
 myproto.bag:
   bag            key: protocol @ 4 octets; counter: custom @ 8 octets

Without the --key-type switch, rwbagbuild assumes the integers in myproto.txt represent IP addresses:

 $ rwbagbuild --bag-input=myproto.txt | rwbagcat
         0.0.0.1|                   4|
         0.0.0.6|                 138|
        0.0.0.17|                 131|

Although the --key-format switch on rwbagcat may be used to choose how the keys are displayed, it is generally better to use the --key-type switch when creating the bag.

 $ rwbagbuild --bag-input=myproto.txt | rwbagcat --key-format=decimal
          1|                   4|
          6|                 138|
         17|                 131|

To ignore the counts that exist in myproto.txt and set the count for each protocol to 1, use the --default-count switch, which overrides the existing value:

 $ rwbagbuild --key-type=protocol --bag-input=myproto.txt  \
        --default-count=1 --output-path=myproto1.bag
 $ rwbagcat myproto1.bag
          1|                   1|
          6|                   1|
         17|                   1|

To create a bag from multiple text files (X.txt, Y.txt, and Z.txt), use the UNIX cat(1) utility to concatenate the files and have rwbagbuild read the combined input. To avoid creating a temporary file, feed the output of cat as the standard input to rwbagbuild.

 $ cat X.txt Y.txt Z.txt                                \
   | rwbagbuild --bag-input=- --output-path=xyz.bag

For each key that appears in multiple input files, rwbagbuild sums the counters for the key.

Given the IP set myset.set, create a bag where every entry in the bag has a count of 3:

 $ rwbagbuild --set-input=myset.set --default-count=3  \
        --out=mybag2.bag

Suppose we have three IPset files, A.set, B.set, and C.set:

 $ rwsetcat A.set
 10.0.0.1
 10.0.0.2
 $ rwsetcat B.set
 10.0.0.2
 10.0.0.3
 $ rwsetcat C.set
 10.0.0.1
 10.0.0.2
 10.0.0.4

We want to create a bag file from these IPset files where the count for each IP address is the number of files that IP appears in. rwbagbuild accepts a single file as an argument, so we cannot do the following:

 $ rwbagbuild --set-input=A.set --set-input=B.set ...   # WRONG!

(Even if we could repeat the --set-input switch, specifying it multiple times would be annoying if we had 300 files instead of only 3.)

Since IPset files are (mathematical) sets, joining them together first with rwsettool(1) and then running rwbagbuild causes each IP address to get a count of 1:

 $ rwsettool --union A.set B.set C.set   \
   | rwbagbuild --set-input=-            \
   | rwbagcat
        10.0.0.1|                   1|
        10.0.0.2|                   1|
        10.0.0.3|                   1|
        10.0.0.4|                   1|

When rwbagbuild is processing textual input, it sums the counters for keys that appear in the input multiple times. We can use rwsetcat(1) to convert each IPset file to text and feed that as a single textual stream to rwbagbuild. Use the --cidr-blocks switch on rwsetcat to reduce the amount of input that rwbagbuild must process. This is probably the best approach to the problem:

 $ rwsetcat --cidr-block *.set | rwbagbuild --bag-input=- > total1.bag
 $ rwbagcat total1.bag
        10.0.0.1|                   2|
        10.0.0.2|                   3|
        10.0.0.3|                   1|
        10.0.0.4|                   1|
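
The difference between the two approaches can be reproduced with ordinary Python sets and a Counter. This is only a model of the arithmetic, using the contents of A.set, B.set, and C.set shown above. It does not read SiLK files.

```python
from collections import Counter

# Contents of the three IPset files from the example above.
ipsets = {
    "A.set": {"10.0.0.1", "10.0.0.2"},
    "B.set": {"10.0.0.2", "10.0.0.3"},
    "C.set": {"10.0.0.1", "10.0.0.2", "10.0.0.4"},
}

# Union first: duplicates collapse, so every address gets a count of 1.
union = set().union(*ipsets.values())
bag_from_union = {ip: 1 for ip in sorted(union)}

# Concatenated text: each appearance of an address adds 1 to its counter.
bag_from_text = Counter()
for members in ipsets.values():
    bag_from_text.update(members)

print(bag_from_union)
print(dict(sorted(bag_from_text.items())))
```

The second dictionary holds 2, 3, 1, and 1 for 10.0.0.1 through 10.0.0.4, the same counts that rwbagcat prints for total1.bag.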

A less efficient solution is to convert each IPset to a bag and then use rwbagtool(1) to add the bags together:

 $ for i in *.set ; do
        rwbagbuild --set-input=$i --output-path=/tmp/$i.bag ;
   done
 $ rwbagtool --add /tmp/*.set.bag > total2.bag
 $ rm /tmp/*.set.bag

There is no need to create a bag file for each IPset; we can get by with only two bag files, the final bag file, total3.bag, and a temporary file, tmp.bag. We initialize total3.bag to an empty bag. As we loop over each IPset, rwbagbuild converts the IPset to a bag on its standard output, rwbagtool creates tmp.bag by adding its standard input to total3.bag, and we rename tmp.bag to total3.bag:

 $ rwbagbuild --bag-input=/dev/null --output-path=total3.bag
 $ for i in *.set ; do
        rwbagbuild --set-input=$i  \
        | rwbagtool --output-path=tmp.bag --add total3.bag stdin ;
        /bin/mv tmp.bag total3.bag ;
   done
 $ rwbagcat total3.bag
        10.0.0.1|                   2|
        10.0.0.2|                   3|
        10.0.0.3|                   1|
        10.0.0.4|                   1|

As of SiLK 3.12.0, a Bag file may contain a country code as its key. In rwbagbuild, specify the --key-type as "sip-country", "dip-country", or "any-country". That key-type works with either textual input or IPset input. The form of the textual input when mapping an IP address to a country code is identical to that when building an ordinary bag.

 $ rwbagbuild --bag-input=mybag.txt --delimiter=,       \
        --key-type=any-country --output-path=scc1.bag
 $ rwbagcat scc1.bag
 --|                 527|

 $ rwbagbuild --set-input=A.set --key-type=any-country  \
        --output-path=scc2.bag
 $ rwbagcat scc2.bag
 --|                   2|

rwbagbuild and rwbag(1) can use a prefix map file as the key in a Bag file as of SiLK 3.12.0. Use the --pmap-file switch to specify the prefix map file, and specify the --key-type using one of the types that end in "-pmap".

For a prefix map that maps by IP addresses, use a key-type of "sip-pmap", "dip-pmap", or "any-ip-pmap". The input may be an IPset or text. The form of the textual input is the same as for a normal bag file.

 $ rwbagbuild --set-input=A.set --key-type=sip-pmap     \
        --pmap-file=ip-map.pmap --output=test1.bag

 $ rwbagbuild --bag-input=mybag.txt --delimiter=,       \
        --key-type=sip-pmap --pmap-file=ip-map.pmap     \
        --output-path=test2.bag

The prefix map file is not stored as part of the Bag, so you must provide the name of the prefix map when running rwbagcat(1).

 $ rwbagcat --pmap-file=ip-map.pmap test2.bag
          internal|                 527|
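
The single "internal" entry with a count of 527 arises because every address in mybag.txt maps to the same prefix map value, so the five counters are added together. A sketch of that aggregation follows. The dictionary PREFIX_LABELS is a hypothetical stand-in for ip-map.pmap, whose real contents are not shown in this page, and the "UNKNOWN" fallback label is likewise an assumption.

```python
import ipaddress
from collections import Counter

# Hypothetical stand-in for the contents of ip-map.pmap.
PREFIX_LABELS = {"192.168.0.0/16": "internal"}


def lookup(ip, default="UNKNOWN"):
    """Label of the longest matching prefix, or a default label."""
    address = ipaddress.ip_address(ip)
    best = None
    for prefix, label in PREFIX_LABELS.items():
        network = ipaddress.ip_network(prefix)
        if address in network and (best is None or network.prefixlen > best[0]):
            best = (network.prefixlen, label)
    return best[1] if best else default


# The key-counter pairs from mybag.txt.
entries = [("192.168.0.1", 5), ("192.168.0.2", 500), ("192.168.0.3", 3),
           ("192.168.0.4", 14), ("192.168.0.5", 5)]

bag = Counter()
for ip, count in entries:
    bag[lookup(ip)] += count  # keys that map to the same label are summed

print(dict(bag))  # {'internal': 527}
```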

For a prefix map file that maps by protocol-port pairs, the textual input must contain either three columns (protocol, port, counter) or two columns (protocol and port), in which case the counter is the --default-count value, or 1 when that switch is not given.

 $ cat proto-port-count.txt
 6| 25|  800|
 6| 80| 5642|
 6| 22
 $ rwbagbuild --key-type=sport-pmap                 \
        --bag-input=proto-port-count.txt            \
        --pmap-file=proto-port-map.pmap             \
        --output-path=service.bag
 $ rwbagcat --pmap-file=proto-port-map.pmap service.bag
   TCP/SSH|                   1|
  TCP/SMTP|                 800|
  TCP/HTTP|                5642|

A single value followed by an optional delimiter is treated as a key. The counter for those keys is set to 1. A delimiter may follow the count, and any text after that delimiter is ignored. When the counter is 0, the key is not inserted into the Bag.

 $ cat sport.txt
 0
 1|
 2|3
 4|5|
 6|7|8|
 9|10|||||
 11|0
 $ rwbagbuild --bag-input=sport.txt --key-type=sport \
   | rwbagcat
          0|                   1|
          1|                   1|
          2|                   3|
          4|                   5|
          6|                   7|
          9|                  10|

The --default-count switch overrides the count.

 $ rwbagbuild --bag-input=sport.txt --key-type=sport --default-count=1 \
   | rwbagcat
          0|                   1|
          1|                   1|
          2|                   1|
          4|                   1|
          6|                   1|
          9|                   1|
         11|                   1|

In fact, the --default-count switch causes rwbagbuild to ignore all text after the delimiter that follows the key.

 $ echo '12|13 14' | rwbagbuild --bag-input=- --output=/dev/null
 rwbagbuild: Error parsing line 1: Extra text after count
 rwbagbuild: Error creating bag from text bag

 $ echo '12|13 14' | rwbagbuild --bag-input=- --default-count=1 \
   | rwbagcat --key-format=decimal
         12|                   1|

ENVIRONMENT

SILK_COUNTRY_CODES
This environment variable allows the user to specify the country code mapping file that rwbagbuild uses when mapping an IP to a country for the "sip-country", "dip-country", or "any-country" keys. The value may be a complete path or a file relative to the SILK_PATH. See the "FILES" section for standard locations of this file.
SILK_CLOBBER
The SiLK tools normally refuse to overwrite existing files. Setting SILK_CLOBBER to a non-empty value removes this restriction.
SILK_COMPRESSION_METHOD
This environment variable is used as the value for --compression-method when that switch is not provided. Since SiLK 3.13.0.
SILK_PATH
This environment variable gives the root of the install tree. When searching for the country code mapping file, rwbagbuild may use this environment variable. See the "FILES" section for details.

FILES

$SILK_COUNTRY_CODES
$SILK_PATH/share/silk/country_codes.pmap
$SILK_PATH/share/country_codes.pmap
/usr/local/share/silk/country_codes.pmap
/usr/local/share/country_codes.pmap
Possible locations for the country code mapping file required by the "sip-country", "dip-country", and "any-country" key-types.
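
As a rough aid for troubleshooting, the candidate locations above can be listed for a given environment. The sketch below assumes the files are tried in the order printed in this page, which the page does not state explicitly. The function name is hypothetical, and a relative SILK_COUNTRY_CODES value is returned unchanged instead of being resolved against SILK_PATH.

```python
import os
import posixpath


def country_code_candidates(environ=None):
    """Candidate paths for the country code mapping file, in listed order."""
    env = os.environ if environ is None else environ
    candidates = []
    if env.get("SILK_COUNTRY_CODES"):
        candidates.append(env["SILK_COUNTRY_CODES"])
    if env.get("SILK_PATH"):
        root = env["SILK_PATH"]
        candidates.append(posixpath.join(root, "share", "silk", "country_codes.pmap"))
        candidates.append(posixpath.join(root, "share", "country_codes.pmap"))
    candidates.append("/usr/local/share/silk/country_codes.pmap")
    candidates.append("/usr/local/share/country_codes.pmap")
    return candidates


for path in country_code_candidates({"SILK_PATH": "/opt/silk"}):
    print(path)
```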

SEE ALSO

rwbag(1), rwbagcat(1), rwbagtool(1), rwfileinfo(1), rwpmapbuild(1), rwset(1), rwsetbuild(1), rwsetcat(1), rwsettool(1), ccfilter(3), silk(7), zlib(3), cat(1)

BUGS

rwbagbuild should verify the key's value is within the allowed range for the specified --key-type.

rwbagbuild should accept non-numeric values for some fields, such as times and TCP flags.

The --default-count switch is poorly named.

2022-04-12 SiLK 3.19.1
