NAME

silk.conf - SiLK site configuration file

DESCRIPTION

The "silk.conf" SiLK site configuration file is used to associate symbolic names with flow collection information stored in SiLK Flow records.

In addition to the information contained in the NetFlow or IPFIX flow record (e.g., source and destination addresses and ports, IP protocol, time stamps, data volume), every SiLK Flow record has two additional pieces of information that are added when rwflowpack(8) converts the NetFlow or IPFIX record to the SiLK format:

The sensor typically denotes the location where the flow data was collected; e.g., an organization that is instrumenting its border routers would create a sensor to represent each router. Each sensor has a unique name and numeric ID.
The flowtype represents information about how the flow was routed (e.g., as incoming or outgoing) or other information about the flow (e.g., web or non-web). The packing process categorizes each flow into a flowtype. Each flowtype has a unique name and numeric ID.

Note that the binary form of SiLK Flow records represents the sensor and flowtype by their numeric IDs, not by their names.

For historical reasons, one rarely speaks of the flowtype of a SiLK Flow record, but instead refers to its class and type. Every flowtype maps to a unique class/type pair. The classes and types have names only; they do not have numeric IDs. Note that flowtype and type are different concepts despite the similarity of their names.

A class is generally used to represent topological features of the network with different collections of sensors, since every active sensor must belong to one or more classes. Every class must have a unique name.

A type is used to distinguish traffic within a single topological area based on some other dimension. For example, incoming and outgoing traffic are generally separated into different types. Web traffic is also frequently split into a separate type from normal traffic in order to partition the data better. The type names within a class must be unique, but multiple classes may have a type with the same name.

As stated above, each class/type pair maps to a unique flowtype.

The "silk.conf" file defines

the mapping between sensor names and sensor IDs
the names of the available classes
the sensors that belong to each class
the names of the types in each class
the mapping from a class/type pair to a flowtype ID
the mapping between a flowtype name and a flowtype ID
the default class to use for rwfilter(1) and rwfglob(1) queries
for each class, the default types to use for rwfilter and rwfglob
the layout of the directory tree for the SiLK archive relative to the root directory
a default value for the --packing-logic switch to rwflowpack(8)

In normal usage, the "silk.conf" file will be located at the root of the SiLK data spool referenced by the SILK_DATA_ROOTDIR environment variable, or specified on the command line using the --data-rootdir flag. This ensures that the sensor and class definitions in the site configuration match the data in the flow records you retrieve.

If you cannot place the site configuration file in the data root directory, or the file in that location is incorrect, you can use the SILK_CONFIG_FILE environment variable to specify the location of your configuration file (including the file name). Many SiLK commands provide the --site-config-file switch which allows you to specify the name of the site configuration file on the command line.

By having the site configuration information outside of the SiLK tools, a single SiLK installation can be used to query different data stores (though each invocation of a command can only query one storage location).

Any additions or modifications to the "silk.conf" file will be seen by all SiLK applications upon their next invocation. There are some important things to keep in mind when modifying the "silk.conf" file:

Once data has been collected for a sensor or a flowtype, the sensor or flowtype should never be removed or renumbered. SiLK Flow files store the sensor ID and flowtype ID as integers; removing or renumbering a sensor or flowtype breaks this mapping. In order to keep the mapping consistent, old sensor and flowtype definitions should remain indefinitely. Completely unused sensors or flowtypes may be removed, but the IDs of the remaining sensors and flowtypes must not be modified.
The path to the files in the SiLK data store often involves the sensor name, flowtype name, class name, and/or type name. If any of those names are changed, it will be necessary to rename all the previously packed data files that have the former name as part of their path.
If the SiLK installation at your site is distributed across multiple hosts (for example, if packing occurs on a machine separate from analysis), it is important to synchronize changes to the "silk.conf" files.
The packing logic plug-in file, packlogic-*.so (e.g., packlogic-twoway(3), packlogic-generic(3)), used by rwflowpack(8) checks for specific class names, type names, and flowtype names at start up, and it will exit with an error if the names it expects do not exist. In addition, it checks that the flowtype IDs it has match those in the "silk.conf" file. When new flowtypes are added, the packlogic-*.so file will need to be updated if rwflowpack is to generate SiLK Flow records with the new flowtype.
When rwflowpack reads incoming flow records, those records are associated with a sensor name as determined by the sensor.conf(5) file. rwflowpack uses the "silk.conf" file to map the sensor name to the sensor ID, and it stores the sensor ID in the SiLK records it creates. Changes to the "silk.conf" and "sensor.conf" files may need to be coordinated.
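
The rule that existing IDs must never change can be checked mechanically before a modified file is deployed. The following Python sketch shows one way to do that, assuming the sensor (or flowtype) definitions of the old and new files have already been read into name-to-ID dictionaries. The function name compare_id_maps and the sample tables are hypothetical and are not part of SiLK.

```python
def compare_id_maps(old, new):
    """Compare two {name: numeric ID} mappings.

    Returns (renumbered, removed): the names whose ID differs between
    the two mappings, and the names that exist only in the old mapping.
    """
    renumbered = sorted(
        name for name, ident in old.items()
        if name in new and new[name] != ident)
    removed = sorted(name for name in old if name not in new)
    return renumbered, removed


# Hypothetical sensor tables from the old and the edited silk.conf.
old_sensors = {"S001": 0, "S002": 1, "S003": 2}
new_sensors = {"S001": 0, "S002": 5, "S004": 3}

renumbered, removed = compare_id_maps(old_sensors, new_sensors)
print("renumbered:", renumbered)
print("removed:", removed)
```

A non-empty "renumbered" list always indicates a problem. A name in the "removed" list is acceptable only if no data was ever collected for it.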

SYNTAX

In the site configuration file, each line may be blank, or contain any amount of leading whitespace, which is ignored. At any location in a line, the character "#" indicates the beginning of a comment, which reaches until the end of the line. (If a literal "#" symbol is required in the argument of any command, it may be quoted as described below.) These comments are ignored.

Each non-empty line begins with a command name, followed by one or more arguments. A command name is a sequence of non-whitespace characters that does not include "#" or the double-quote character (see below for valid commands). Arguments may be either textual atoms (any sequence of characters, including numerals and punctuation, that contains no whitespace, "#", or double-quote character) or quoted strings. A quoted string begins and ends with a double-quote character and allows C-style backslash escapes in between. The character "#" inside a quoted string does not begin a comment, and whitespace is allowed inside a quoted string.
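
As a rough illustration of these tokenizing rules, the following Python sketch splits one line into its command name and arguments. It is not the parser used by the SiLK tools: the function name tokenize_line is hypothetical, and only a handful of backslash escapes are handled, which is an assumption made to keep the sketch short.

```python
# Illustrative sketch only; this is not SiLK's parser.
# Only these C-style escapes are recognized here (an assumption).
ESCAPES = {"n": "\n", "t": "\t", "\\": "\\", '"': '"'}


def tokenize_line(line):
    """Split one configuration line into a list of tokens."""
    tokens = []
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch.isspace():
            i += 1
        elif ch == "#":
            break  # a comment runs to the end of the line
        elif ch == '"':
            i += 1  # skip the opening quote
            buf = []
            while i < n and line[i] != '"':
                if line[i] == "\\" and i + 1 < n:
                    i += 1
                    buf.append(ESCAPES.get(line[i], line[i]))
                else:
                    buf.append(line[i])
                i += 1
            if i >= n:
                raise ValueError("unterminated quoted string")
            i += 1  # skip the closing quote
            tokens.append("".join(buf))
        else:
            # A textual atom ends at whitespace, "#", or a double quote.
            start = i
            while i < n and not line[i].isspace() and line[i] not in '#"':
                i += 1
            tokens.append(line[start:i])
    return tokens


tokens = tokenize_line('sensor 0 S001 "Primary connection to ISP"  # border')
print(tokens)
```

The first token is the command name and the remaining tokens are its arguments. A blank line or a line holding only a comment yields an empty list.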

For the commands supported by "silk.conf" and described below, unless a command explicitly states that it is used by particular applications, it should be considered used by all of the SiLK analysis tools and the packing tools flowcap(8), rwflowpack(8), and rwflowappend(8).

There are three contexts for commands: top-level, class block, and group block contexts. The class block and group block contexts are used to describe individual features of classes and groups, while top-level commands are used to describe the entire configuration, and to define sensors.

The valid commands for each context are described below.

Top-Level Commands

class class-name

The "class" command begins a new class block. It takes as an argument the name of the class being defined. Each class must have a unique name. A class block is closed with the "end class" command. See below for a list of commands valid inside class blocks.

The class name must begin with a letter, must not be longer than 32 characters, and may not contain whitespace characters or the character slash ("/").

A site that does not use multiple classes should define a single class with a name like "all" or "default".

To be valid, a configuration file must contain at least one class definition.

Example: class all

default-class class-name

rwfilter(1) and rwfglob(1) will use a default class when the user does not specify an explicit --class. This command specifies that default class; the class must have been created prior to this command. If more than one default class is set, the last definition encountered is used.

Example: default-class all

group group-name

Sensor groups are a convenient way of defining named groupings of sensors for inclusion in classes. They cannot currently be used in the SiLK command-line tools, but only in the configuration file. The "group" command takes as an argument the group to be defined, and begins a group block. A group block is closed using the "end group" command. See below for details on valid commands within group blocks.

Example: group test-sensors

include "file-name"

The "include" command is used to include the contents of another file. This may be used to separate large configurations into logical units. (Note, however, that all sensors, classes, groups, and types must be declared before they may be referenced.)

Example: include "silk-2.conf"

packing-logic "file-name"

The "packing-logic" command provides a default value for the --packing-logic switch on rwflowpack(8). The value is the path to a plug-in that rwflowpack loads; the plug-in provides functions that determine into which class and type a flow record will be categorized. The path specified here will be ignored when the --packing-logic switch is explicitly specified to rwflowpack or when SiLK has been configured with hard-coded packing logic.

Example: packing-logic "packlogic-twoway.so"

path-format "format-string"

File and directory locations relative to the SILK_DATA_ROOTDIR may be defined using the "path-format" command. The "path-format" is used by rwflowpack and rwflowappend(8) when writing data to the data repository, and it is used by rwfilter and rwfglob when reading or listing files in the data repository. This command takes a format string specification that supports the following "%"-conversions:

%C: The textual class name
%F: The textual flowtype name for this class/type pair (see also %f)
%H: The hour (24-hour clock) as a two-digit, zero-padded number
%N: The textual sensor name (see also %n)
%T: The textual type name
%Y: The year as a four-digit, zero-padded number
%d: The day of the month as a two-digit, zero-padded number
%f: The flowtype ID, as an unpadded number (see also %F)
%m: The month of the year as a two-digit, zero-padded number
%n: The sensor ID, as an unpadded number (see also %N)
%x: The default file name, which is equivalent to "%F-%N_%Y%m%d.%H"
%%: A literal "%" character

A "%" followed by any other character is an error.

For example, to place all spooled files directly in the data root directory, the path format "%x" could be used. To use two levels of hierarchy, the first containing the year and month, and the second containing the day and sensor name, like "2006-01/23-alpha/...", the format would be "%Y-%m/%d-%N/%x".

If no path format is set by the configuration file, the default path format of "%T/%Y/%m/%d/%x" is used.

All path formats are currently required to end in "/%x" so that information may be extracted from the file name. This requirement may be lifted in the future.

Example: path-format "%C/%T/%Y/%m/%d/%x"
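
To make the "%"-conversions concrete, the following Python sketch expands a format string for a single data file. It only mirrors the conversion table above and is not the code used by rwflowpack or rwfilter. The function name expand_path_format, its parameters, and the sample class, type, flowtype, and sensor values are hypothetical.

```python
from datetime import datetime


def expand_path_format(fmt, class_name, type_name, flowtype_name,
                       flowtype_id, sensor_name, sensor_id, start):
    """Expand the "%"-conversions in `fmt` for one data file."""
    values = {
        "C": class_name,
        "F": flowtype_name,
        "H": "%02d" % start.hour,
        "N": sensor_name,
        "T": type_name,
        "Y": "%04d" % start.year,
        "d": "%02d" % start.day,
        "f": str(flowtype_id),
        "m": "%02d" % start.month,
        "n": str(sensor_id),
        "%": "%",
    }
    # %x is shorthand for "%F-%N_%Y%m%d.%H".
    values["x"] = "%s-%s_%s%s%s.%s" % (
        values["F"], values["N"],
        values["Y"], values["m"], values["d"], values["H"])

    out = []
    i = 0
    while i < len(fmt):
        if fmt[i] != "%":
            out.append(fmt[i])
            i += 1
            continue
        # A "%" followed by any other character is an error.
        if i + 1 >= len(fmt) or fmt[i + 1] not in values:
            raise ValueError("invalid conversion at offset %d" % i)
        out.append(values[fmt[i + 1]])
        i += 2
    return "".join(out)


path = expand_path_format("%Y-%m/%d-%N/%x", "all", "in", "allin", 0,
                          "alpha", 3, datetime(2006, 1, 23, 14))
print(path)  # 2006-01/23-alpha/allin-alpha_20060123.14
```

The resulting path is relative to the data root directory.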

sensor sensor-id sensor-name

sensor sensor-id sensor-name "sensor-description"

Individual sensor definitions are created with the "sensor" command. This command creates a new sensor with the given name and numeric ID. Sensor names must begin with a letter, must not be longer than 64 characters, and may not contain whitespace characters or the characters slash ("/") or underscore ("_").

The sensor line may also provide an optional description of the sensor, enclosed in double quotes. The description can be used however your installation chooses to use it. The description may be viewed by specifying the "describe-sensor" field to rwsiteinfo(1). (When using sensor descriptions, the file's "version" must be 2.)

It is an error to define two different sensors with the same sensor ID or the same sensor name.

NOTE: It is extremely important not to change the sensor-id or sensor-name for a given sensor once that sensor is in use. The sensor-id field is stored numerically in SiLK data files, and the sensor-name field is used to construct file names within the data root directory.

Example: sensor 0 S001

Example: sensor 0 S001 "Primary connection to ISP"
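
The naming and uniqueness rules for sensors can be summarized in a few lines of Python. This is a sketch of the rules as stated above, not SiLK's implementation. The function name add_sensor and the dictionary layout are hypothetical, and "letter" is read as an ASCII letter, which is an assumption.

```python
import re

# A letter followed by up to 63 characters that are not whitespace,
# "/" or "_".  "Letter" is read as an ASCII letter (an assumption).
SENSOR_NAME = re.compile(r"[A-Za-z][^\s/_]{0,63}")


def add_sensor(sensors, sensor_id, name, description=None):
    """Record one sensor in `sensors`, a {sensor ID: (name, description)} dict."""
    if SENSOR_NAME.fullmatch(name) is None:
        raise ValueError("invalid sensor name: %r" % name)
    if sensor_id in sensors:
        raise ValueError("duplicate sensor ID: %d" % sensor_id)
    if any(existing == name for existing, _ in sensors.values()):
        raise ValueError("duplicate sensor name: %r" % name)
    sensors[sensor_id] = (name, description)


sensors = {}
add_sensor(sensors, 0, "S001", "Primary connection to ISP")
add_sensor(sensors, 1, "S002")
print(sensors)
```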

version version-number

The "version" command declares that this configuration file conforms to a given version of the configuration file format. If the tools do not support this version of the configuration file, they will report an error. Currently, versions 1 and 2 of the format are defined, where version 2 indicates that sensor descriptions are present.

It is a recommended practice to include the version number at the beginning of all configuration files for compatibility with future versions.

Example: version 1

Class Block Commands

The commands inside a class block define the class's types, its default types, the sensors that belong to it, and the mapping from the class/type pair to the flowtype name and flowtype ID.

end class

The "end class" command ends the definition of a class. Following an "end class" command, top-level commands are again accepted.

Example: end class

default-types type-name ...

When no types are specified for the "rwfilter" or "rwfglob" commands, the default set of types for the selected class is used. Each of the types listed in this command is included as a default type of the class.

Example: default-types in inweb

sensors sensor-name-or-group-ref ...

The "sensors" command is used to associate sensors with a class; that is, it declares that these sensors have data for this class. Each item in the list must be either the name of a sensor or the name of a sensor group preceded by an at ("@") character. When you add a sensor group, it is the same as individually adding each sensor in that group to the class.

Example: sensors my-sensor-1 my-sensor-2 @my-group-1

type flowtype-id type-name [ flowtype-name ]

The "type" command defines a type name within the current class, and it specifies the flowtype ID to use for that class/type pair. In addition, the "type" command may specify a flowtype name. The flowtype ID and flowtype name must be unique across the entire "silk.conf" file (and any included files). If a flowtype name is not specified, a default flowtype name is constructed by concatenating the name of the class and the name of the type. For example, the type "in" in the class "all" would have a flowtype name of "allin". Within a class, each type must have a unique name, but multiple classes may use the same type name. The type name and flowtype name must begin with a letter, must not be longer than 32 characters, and may not contain whitespace characters or the character slash ("/").

As with sensors, it is important to be careful when renumbering flowtype IDs or renaming types or flowtypes because the numeric IDs are stored in data files, and the textual names are used as portions of file and path names.

Example: type 0 in

Example: type 1 out out
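
The relationship between a class/type pair, its flowtype ID, and its flowtype name can be sketched in Python as follows. This only restates the rules above and is not SiLK's implementation. The function name define_type and the record layout are hypothetical, and the checks on name length and characters are omitted for brevity.

```python
def define_type(flowtypes, class_name, flowtype_id, type_name,
                flowtype_name=None):
    """Add one class/type pair to `flowtypes`, a {flowtype ID: record} dict.

    Returns the flowtype name that was assigned.
    """
    if flowtype_name is None:
        # Default: the class name immediately followed by the type name.
        flowtype_name = class_name + type_name
    if flowtype_id in flowtypes:
        raise ValueError("duplicate flowtype ID: %d" % flowtype_id)
    for rec in flowtypes.values():
        if rec["flowtype"] == flowtype_name:
            raise ValueError("duplicate flowtype name: %r" % flowtype_name)
        if (rec["class"], rec["type"]) == (class_name, type_name):
            raise ValueError("type %r already defined in class %r"
                             % (type_name, class_name))
    flowtypes[flowtype_id] = {
        "class": class_name, "type": type_name, "flowtype": flowtype_name}
    return flowtype_name


flowtypes = {}
print(define_type(flowtypes, "all", 0, "in"))          # allin
print(define_type(flowtypes, "all", 1, "out", "out"))  # out
```

Because the flowtype name must be unique across the whole file, a second class that also defines a type named "out" would need either the default name or a different explicit flowtype name.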

Group Block Commands

A group block is a convenience used to define a list of sensors.