pysilk(3)   SiLK Tool Suite   pysilk(3)
This document describes the features of PySiLK, the SiLK Python
extension. It documents the objects and methods that allow one to read,
manipulate, and write SiLK Flow records, IPsets, Bags, and Prefix Maps (pmaps)
from within python(1). PySiLK may be used in a
stand-alone Python script or as a plug-in from within the SiLK tools
rwfilter(1), rwcut(1),
rwgroup(1), rwsort(1),
rwstats(1), and rwuniq(1). This
document describes the objects and methods that PySiLK provides; the details
of using those from within a plug-in are documented in the
silkpython(3) manual page.
The SiLK Python extension defines the following objects and
modules:
- IPAddr object
- Represents an IP Address.
- IPv4Addr object
- Represents an IPv4 Address.
- IPv6Addr object
- Represents an IPv6 Address.
- IPWildcard object
- Represents CIDR blocks or SiLK IP wildcard addresses.
- IPSet object
- Represents a SiLK IPset.
- PrefixMap object
- Represents a SiLK Prefix Map.
- Bag object
- Represents a SiLK Bag.
- TCPFlags object
- Represents TCP flags.
- RWRec object
- Represents a SiLK Flow record.
- SilkFile object
- Represents a channel for writing to or reading from SiLK Flow files.
- FGlob object
- Allows retrieval of filenames in a SiLK data store. See also the
silk.site module.
- silk.site module
- Defines several functions that relate to the SiLK site configuration and
allow iteration over the files in a SiLK data store.
- silk.plugin module
- Defines functions that may only be used in SiLK Python plug-ins.
The SiLK Python extension provides the following functions:
- silk.get_configuration(name=None)
- When name is None, return a dictionary whose keys specify
aspects of how SiLK was compiled. When name is provided, return the
dictionary value for that key, or None when name is an
unknown key. The dictionary's keys and their meanings are:
- COMPRESSION_METHODS
- A list of strings specifying the compression methods that were compiled
into this build of SiLK. The list will contain one or more of
"NO_COMPRESSION",
"ZLIB",
"LZO1X", and/or
"SNAPPY".
- INITIAL_TCPFLAGS_ENABLED
- True if SiLK was compiled with support for initial TCP flags;
False otherwise.
- IPV6_ENABLED
- True if SiLK was compiled with IPv6 support; False
otherwise.
- SILK_VERSION
- The version of SiLK linked with PySiLK, as a string.
- TIMEZONE_SUPPORT
- The string "UTC" if SiLK was compiled to
use UTC, or the string "local" if SiLK
was compiled to use the local timezone.
- silk.ipv6_enabled()
- Return True if SiLK was compiled with IPv6 support, False
otherwise.
- silk.initial_tcpflags_enabled()
- Return True if SiLK was compiled with support for initial TCP
flags, False otherwise.
- silk.init_country_codes(filename=None)
- Initialize PySiLK's country code database. filename should be the
path to a country code prefix map, as created by
rwgeoip2ccmap(1). If filename is not
supplied, SiLK will look first for the file specified by
$SILK_COUNTRY_CODES, and then for a file named
country_codes.pmap in $SILK_PATH/share/silk,
$SILK_PATH/share, /usr/local/share/silk, and
/usr/local/share. (The latter two assume that SiLK was installed in
/usr/local.) This function raises a RuntimeError if loading the
country code prefix map fails.
- silk.silk_version()
- Return the version of SiLK linked with PySiLK, as a string.
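The search order described for silk.init_country_codes() when filename is omitted can be sketched in plain Python. This is only an illustration: the helper name country_code_candidates and its env parameter are hypothetical and not part of PySiLK, the sketch lists candidate paths without opening any file, and the /usr/local entries assume the default install prefix mentioned above.

```python
import posixpath


def country_code_candidates(env):
    """List candidate country-code pmap paths in the documented search order.

    `env` is a mapping of environment variables (for example os.environ).
    Hypothetical helper: it models the description above, not SiLK's code.
    """
    candidates = []
    explicit = env.get("SILK_COUNTRY_CODES")
    if explicit:
        # An explicitly named file is tried first.
        candidates.append(explicit)
    silk_path = env.get("SILK_PATH")
    if silk_path:
        candidates.append(posixpath.join(silk_path, "share", "silk", "country_codes.pmap"))
        candidates.append(posixpath.join(silk_path, "share", "country_codes.pmap"))
    # Assumes SiLK was installed under /usr/local.
    candidates.append("/usr/local/share/silk/country_codes.pmap")
    candidates.append("/usr/local/share/country_codes.pmap")
    return candidates
```

Calling country_code_candidates(os.environ) would show which locations are worth checking when init_country_codes() raises RuntimeError.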
An IPAddr object represents an IPv4 or IPv6 address. These two types of
addresses are represented by two subclasses of IPAddr: IPv4Addr
and IPv6Addr.
- class silk.IPAddr(address)
- The constructor takes a string address, which must be a string
representation of either an IPv4 or IPv6 address, or an IPAddr object.
IPv6 addresses are only accepted if
silk.ipv6_enabled() returns True. The
IPAddr object that the constructor returns will be either an
IPv4Addr object or an IPv6Addr object.
For compatibility with releases prior to SiLK 2.2.0, the
IPAddr constructor will also accept an integer address, in
which case it converts that integer to an IPv4Addr object. This
behavior is deprecated. Use the IPv4Addr and IPv6Addr
constructors instead.
Examples:
>>> addr1 = IPAddr('192.160.1.1')
>>> addr2 = IPAddr('2001:db8::1428:57ab')
>>> addr3 = IPAddr('::ffff:12.34.56.78')
>>> addr4 = IPAddr(addr1)
>>> addr5 = IPAddr(addr2)
>>> addr6 = IPAddr(0x10000000) # Deprecated as of SiLK 2.2.0
Supported operations and methods:
- Comparison Operations
- In all of the comparison operations below, whenever an IPv4 address is
compared to an IPv6 address, the IPv4 address is converted to an IPv6
address before comparison. This means that IPAddr("0.0.0.0")
== IPAddr("::ffff:0.0.0.0").
- addr1 == addr2
- Return True if addr1 is equal to addr2; False
otherwise.
- addr1 != addr2
- Return False if addr1 is equal to addr2; True
otherwise.
- addr1 < addr2
- Return True if addr1 is less than addr2; False
otherwise.
- addr1 <= addr2
- Return True if addr1 is less than or equal to addr2;
False otherwise.
- addr1 >= addr2
- Return True if addr1 is greater than or equal to
addr2; False otherwise.
- addr1 > addr2
- Return True if addr1 is greater than addr2;
False otherwise.
- addr.is_ipv6()
- Return True if addr is an IPv6 address, False
otherwise.
- addr.isipv6()
- (DEPRECATED in SiLK 2.2.0) An alias for
is_ipv6().
- addr.to_ipv6()
- If addr is an IPv6Addr, return a copy of addr.
Otherwise, return a new IPv6Addr mapping addr into the
::ffff:0:0/96 prefix.
- addr.to_ipv4()
- If addr is an IPv4Addr, return a copy of addr. If
addr is in the ::ffff:0:0/96 prefix, return a new IPv4Addr
containing the IPv4 address. Otherwise, return None.
- int(addr)
- Return the integer representation of addr. For an IPv4 address,
this is a 32-bit number. For an IPv6 address, this is a 128-bit
number.
- str(addr)
- Return a human-readable representation of addr in its canonical
form.
- addr.padded()
- Return a human-readable representation of addr which is fully
padded with zeroes. With IPv4, it will return a string of the form
"xxx.xxx.xxx.xxx". With IPv6, it will return a string of the
form "xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx".
- addr.octets()
- Return a tuple of integers representing the octets of addr. The
tuple's length is 4 for an IPv4 address and 16 for an IPv6 address.
- addr.mask(mask)
- Return a copy of addr masked by the IPAddr mask.
When both addresses are either IPv4 or IPv6, applying the mask
is straightforward.
If addr is IPv6 but mask is IPv4, mask is
converted to IPv6 and then the mask is applied. This may produce an
unexpected result.
If addr is IPv4 and mask is IPv6, addr
will remain an IPv4 address if masking mask with
"::ffff:0000:0000" results in
"::ffff:0000:0000" (namely, if bytes
10 and 11 of mask are 0xFFFF). Otherwise, addr is
converted to an IPv6 address and the mask is applied in IPv6 space,
which may produce an unexpected result.
- addr.mask_prefix(prefix)
- Return a copy of addr masked by the high prefix bits. All
bits below the prefixth bit will be set to zero. The maximum value
for prefix is 32 for an IPv4Addr, and 128 for an
IPv6Addr.
- addr.country_code()
- Return the two character country code associated with addr. If no
country code is associated with addr, return None. The
country code association is initialized by the
silk.init_country_codes() function. If
init_country_codes() is not called before calling
this method, it will act as if init_country_codes()
was called with no argument.
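The conversions and prefix masking described above can be illustrated with the standard library's ipaddress module. This sketch does not call PySiLK; the function names mirror to_ipv6(), to_ipv4(), and mask_prefix() only for readability, and the behavior is modeled on the descriptions above rather than on SiLK's implementation.

```python
import ipaddress

# The IPv4-mapped IPv6 block named in the text.
MAPPED = ipaddress.ip_network("::ffff:0:0/96")


def to_ipv6(addr):
    """Map an IPv4 address into ::ffff:0:0/96; leave IPv6 unchanged."""
    if addr.version == 6:
        return addr
    return ipaddress.IPv6Address(int(MAPPED.network_address) | int(addr))


def to_ipv4(addr):
    """Return the embedded IPv4 address, or None when there is none."""
    if addr.version == 4:
        return addr
    if addr in MAPPED:
        return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)
    return None


def mask_prefix(addr, prefix):
    """Keep the high `prefix` bits of addr and zero the rest."""
    bits = addr.max_prefixlen  # 32 for IPv4, 128 for IPv6
    if not 0 <= prefix <= bits:
        raise ValueError("prefix out of range")
    mask = ((1 << prefix) - 1) << (bits - prefix)
    return type(addr)(int(addr) & mask)
```

For instance, mask_prefix(ipaddress.ip_address("192.160.1.1"), 24) yields 192.160.1.0, which is the result one would expect from the corresponding PySiLK call.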
An IPv4Addr object represents an IPv4 address. IPv4Addr is a
subclass of IPAddr, and supports all operations and methods that
IPAddr supports.
- class silk.IPv4Addr(address)
- The constructor takes an address, which must be a string
representation of an IPv4 address, an IPAddr object, or an integer. A
string will be parsed as an IPv4 address. An IPv4Addr object will
be copied. An IPv6Addr object will be converted to an IPv4 address,
or a ValueError will be raised if the conversion is not possible. A 32-bit
integer will be converted to an IPv4 address.
Examples:
>>> addr1 = IPv4Addr('192.160.1.1')
>>> addr2 = IPv4Addr(IPAddr('::ffff:12.34.56.78'))
>>> addr3 = IPv4Addr(addr1)
>>> addr4 = IPv4Addr(0x10000000)
An IPv6Addr object represents an IPv6 address. IPv6Addr is a
subclass of IPAddr, and supports all operations and methods that
IPAddr supports.
- class silk.IPv6Addr(address)
- The constructor takes an address, which must be a string
representation of an IPv6 address, an IPAddr object, or an
integer. A string will be parsed as an IPv6 address. An IPv6Addr
object will be copied. An IPv4Addr object will be converted to an
IPv6 address. A 128-bit integer will be converted to an IPv6 address.
Examples:
>>> addr1 = IPv6Addr('2001:db8::1428:57ab')
>>> addr2 = IPv6Addr(IPAddr('192.160.1.1'))
>>> addr3 = IPv6Addr(addr1)
>>> addr4 = IPv6Addr(0x100000000000000000000000)
An IPWildcard object represents a range or block of IP addresses. The
IPWildcard object handles iteration over IP addresses with for
x in wildcard.
- class silk.IPWildcard(wildcard)
- The constructor takes a string representation wildcard of the
wildcard address. The string wildcard can be an IP address, an IP
with a CIDR notation, an integer, an integer with a CIDR designation, or
an entry in SiLK wildcard notation. In SiLK wildcard notation, a wildcard
is represented as an IP address in canonical form with each octet (IPv4)
or hexadectet (IPv6) represented by one of the following: a value, a range of
values, a comma separated list of values and ranges, or the character 'x'
used to represent the entire octet or hexadectet. IPv6 wildcard addresses
are only accepted if silk.ipv6_enabled() returns
True. The wildcard element can also be an IPWildcard, in
which case a duplicate reference is returned.
Examples:
>>> a = IPWildcard('1.2.3.0/24')
>>> b = IPWildcard('ff80::/16')
>>> c = IPWildcard('1.2.3.4')
>>> d = IPWildcard('::ffff:0102:0304')
>>> e = IPWildcard('16909056')
>>> f = IPWildcard('16909056/24')
>>> g = IPWildcard('1.2.3.x')
>>> h = IPWildcard('1:2:3:4:5:6:7:x')
>>> i = IPWildcard('1.2,3.4,5.6,7')
>>> j = IPWildcard('1.2.3.0-255')
>>> k = IPWildcard('::2-4')
>>> l = IPWildcard('1-2:3-4:5-6:7-8:9-a:b-c:d-e:0-ffff')
>>> m = IPWildcard(a)
Supported operations and methods:
- addr in wildcard
- Return True if addr is in wildcard, False
otherwise.
- addr not in wildcard
- Return False if addr is in wildcard, True
otherwise.
- string in wildcard
- Return the result of IPAddr(string) in
wildcard.
- string not in wildcard
- Return the result of IPAddr(string) not in
wildcard.
- wildcard.is_ipv6()
- Return True if wildcard contains IPv6 addresses,
False otherwise.
- str(wildcard)
- Return the string that was used to construct wildcard.
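To make the SiLK wildcard notation concrete, the following sketch expands an IPv4 wildcard string into the addresses it covers. It is a simplified model written for this page, not PySiLK code: the function names are hypothetical, and it handles only the dotted form with values, ranges, comma-separated lists, and 'x', not CIDR, integer, or IPv6 forms.

```python
import itertools


def expand_octet(spec):
    """Expand one octet spec such as '7', '2,3', '0-255', or 'x'."""
    if spec == "x":
        return list(range(256))
    values = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = (int(piece) for piece in part.split("-"))
            values.update(range(low, high + 1))
        else:
            values.add(int(part))
    if any(not 0 <= value <= 255 for value in values):
        raise ValueError("octet out of range: %r" % spec)
    return sorted(values)


def expand_ipv4_wildcard(wildcard):
    """Yield every dotted-quad address matched by an IPv4 wildcard."""
    octets = wildcard.split(".")
    if len(octets) != 4:
        raise ValueError("expected four octets: %r" % wildcard)
    choices = [expand_octet(octet) for octet in octets]
    # The wildcard matches the cross product of the per-octet choices.
    for combo in itertools.product(*choices):
        yield ".".join(str(value) for value in combo)
```

Under this model, '1.2,3.4,5.6,7' covers eight addresses, from 1.2.4.6 through 1.3.5.7, and '1.2.3.x' covers the same 256 addresses as '1.2.3.0/24'.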
An IPSet object represents a set of IP addresses, as produced by
rwset(1) and rwsetbuild(1). The
IPSet object handles iteration over IP addresses with for
x in set, and iteration over
CIDR blocks using for x in
set.cidr_iter().
In the following documentation, an ip_iterable can be any
of:
- an IPAddr object representing an IP address
- the string representation of a valid IP address
- an IPWildcard object
- the string representation of an IPWildcard
- an iterable of any combination of the above
- another IPSet object
- class silk.IPSet([ip_iterable])
- The constructor creates an empty IPset. If an ip_iterable is
supplied as an argument, each member of ip_iterable will be added
to the IPset.
Other constructors, all class methods:
- silk.IPSet.load(path)
- Create an IPSet by reading a SiLK IPset file. path must be a
valid location of an IPset.
Other class methods:
- silk.IPSet.supports_ipv6()
- Return whether this implementation of IPsets supports IPv6 addresses.
Supported operations and methods:
In the lists of operations and methods below,
- set is an IPSet object
- addr can be an IPAddr object or the string representation of
an IP address.
- set2 is an IPSet object. The operator versions of the
methods require an IPSet object.
- ip_iterable is an iterable over IP addresses as accepted by the
IPSet constructor. Consider ip_iterable as creating a
temporary IPSet to perform the requested method.
The following operations and methods do not modify the
IPSet:
- set.cardinality()
- Return the cardinality of set.
- len(set)
- Return the cardinality of set. In Python 2.x, this method will
raise OverflowError if the number of IPs in the set cannot be
represented by Python's Plain Integer type--that is, if the value is
larger than "sys.maxint". The
cardinality() method will not raise this
exception.
- set.is_ipv6()
- Return True if set is a set of IPv6 addresses, and
False if it is a set of IPv4 addresses. For the purposes of this
method, IPv4-in-IPv6 addresses (that is, addresses in the ::ffff:0:0/96
prefix) are considered IPv6 addresses.
- addr in set
- Return True if addr is a member of set; False
otherwise.
- addr not in set
- Return False if addr is a member of set; True
otherwise.
- set.copy()
- Return a new IPSet with a copy of set.
- set.issubset(ip_iterable)
- set <= set2
- Return True if every IP address in set is also in
set2. Return False otherwise.
- set.issuperset(ip_iterable)
- set >= set2
- Return True if every IP address in set2 is also in
set. Return False otherwise.
- set.union(ip_iterable[, ...])
- set | other | ...
- Return a new IPset containing the IP addresses in set and
all others.
- set.intersection(ip_iterable[, ...])
- set & other & ...
- Return a new IPset containing the IP addresses common to set
and others.
- set.difference(ip_iterable[, ...])
- set - other - ...
- Return a new IPset containing the IP addresses in set but not in
others.
- set.symmetric_difference(ip_iterable)
- set ^ other
- Return a new IPset containing the IP addresses in either set or in
other but not in both.
- set.isdisjoint(ip_iterable)
- Return True when none of the IP addresses in ip_iterable are
present in set. Return False otherwise.
- set.cidr_iter()
- Return an iterator over the CIDR blocks in set. Each iteration
returns a 2-tuple, the first element of which is the first IP address in
the block, the second of which is the prefix length of the block. Can be
used as for (addr, prefix) in
set.cidr_iter().
- set.save(filename, compression=DEFAULT)
- Save the contents of set in the file filename. The
compression determines the compression method used when outputting
the file. Valid values are the same as those in
silk.silkfile_open().
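The distinction above between the method forms, which accept any ip_iterable, and the operator forms, which require an IPSet, matches the behavior of Python's built-in set type. The sketch below uses a built-in set of address strings purely as a stand-in; it does not use PySiLK, and the helper name operator_requires_set is hypothetical.

```python
def operator_requires_set(base, other):
    """Return True when `base | other` is rejected with TypeError."""
    try:
        base | other
    except TypeError:
        return True
    return False


base = {"10.0.0.1", "10.0.0.2"}
extra = ["10.0.0.2", "10.0.0.3"]  # a plain iterable, not a set

# The method form accepts any iterable.
merged = base.union(extra)
subset = base.issubset(merged)

# The operator form rejects the list but accepts a real set.
rejected = operator_requires_set(base, extra)
accepted = not operator_requires_set(base, set(extra))
```

With an IPSet, the analogous pattern would presumably be to wrap the iterable in IPSet(...) before using an operator, or to call the named method directly.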
The following operations and methods will modify the
IPSet:
- set.add(addr)
- Add addr to set and return set. To add multiple IP
addresses, use the add_range() or
update() methods.
- set.discard(addr)
- Remove addr from set if addr is present; do nothing
if it is not. Return set. To discard multiple IP addresses, use the
difference_update() method. See also the
remove() method.
- set.remove(addr)
- Similar to discard(), but raise KeyError if
addr is not a member of set.
- set.pop()
- Remove and return an arbitrary address from set. Raise
KeyError if set is empty.
- set.clear()
- Remove all IP addresses from set and return set.
- set.convert(version)
- Convert set to an IPv4 IPset if version is 4 or to an IPv6
IPset if version is 6. Return set. Raise ValueError
if version is not 4 or 6. If version is 4 and set
contains IPv6 addresses outside of the ::ffff:0:0/96 prefix, raise
ValueError and leave set unchanged.
- set.add_range(start, end)
- Add all IP addresses between start and end, inclusive, to
set. Raise ValueError if end is less than
start.
- set.update(ip_iterable[, ...])
- set |= other | ...
- Add the IP addresses specified in others to set; the result
is the union of set and others.
- set.intersection_update(ip_iterable[, ...])
- set &= other & ...
- Remove from set any IP address that does not appear in
others; the result is the intersection of set and
others.
- set.difference_update(ip_iterable[, ...])
- set -= other | ...
- Remove from set any IP address found in others; the result
is the difference of set and others.
- set.symmetric_difference_update(ip_iterable)
- set ^= other
- Update set, keeping the IP addresses found in set or in
other but not in both.
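The relationship between add_range() and the (address, prefix) pairs produced by cidr_iter() can be approximated with the standard library. This sketch is not PySiLK: range_to_cidr is a hypothetical helper, it returns address strings instead of IPAddr objects, and it assumes that a contiguous range is reported as the minimal list of CIDR blocks in ascending order.

```python
import ipaddress


def range_to_cidr(start, end):
    """Return (first_address, prefix_length) pairs covering start..end inclusive."""
    first = ipaddress.ip_address(start)
    last = ipaddress.ip_address(end)
    if last < first:
        # Mirrors the documented ValueError when end is less than start.
        raise ValueError("end is less than start")
    return [
        (str(network.network_address), network.prefixlen)
        for network in ipaddress.summarize_address_range(first, last)
    ]
```

A range that is not aligned to a power of two, such as 10.0.0.1 through 10.0.0.4, splits into several blocks, which is why iterating over CIDR blocks can return more entries than the number of add_range() calls.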
An RWRec object represents a SiLK Flow record.
- class silk.RWRec([rec],[field=value],...)
- This constructor creates an empty RWRec object. If an RWRec
rec is supplied, the constructor will create a copy of it. The
variable rec can also be a dictionary, such as that supplied by the
as_dict() method. Initial values for record fields
can be included as keyword arguments.
Example:
>>> recA = RWRec(input=10, output=20)
>>> recB = RWRec(recA, output=30)
>>> (recA.input, recA.output)
(10, 20)
>>> (recB.input, recB.output)
(10, 30)
Instance attributes:
Accessing or setting attributes on an RWRec whose
descriptions mention functions in the silk.site module causes the
silk.site.init_site() function to be called with no
argument if it has not yet been called successfully---that is, if
silk.site.have_site_config() returns False.
- rec.application
- The service port of the flow rec as set by the flow meter if
the meter supports it, a 16-bit integer. The yaf(1)
flow meter refers to this value as the appLabel. The default
application value is 0.
- rec.bytes
- The count of the number of bytes in the flow rec, a 32-bit integer.
The default bytes value is 0.
- rec.classname
- (READ ONLY) The class name assigned to the flow rec, a string. This
value is the first member of the tuple returned by the
"rec.classtype" attribute, which see.
- rec.classtype
- A 2-tuple containing the classname and the typename of the flow
rec. Getting the value returns the result of
silk.site.classtype_from_id(rec.classtype_id). If
that function throws an error, the result is a 2-tuple containing the
string "?" and a string representation
of "rec.classtype_id". Setting the value to
(class,type) sets rec.classtype_id to the result of
silk.site.classtype_id(class,type). If that
function throws an error because the (class,type) pair is
unknown, rec is unchanged and ValueError is thrown.
- rec.classtype_id
- The ID for the class and type of the flow rec, an 8-bit integer.
The default classtype_id value is 255. Changes to this value are reflected
in the "rec.classtype" attribute. The classtype_id
attribute may be set to a value that is considered invalid by
silk.site.
- rec.dip
- The destination IP of the flow rec, an IPAddr object. The
default dip value is IPAddr('0.0.0.0'). May be set using a string
containing a valid IP address.
- rec.dport
- The destination port of the flow rec, a 16-bit integer. The default
dport value is 0. Since the destination port field is also used to store
the values for the ICMP type and code, setting this value may modify
rec.icmptype and rec.icmpcode.
- rec.duration
- The duration of the flow rec, a datetime.timedelta object.
The default duration value is 0. Changing the rec.duration
attribute will modify the rec.etime attribute such that
(rec.etime - rec.stime) == the new rec.duration. The
maximum possible duration is datetime.timedelta(milliseconds=0xffffffff).
See also rec.duration_secs.
- rec.duration_secs
- The duration of the flow rec in seconds, a float that includes
fractional seconds. The default duration_secs value is 0. Changing the
rec.duration_secs attribute will modify the rec.etime
attribute in the same way as changing rec.duration. The maximum
possible duration_secs value is 4294967.295.
- rec.etime
- The end time of the flow rec, a datetime.datetime object.
The default etime value is the UNIX epoch time,
datetime.datetime(1970,1,1,0,0). Changing the rec.etime attribute
modifies the flow record's duration. If the new duration would become
negative or would become larger than RWRec supports, a
ValueError will be raised. See also
rec.etime_epoch_secs.
- rec.etime_epoch_secs
- The end time of the flow rec as a number of seconds since the epoch
time, a float that includes fractional seconds. Epoch time is 1970-01-01
00:00:00 UTC. The default etime_epoch_secs value is 0. Changing the
rec.etime_epoch_secs attribute modifies the flow record's duration.
If the new duration would become negative or would become larger than
RWRec supports, a ValueError will be raised.
- rec.initial_tcpflags
- The TCP flags on the first packet of the flow rec, a
TCPFlags object. The default initial_tcpflags value is None.
The rec.initial_tcpflags attribute may be set to a new
TCPFlags object, or a string or number which can be converted to a
TCPFlags object by the TCPFlags() constructor.
Setting rec.initial_tcpflags when rec.session_tcpflags is
None sets the latter to TCPFlags(''). Setting
rec.initial_tcpflags or rec.session_tcpflags sets
rec.tcpflags to the binary OR of their values. Trying to set
rec.initial_tcpflags when rec.protocol is not 6 (TCP) will
raise an AttributeError.
- rec.icmpcode
- The ICMP code of the flow rec, an 8-bit integer. The default
icmpcode value is 0. The value is only meaningful when rec.protocol
is ICMP (1) or when rec.is_ipv6() is
True and rec.protocol is ICMPv6 (58). Since a record's ICMP
type and code are stored in the destination port, setting this value may
modify rec.dport.
- rec.icmptype
- The ICMP type of the flow rec, an 8-bit integer. The default
icmptype value is 0. The value is only meaningful when rec.protocol
is ICMP (1) or when rec.is_ipv6() is
True and rec.protocol is ICMPv6 (58). Since a record's ICMP
type and code are stored in the destination port, setting this value may
modify rec.dport.
- rec.input
- The SNMP interface where the flow rec entered the router or the
vlanId if the packing tools are configured to capture it (see
sensor.conf(5)), a 16-bit integer. The default input
value is 0.
- rec.nhip
- The next-hop IP of the flow rec as set by the router, an
IPAddr object. The default nhip value is IPAddr('0.0.0.0'). May be
set using a string containing a valid IP address.
- rec.output
- The SNMP interface where the flow rec exited the router or the
postVlanId if the packing tools are configured to capture it (see
sensor.conf(5)), a 16-bit integer. The default output
value is 0.
- rec.packets
- The packet count for the flow rec, a 32-bit integer. The default
packets value is 0.
- rec.protocol
- The IP protocol of the flow rec, an 8-bit integer. The default
protocol value is 0. Setting rec.protocol to a value other than 6
(TCP) causes rec.initial_tcpflags and rec.session_tcpflags
to be set to None.
- rec.sensor
- The name of the sensor where the flow rec was collected, a string.
Getting the value returns the result of
silk.site.sensor_from_id(rec.sensor_id). If that
function throws an error, the result is a string representation of
"rec.sensor_id" or the string
"?" when sensor_id is 65535. Setting the
value to sensor_name sets rec.sensor_id to the result of
silk.site.sensor_id(sensor_name). If that function
throws an error because sensor_name is unknown, rec is
unchanged and ValueError is thrown.
- rec.sensor_id
- The ID of the sensor where the flow rec was collected, a 16-bit
integer. The default sensor_id value is 65535. Changes to this value are
reflected in the "rec.sensor" attribute. The sensor_id
attribute may be set to a value that is considered invalid by
silk.site.
- rec.session_tcpflags
- The union of the flags of all but the first packet in the flow rec,
a TCPFlags object. The default session_tcpflags value is
None. The rec.session_tcpflags attribute may be set to a new
TCPFlags object, or a string or number which can be converted to a
TCPFlags object by the TCPFlags() constructor.
Setting rec.session_tcpflags when rec.initial_tcpflags is
None sets the latter to TCPFlags(''). Setting
rec.initial_tcpflags or rec.session_tcpflags sets
rec.tcpflags to the binary OR of their values. Trying to set
rec.session_tcpflags when rec.protocol is not 6 (TCP) will
raise an AttributeError.
- rec.sip
- The source IP of the flow rec, an IPAddr object. The default
sip value is IPAddr('0.0.0.0'). May be set using a string containing a
valid IP address.
- rec.sport
- The source port of the flow rec, a 16-bit integer. The default sport
value is 0.
- rec.stime
- The start time of the flow rec, a datetime.datetime object.
The default stime value is the UNIX epoch time,
datetime.datetime(1970,1,1,0,0). Modifying the rec.stime attribute
will modify the flow's end time such that rec.duration is constant.
The maximum possible stime is 2038-01-19 03:14:07 UTC. See also
rec.stime_epoch_secs.
- rec.stime_epoch_secs
- The start time of the flow rec as a number of seconds since the
epoch time, a float that includes fractional seconds. Epoch time is
1970-01-01 00:00:00 UTC. The default stime_epoch_secs value is 0. Changing
the rec.stime_epoch_secs attribute will modify the flow's end time
such that rec.duration is constant. The maximum possible
stime_epoch_secs is 2147483647 (2^31-1).
- rec.tcpflags
- The union of the TCP flags of all packets in the flow rec, a
TCPFlags object. The default tcpflags value is
TCPFlags(' '). The rec.tcpflags attribute may be set to a
new TCPFlags object, or a string or number which can be converted
to a TCPFlags object by the TCPFlags()
constructor. Setting rec.tcpflags sets rec.initial_tcpflags
and rec.session_tcpflags to None. Setting
rec.initial_tcpflags or rec.session_tcpflags changes
rec.tcpflags to the binary OR of their values.
- rec.timeout_killed
- Whether the flow rec was closed early due to timeout by the
collector, a boolean. The default timeout_killed value is
False.
- rec.timeout_started
- Whether the flow rec is a continuation from a timed-out flow, a
boolean. The default timeout_started value is False.
- rec.typename
- (READ ONLY) The type name of the flow rec, a string. This value is
the second member of the tuple returned by the
"rec.classtype" attribute, which see.
- rec.uniform_packets
- Whether the flow rec contained only packets of the same size, a
boolean. The default uniform_packets value is False.
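The coupling between stime, etime, and duration described above can be modeled with plain datetime arithmetic. The sketch below does not touch RWRec; the two function names are hypothetical, and the range check reflects the stated maximum duration of datetime.timedelta(milliseconds=0xffffffff).

```python
from datetime import datetime, timedelta

# Largest duration the text says a record can hold.
MAX_DURATION = timedelta(milliseconds=0xFFFFFFFF)


def duration_after_etime_change(stime, new_etime):
    """Model setting etime: stime stays fixed and the duration is recomputed."""
    duration = new_etime - stime
    if duration < timedelta(0) or duration > MAX_DURATION:
        raise ValueError("duration out of range")
    return duration


def etime_after_stime_change(new_stime, duration):
    """Model setting stime: the duration stays fixed and etime moves with it."""
    return new_stime + duration


stime = datetime(2020, 1, 1, 0, 0, 0)
duration = duration_after_etime_change(stime, datetime(2020, 1, 1, 0, 0, 30))
shifted_etime = etime_after_stime_change(datetime(2020, 1, 2), duration)
```

The maximum works out to 4294967.295 seconds, which agrees with the limit given for duration_secs.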
Supported operations and methods:
- rec.is_icmp()
- Return True if the protocol of rec is 1 (ICMP) or if the
protocol of rec is 58 (ICMPv6) and
rec.is_ipv6() is True. Return
False otherwise.
- rec.is_ipv6()
- Return True if rec contains IPv6 addresses, False
otherwise.
- rec.is_web()
- Return True if rec can be represented as a web record,
False otherwise. A record can be represented as a web record if the
protocol is TCP (6) and either the source or destination port is one of
80, 443, or 8080.
- rec.as_dict()
- Return a dictionary representing the contents of rec. Implicitly
calls silk.site.init_site() with no arguments if
silk.site.have_site_config() returns
False.
- rec.to_ipv4()
- Return a new copy of rec with the IP addresses (sip, dip, and nhip)
converted to IPv4. If any of these addresses cannot be converted to IPv4
(that is, if any address is not in the ::ffff:0:0/96 prefix), return
None.
- rec.to_ipv6()
- Return a new copy of rec with the IP addresses (sip, dip, and nhip)
converted to IPv6. Specifically, the function maps the IPv4 addresses into
the ::ffff:0:0/96 prefix.
- str(rec)
- Return the string representation of
rec.as_dict().
- rec1 == rec2
- Return True if rec1 is structurally equivalent to
rec2. Return False otherwise.
- rec1 != rec2
- Return True if rec1 is not structurally equivalent to
rec2. Return False otherwise.
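The rules stated for is_icmp() and is_web() reduce to two small predicates. The functions below restate those rules on plain integers and booleans for illustration; they are not the PySiLK methods and take their inputs as arguments instead of reading them from an RWRec.

```python
# Ports that the text lists as making a TCP record a web record.
WEB_PORTS = frozenset((80, 443, 8080))


def is_icmp(protocol, ipv6):
    """True for ICMP (1), or for ICMPv6 (58) when the record is IPv6."""
    return protocol == 1 or (protocol == 58 and ipv6)


def is_web(protocol, sport, dport):
    """True for TCP (6) with either port in the web port set."""
    return protocol == 6 and (sport in WEB_PORTS or dport in WEB_PORTS)
```

Note that protocol 58 on a record that is not IPv6 does not count as ICMP under this rule.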
A SilkFile object represents a channel for writing to or reading from
SiLK Flow files. A SiLK file open for reading can be iterated over using
for rec in file.
Creation functions:
- silk.silkfile_open(filename, mode,
compression=DEFAULT, notes=[],
invocations=[])
- This function takes a filename, a mode, and a set of optional keyword
parameters. It returns a SilkFile object. The mode should be
one of the following constant values:
- silk.READ
- Open file for reading
- silk.WRITE
- Open file for writing
- silk.APPEND
- Open file for appending
The filename should be the path to the file to open. A few
filenames are treated specially. The filename stdin maps to the
standard input stream when the mode is READ. The filenames
stdout and stderr map to the standard output and standard
error streams respectively when the mode is WRITE. A filename
consisting of a single hyphen (-) maps to the standard input if the
mode is READ, and to the standard output if the mode is
WRITE.
The compression parameter may be one of the following
constants. (This list assumes SiLK was built with the required libraries. To
check which compression methods are available at your site, see
silk.get_configuration("COMPRESSION_METHODS")).
- silk.DEFAULT
- Use the default compression scheme compiled into SiLK.
- silk.NO_COMPRESSION
- Use no compression.
- silk.ZLIB
- Use zlib block compression (as used by gzip(1)).
- silk.LZO1X
- Use lzo1x block compression.
- silk.SNAPPY
- Use snappy block compression.
If notes or invocations are set, they should be lists
of strings. These add annotation and invocation headers to the file. These
values can be viewed with the rwfileinfo(1) program.
Examples:
>>> myinputfile = silkfile_open('/path/to/file', READ)
>>> myoutputfile = silkfile_open('/path/to/file', WRITE,
...                              compression=LZO1X,
...                              notes=['My output file',
...                                     'another annotation'])
- silk.silkfile_fdopen(fileno, mode,
filename=None,
compression=DEFAULT, notes=[],
invocations=[])
- This function takes an integer file descriptor, a mode, and a set of
optional keyword parameters. It returns a SilkFile object. The
filename parameter is used to set the value of the name
attribute of the resulting object. All other parameters work as described
in the silk.silkfile_open() function.
Deprecated constructor:
- class silk.SilkFile(filename, mode,
compression=DEFAULT, notes=[],
invocations=[])
- This constructor creates a SilkFile object. The parameters are
identical to those used by the silkfile_open()
function. This constructor is deprecated as of SiLK 3.0.0. For future
compatibility, please use the silkfile_open()
function instead of the SilkFile() constructor to
create SilkFile objects.
Instance attributes:
- file.name
- The filename that was used to create file.
- file.mode
- The mode that was used to create file. Valid values are
READ, WRITE, or APPEND.
Instance methods:
- file.read()
- Return an RWRec representing the next record in the SilkFile
file. If there are no records left in the file, return
None.
- file.write(rec)
- Write the RWRec rec to the SilkFile file.
Return None.
- file.next()
- A SilkFile object is its own iterator. For example,
iter(file) returns file. When the
SilkFile is used as an iterator, the next()
method is called repeatedly. This method returns the next record, or
raises StopIteration once the end of file is reached.
- file.skip(count)
- Skip the next count records in file and return the number of
records skipped. If the return value is less than count, the end of
the file has been reached. At end of file, return 0. Since SiLK
3.19.1.
- file.notes()
- Return the list of annotation headers for the file as a list of
strings.
- file.invocations()
- Return the list of invocation headers for the file as a list of
strings.
- file.close()
- Close the file and return None.
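Because a SilkFile opened for reading can be iterated with for rec in file, code that consumes records can be written against any iterable of record-like objects. The sketch below is a hypothetical helper, not part of PySiLK. It relies only on the protocol and bytes attributes described for RWRec, and it is exercised here with stand-in objects so that it runs without SiLK installed.

```python
from collections import Counter
from types import SimpleNamespace


def bytes_by_protocol(records):
    """Sum the bytes attribute per protocol over an iterable of records."""
    totals = Counter()
    for rec in records:
        totals[rec.protocol] += rec.bytes
    return totals


# Stand-in records with made-up values; real code would iterate a SilkFile.
fake_records = [
    SimpleNamespace(protocol=6, bytes=1500),
    SimpleNamespace(protocol=17, bytes=200),
    SimpleNamespace(protocol=6, bytes=500),
]
totals = bytes_by_protocol(fake_records)
```

With PySiLK available, passing the object returned by silk.silkfile_open('/path/to/file', silk.READ) in place of fake_records should work the same way, followed by a call to close().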
A PrefixMap object represents an immutable mapping from IP addresses or
protocol/port pairs to labels. PrefixMap objects are created from SiLK
prefix map files as created by rwpmapbuild(1).
- class silk.PrefixMap(filename)
- The constructor creates a prefix map initialized from the filename.
The PrefixMap object will be of one of the two subtypes of
PrefixMap: an AddressPrefixMap or a
ProtoPortPrefixMap.
Supported operations and methods:
- pmap[key]
- Return the string label associated with key in pmap.
key must be of the correct type: either an IPAddr if
pmap is an AddressPrefixMap, or a 2-tuple of integers
(protocol, port), if pmap is a
ProtoPortPrefixMap. The method raises TypeError when the
type of the key is incorrect.
- pmap.get(key, default=None)
- Return the string label associated with key in pmap. Return
the value default if key is not in pmap, or if
key is of the wrong type or value to be a key for pmap.
- pmap.values()
- Return a tuple of the labels defined by the PrefixMap
pmap.
- pmap.iterranges()
- Return an iterator that will iterate over ranges of contiguous values with
the same label. The return values of the iterator will be the 3-tuple
(start, end, label), where start
is the first element of the range, end is the last element of the
range, and label is the label for that range.
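The shape of the tuples produced by iterranges() can be illustrated without a prefix map file. The following is a pure-Python model, not the PySiLK implementation: the function name iterranges_model and the list-based input (where position i holds the label of integer key i) are assumptions made for the illustration.

```python
def iterranges_model(labels):
    """Yield (start, end, label) for runs of equal labels.

    labels[i] is taken to be the label of integer key i; this input
    format is an assumption of the sketch, not part of PySiLK.
    """
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of input or when the label changes.
        if i == len(labels) or labels[i] != labels[start]:
            yield (start, i - 1, labels[start])
            start = i


ranges = list(iterranges_model(["internal", "internal", "dmz", "internal"]))
```

Note that end is the last element of the range (inclusive), so a single-key range has start equal to end.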
A Bag object is a representation of a multiset. Each key represents a
potential element in the set, and the key's value represents the number of
times that key is in the set. As such, it is also a reasonable representation
of a mapping from keys to integers.
Please note, however, that despite its set-like properties,
Bag objects are not nearly as efficient as IPSet objects when
representing large contiguous ranges of key data.
In PySiLK, the Bag object is designed to look and act
like a Python dictionary object, and in many cases Bags and
dicts can be used interchangeably. There are differences, however,
the primary one being that bag[key] returns a
value for every key in the key range of the bag. That value is an
integer zero for every key that has not been incremented.
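The closest standard-library analogue to this zero-for-missing-keys behavior is collections.Counter. The sketch below uses only the standard library and makes no PySiLK calls; it is meant as an analogy for how a lookup on a Bag differs from a lookup on a plain dict, not as a description of how Bag is implemented.

```python
from collections import Counter

counts = Counter()
counts[80] += 3

untouched = counts[443]      # Counter reports 0 for a key never incremented

plain = {80: 3}
try:
    plain[443]
    dict_raised = False
except KeyError:
    dict_raised = True       # a plain dict raises KeyError instead
```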
- class silk.Bag(mapping=None,
key_type=None, key_len=None,
counter_type=None,
counter_len=None)
- The constructor creates a bag. All arguments are optional, and can be used
as keyword arguments.
If mapping is included, the bag is initialized from
that mapping. Valid mappings are:
- a Bag
- a key/value dictionary
- an iterable of key/value pairs
The key_type and key_len arguments describe the key
field of the bag. The key_type should be a string from the list of
valid types below. The key_len should be an integer describing the
number of bytes that will represent values of key_type. The
key_type argument is case-insensitive.
If key_type is not specified, it defaults to 'any-ipv6',
unless silk.ipv6_enabled() is False, in which
case the default is 'any-ipv4'. The one exception to this is when
key_type is not specified, but key_len is specified with a
value of less than 16. In this case, the default type is 'custom'.
Note: Key types that specify IPv6 addresses are not valid
if silk.ipv6_enabled() returns False. An error
will be thrown if they are used in this case.
If key_len is not specified, it defaults to the default
number of bytes for the given key_type (which can be determined by
the chart below). If specified, key_len must be one of the following
integers: 1, 2, 4, 16.
The counter_type and counter_len arguments describe
the counter value of the bag. The counter_type should be a string
from the list of valid types below. The counter_len should be an
integer describing the number of bytes that will represent values of
counter_type. The counter_type argument is
case-insensitive.
If counter_type is not specified, it defaults to
'custom'.
If counter_len is not specified, it defaults to 8.
Currently, 8 is the only valid value of counter_len.
Here is the list of valid key and counter types, along with their
default key_len values:
- 'sIPv4', 4
- 'dIPv4', 4
- 'sPort', 2
- 'dPort', 2
- 'protocol', 1
- 'packets', 4
- 'bytes', 4
- 'flags', 1
- 'sTime', 4
- 'duration', 4
- 'eTime', 4
- 'sensor', 2
- 'input', 2
- 'output', 2
- 'nhIPv4', 4
- 'initialFlags', 1
- 'sessionFlags', 1
- 'attributes', 1
- 'application', 2
- 'class', 1
- 'type', 1
- 'icmpTypeCode', 2
- 'sIPv6', 16
- 'dIPv6', 16
- 'nhIPv6', 16
- 'records', 4
- 'sum-packets', 4
- 'sum-bytes', 4
- 'sum-duration', 4
- 'any-ipv4', 4
- 'any-ipv6', 16
- 'any-port', 2
- 'any-snmp', 2
- 'any-time', 4
- 'custom', 4
Deprecation Notice: For compatibility with SiLK 2.x, the
key_type argument may be a Python class. An object of the
key_type class must be constructable from an integer, and it must
possess an __int__() method which retrieves that integer from the
object. Regardless of the maximum integer value supported by the
key_type class, internally the bag will store the keys as type
'custom' with length 4.
Other constructors, all class methods:
- silk.Bag.ipaddr(mapping,
counter_type=None,
counter_len=None)
- Creates a Bag using 'any-ipv6' as the key type (or 'any-ipv4' if
silk.ipv6_enabled() is False).
counter_type and counter_len are used as in the standard
Bag constructor. Equivalent to
Bag(mapping).
- silk.Bag.integer(mapping,
key_len=None, counter_type=None,
counter_len=None)
- Creates a Bag using 'custom' as the key_type (integer bag).
key_len, counter_type, and counter_len are used as in
the standard Bag constructor. Equivalent to
Bag(mapping, key_type='custom').
- silk.Bag.load(path,
key_type=None)
- Creates a Bag by reading a SiLK bag file. path must be a
valid location of a bag. When present, the key_type argument is
used as in the Bag constructor, ignoring the key type specified in
the bag file. When key_type is not provided and the bag file does
not contain type information, the key is set to 'custom' with a length of
4.
- silk.Bag.load_ipaddr(path)
- Creates an IP address bag from a SiLK bag file. Equivalent to
Bag.load(path, key_type=IPv4Addr).
This constructor is deprecated as of SiLK 3.2.0.
- silk.Bag.load_integer(path)
- Creates an integer bag from a SiLK bag file. Equivalent to
Bag.load(path, key_type=int).
This constructor is deprecated as of SiLK 3.2.0.
Constants:
- silk.BAG_COUNTER_MAX
- This constant contains the maximum possible value for Bag counters.
Other class methods:
- silk.Bag.field_types()
- Returns a tuple of strings which are valid key_type or
counter_type values.
- silk.Bag.type_merge(type_a,
type_b)
- Given two types from Bag.field_types(), returns the
type that would be given (by default) to a bag that is a result of the
co-mingling of two bags of the given types. For example:
Bag.type_merge('sport','dport') == 'any-port'.
Supported operations and methods:
In the lists of operations and methods below,
- bag and bag2 are Bag objects
- key and key2 are IPAddrs for bags that contain IP
addresses, or integers for other bags
- value and value2 are integers which represent the counter
associated with a key in the bag
- ipset is an IPSet object
- ipwildcard is an IPWildcard object
The following operations and methods do not modify the
Bag:
- bag.get_info()
- Return information about the keys and counters of the bag. The return
value is a dictionary with the following keys and values:
- 'key_type'
- The current key type, as a string.
- 'key_len'
- The current key length in bytes.
- 'counter_type'
- The current counter type, as a string.
- 'counter_len'
- The current counter length in bytes.
The keys have the same names as the keyword arguments to the bag
constructor. As a result, a bag with the same key and value information as
an existing bag can be generated by using the following idiom:
Bag(**bag.get_info()).
- bag.copy()
- Return a new Bag which is a copy of bag.
- bag[key]
- Return the counter value associated with key in bag.
- bag[key:key2] or
bag[key,key2,...]
- Return a new Bag which contains only the elements in the key range
[key, key2), or a new Bag containing only the
elements given in the comma-separated list. More generally, the
brackets may hold any number of comma-separated keys or key ranges. For
example: bag[1,5,15:18,20] will return a bag
which contains the elements 1, 5, 15, 16, 17, and 20 from bag.
- bag[ipset]
- Return a new Bag which contains only elements in bag that
are also contained in ipset. This is only valid for IP address
bags. The ipset can be included as part of a comma-separated list
of slices, as above.
- bag[ipwildcard]
- Return a new Bag which contains only elements that are also
contained in ipwildcard. This is only valid for IP address bags.
The ipwildcard can be included as part of a comma-separated list of
slices, as above.
- key in bag
- Return True if bag[key] is non-zero, False
otherwise.
- bag.get(key, default=None)
- Return bag[key] if key is in bag, otherwise
return default.
- bag.items()
- Return a list of (key, value)
pairs for all keys in bag with non-zero values. This list is not
guaranteed to be sorted in any order.
- bag.iteritems()
- Return an iterator over
(key, value) pairs
for all keys in bag with non-zero values. This iterator is not
guaranteed to iterate over items in any order.
- bag.sorted_iter()
- Return an iterator over
(key, value) pairs
for all keys in bag with non-zero values. This iterator is
guaranteed to iterate over items in key-sorted order.
- bag.keys()
- Return a list of keys for all keys in bag with non-zero
values. This list is guaranteed to be in key-sorted order.
- bag.iterkeys()
- Return an iterator over keys for all keys in bag with
non-zero values. This iterator is not guaranteed to iterate over keys in
any order.
- bag.values()
- Return a list of values for all keys in bag with non-zero
values. The list is guaranteed to be in key-sorted order.
- bag.itervalues()
- Return an iterator over values for all keys in bag with
non-zero values. This iterator is not guaranteed to iterate over values in
any order, but the order is consistent with that returned by
iterkeys().
- bag.group_iterator(bag2)
- Return an iterator over keys and values of a pair of Bags. For each
key which is in either bag or bag2, this iterator
will return a
(key, value, value2)
triple, where value is bag.get(key), and
value2 is bag2.get(key). This iterator is guaranteed
to iterate over triples in key order.
- bag + bag2
- Add two bags together. Return a new Bag for which
newbag[key] = bag[key] + bag2[key]
for all keys in bag and bag2. Will raise an
OverflowError if the resulting value for a key is greater than
BAG_COUNTER_MAX. If the two bags are of different types, the
resulting bag will be of a type determined by
Bag.type_merge().
- bag - bag2
- Subtract two bags. Return a new Bag for which
newbag[key] = bag[key] - bag2[key]
for all keys in bag and bag2, as long as the resulting value
for that key would be non-negative. If the resulting value for a key would
be negative, the value of that key will be zero. If the two bags are of
different types, the resulting bag will be of a type determined by
Bag.type_merge().
- bag.min(bag2)
- Return a new Bag for which
newbag[key] = min(bag[key], bag2[key])
for all keys in bag and bag2.
- bag.max(bag2)
- Return a new Bag for which
newbag[key] = max(bag[key], bag2[key])
for all keys in bag and bag2.
- bag.div(bag2)
- Divide two bags. Return a new Bag for which
newbag[key] = bag[key] / bag2[key],
rounded to the nearest integer, for all keys in bag and bag2,
as long as bag2[key] is non-zero;
newbag[key] = 0 when bag2[key] is zero.
If the two bags are of different types,
the resulting bag will be of a type determined by
Bag.type_merge().
- bag * integer
- integer * bag
- Multiply a bag by a scalar. Return a new Bag for which
newbag[key] = bag[key] * integer
for all keys in bag.
- bag.intersect(set_like)
- Return a new Bag which contains bag[key] for each
key where
key in set_like
is true. set_like is any argument that supports Python's in
operator, including Bags, IPSets, IPWildcards, and
Python sets, lists, tuples, et cetera.
- bag.complement_intersect(set_like)
- Return a new Bag which contains bag[key] for each
key where
key in set_like
is not true.
- bag.ipset()
- Return an IPSet consisting of the set of IP address key values from
bag with non-zero values. This only works if bag is an IP
address bag.
- bag.inversion()
- Return a new integer Bag for which all values from bag are
inserted as key elements. Hence, if two keys in bag have a value of
5, newbag[5] will be equal to two.
- bag == bag2
- Return True if the contents of bag are equivalent to the
contents of bag2, False otherwise.
- bag != bag2
- Return False if the contents of bag are equivalent to the
contents of bag2, True otherwise.
- bag.save(filename, compression=DEFAULT)
- Save the contents of bag in the file filename. The
compression determines the compression method used when outputting
the file. Valid values are the same as those in
silk.silkfile_open().
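Several of the non-modifying operations above are easier to follow with concrete numbers. The sketch below models three of them (subtraction, inversion(), and group_iterator()) on ordinary Python dicts. It is not PySiLK code: the function names are hypothetical, dicts stand in for Bags, and the choice to leave zero counters out of the inversion is an assumption based on the descriptions referring to keys with non-zero values.

```python
def subtract(a, b):
    """Model of bag - bag2: results that would be negative become zero."""
    keys = set(a) | set(b)
    return {k: max(a.get(k, 0) - b.get(k, 0), 0) for k in keys}


def inversion(a):
    """Model of bag.inversion(): count how many keys share each value."""
    result = {}
    for value in a.values():
        if value:  # assumption: only non-zero counters take part
            result[value] = result.get(value, 0) + 1
    return result


def group_iterator(a, b):
    """Model of bag.group_iterator(bag2): key-ordered (key, value, value2)."""
    for key in sorted(set(a) | set(b)):
        # get() returns None for a key that is absent from one side.
        yield key, a.get(key), b.get(key)


bag = {1: 5, 2: 5, 3: 1}
bag2 = {2: 7, 4: 2}

difference = subtract(bag, bag2)
inverted = inversion(bag)
triples = list(group_iterator(bag, bag2))
```

In the model, key 2 would go negative (5 - 7) and is therefore clamped to zero, and the inversion reports that two keys hold the value 5.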
The following operations and methods will modify the
Bag:
- bag.clear()
- Empty bag, such that bag[key] is zero for all
keys.
- bag[key] = value
- Set the number of key in bag to value.
- del bag[key]
- Remove key from bag, such that bag[key] is
zero.
- bag.update(mapping)
- For each key in mapping, set the value for that key in bag
to the mapping's value. Valid mappings are those accepted by the
Bag() constructor.
- bag.add(key[, key2[, ...]])
- Add one of each key to bag. This is the same as incrementing
the value for each key by one.
- bag.add(iterable)
- Add one of each key in iterable to bag. This is the
same as incrementing the value for each key by one.
- bag.remove(key[, key2[, ...]])
- Remove one of each key from bag. This is the same as
decrementing the value for each key by one.
- bag.remove(iterable)
- Remove one of each key in iterable from bag. This is
the same as decrementing the value for each key by one.
- bag.incr(key, value = 1)
- Increment the number of key in bag by value.
value defaults to one.
- bag.decr(key, value = 1)
- Decrement the number of key in bag by value.
value defaults to one.
- bag += bag2
- Equivalent to bag = bag + bag2, unless an
OverflowError is raised, in which case bag is no longer
necessarily valid. When an error is not raised, this operation takes less
memory than bag = bag + bag2. This operation can
change the type of bag, as determined by
Bag.type_merge().
- bag -= bag2
- Equivalent to bag = bag - bag2. This operation takes
less memory than bag = bag - bag2. This operation can
change the type of bag, as determined by
Bag.type_merge().
- bag *= integer
- Equivalent to bag = bag * integer, unless an
OverflowError is raised, in which case bag is no longer
necessarily valid. When an error is not raised, this operation takes less
memory than bag = bag * integer.
- bag.constrain_values(min=None,
max=None)
- Remove key from bag if that key's value is less than
min or greater than max. At least one of min or
max must be specified.
- bag.constrain_keys(min=None,
max=None)
- Remove key from bag if that key is less than min, or
greater than max. At least one of min or max must be
specified.
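The two constrain methods differ only in whether the bounds are compared against the counter or against the key. A minimal model on a plain dict, assuming integer keys, is sketched below. The function names mirror the method names for readability, but this is not PySiLK code, and the ValueError raised when neither bound is given is an assumption (the text above does not name the exception).

```python
def constrain_values(bag, min=None, max=None):
    """Drop keys whose counter lies outside [min, max] (model only)."""
    if min is None and max is None:
        raise ValueError("at least one of min or max must be specified")
    for key in list(bag):
        value = bag[key]
        if (min is not None and value < min) or \
           (max is not None and value > max):
            del bag[key]


def constrain_keys(bag, min=None, max=None):
    """Drop keys that themselves lie outside [min, max] (model only)."""
    if min is None and max is None:
        raise ValueError("at least one of min or max must be specified")
    for key in list(bag):
        if (min is not None and key < min) or \
           (max is not None and key > max):
            del bag[key]


by_value = {22: 1, 80: 40, 443: 900}
constrain_values(by_value, min=10, max=100)   # keeps counters in 10..100

by_key = {22: 1, 80: 40, 443: 900}
constrain_keys(by_key, max=100)               # keeps keys up to 100
```

Both functions modify the mapping in place, matching the placement of these methods in the list of operations that modify the Bag.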
A TCPFlags object represents the eight bits of flags from a TCP session.
- class silk.TCPFlags(value)
- The constructor takes either a TCPFlags value, a string, or an
integer. If a TCPFlags value, it returns a copy of that value. If
an integer, the integer should represent the 8-bit representation of the
flags. If a string, the string should consist of a concatenation of zero
or more of the characters "F", "S", "R", "P", "A", "U", "E", and "C"
(in upper or lower case), representing
the FIN, SYN, RST, PSH, ACK, URG, ECE, and CWR flags. Spaces in the string
are ignored.
Examples:
>>> a = TCPFlags('SA')
>>> b = TCPFlags(5)
Instance attributes (read-only):
- flags.fin
- True if the FIN flag is set on flags, False
otherwise
- flags.syn
- True if the SYN flag is set on flags, False
otherwise
- flags.rst
- True if the RST flag is set on flags, False
otherwise
- flags.psh
- True if the PSH flag is set on flags, False
otherwise
- flags.ack
- True if the ACK flag is set on flags, False
otherwise
- flags.urg
- True if the URG flag is set on flags, False
otherwise
- flags.ece
- True if the ECE flag is set on flags, False
otherwise
- flags.cwr
- True if the CWR flag is set on flags, False
otherwise
Supported operations and methods:
- ~flags
- Return the bitwise inversion (not) of flags
- flags1 & flags2
- Return the bitwise intersection (and) of the flags from flags1 and
flags2
- flags1 | flags2
- Return the bitwise union (or) of the flags from flags1 and
flags2.
- flags1 ^ flags2
- Return the bitwise exclusive disjunction (xor) of the flags from
flags1 and flags2.
- int(flags)
- Return the integer value of the flags set in flags.
- str(flags)
- Return a string representation of the flags set in flags.
- flags.padded()
- Return a string representation of the flags set in flags. This
representation will be padded with spaces such that flags will line up if
printed above each other.
- flags
- When used in a setting that expects a boolean, return True if any
flag value is set in flags. Return False otherwise.
- flags.matches(flagmask)
- Given flagmask, a string of the form
high_flags/mask_flags, return True if the flags of
flags match high_flags after being masked with
mask_flags; False otherwise. Given a flagmask without
the slash ("/"), return True if
all bits in flagmask are set in flags. I.e., a
flagmask without a slash is interpreted as
"flagmask/flagmask".
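The high_flags/mask_flags test performed by matches() can be written out with plain integers. The sketch below is a model, not the PySiLK implementation: parse_flags and matches are hypothetical helper names, and the bit values are the standard positions of the flags in the TCP header (FIN is the lowest bit, CWR the highest).

```python
# Standard TCP header bit for each flag letter.
FLAG_BITS = {"F": 0x01, "S": 0x02, "R": 0x04, "P": 0x08,
             "A": 0x10, "U": 0x20, "E": 0x40, "C": 0x80}


def parse_flags(text):
    """Turn a string such as 'SA' into its 8-bit integer value."""
    value = 0
    for ch in text.upper():
        if ch == " ":
            continue  # spaces in the string are ignored
        value |= FLAG_BITS[ch]
    return value


def matches(flags, flagmask):
    """Model of flags.matches(flagmask) on an integer flags value."""
    if "/" in flagmask:
        high_text, mask_text = flagmask.split("/", 1)
    else:
        # Without a slash, the string serves as both high and mask.
        high_text = mask_text = flagmask
    high = parse_flags(high_text)
    mask = parse_flags(mask_text)
    return (flags & mask) == high


syn_ack = parse_flags("SA")
syn_only = parse_flags("s")

syn_without_ack = matches(syn_only, "S/SA")   # SYN set, ACK clear
synack_vs_mask = matches(syn_ack, "S/SA")     # ACK is set, so no match
synack_has_syn = matches(syn_ack, "S")        # same as "S/S"
```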
Constants:
The following constants are defined:
- silk.TCP_FIN
- A TCPFlags value with only the FIN flag set
- silk.TCP_SYN
- A TCPFlags value with only the SYN flag set
- silk.TCP_RST
- A TCPFlags value with only the RST flag set
- silk.TCP_PSH
- A TCPFlags value with only the PSH flag set
- silk.TCP_ACK
- A TCPFlags value with only the ACK flag set
- silk.TCP_URG
- A TCPFlags value with only the URG flag set
- silk.TCP_ECE
- A TCPFlags value with only the ECE flag set
- silk.TCP_CWR
- A TCPFlags value with only the CWR flag set
An FGlob object is an iterable object which iterates over filenames from
a SiLK data store. It does this internally by calling the
rwfglob(1) program. The FGlob object assumes that the
rwfglob program is in the PATH, and will raise an exception when used
if it is not.
Note: It is generally better to use the
silk.site.repository_iter() function from the
"silk.site Module" instead of the FGlob object, as that
function does not require the external rwfglob program. However, the
FGlob constructor allows you to use a different site configuration
file every time, whereas the silk.site.init_site()
function only supports a single site configuration file.
- class silk.FGlob(classname=None,
type=None, sensors=None,
start_date=None, end_date=None,
data_rootdir=None,
site_config_file=None)
- Although all arguments have defaults, at least one of classname,
type, sensors, start_date must be specified. The
arguments are:
- classname
- if given, should be a string representing the class name. If not given,
defaults based on the site configuration file,
silk.conf(5).
- type
- if given, can be either a string representing a type name or
comma-separated list of type names, or can be a list of strings
representing type names. If not given, defaults based on the site
configuration file, silk.conf.
- sensors
- if given, should be either a string representing a comma-separated list of
sensor names or IDs, an integer representing a sensor ID, or a list of
strings or integers representing sensor names or IDs. If not given,
defaults to all sensors.
- start_date
- if given, should be either a string in the format
"YYYY/MM/DD[:HH]", a date object, a
datetime object (which will be used to the precision of one hour), or a
time object (which is used for the given hour on the current date). If not
given, defaults to start of current day.
- end_date
- if given, should be either a string in the format
"YYYY/MM/DD[:HH]", a date object, a
datetime object (which will be used to the precision of one hour), or a
time object (which is used for the given hour on the current date). If not
given, defaults to start_date. The end_date cannot be
specified without a start_date.
- data_rootdir
- if given, should be a string representing the directory in which to find
the packed SiLK data files. If not given, defaults to the value in the
SILK_DATA_ROOTDIR environment variable or the compiled-in default
(/data).
- site_config_file
- if given, should be a string representing the path of the site
configuration file, silk.conf. If not given, defaults to the value
in the SILK_CONFIG_FILE environment variable or
$SILK_DATA_ROOTDIR/silk.conf.
An FGlob object can be used as a standard iterator. For
example:
for filename in FGlob(classname="all", start_date="2005/09/22"):
    for rec in silkfile_open(filename, READ):
        ...
The silk.site module contains functions that load the SiLK site file, and
query information from that file.
- silk.site.init_site(siteconf=None,
rootdir=None)
- Initializes the SiLK system's site configuration. The siteconf
parameter, if given, should be the path and name of a SiLK site
configuration file (see silk.conf(5)). If
siteconf is omitted, the value specified in the environment
variable SILK_CONFIG_FILE will be used as the name of the configuration
file. If SILK_CONFIG_FILE is not set, the module looks for a file named
silk.conf in the following directories: the directory specified by
the rootdir argument; the directory specified in the
SILK_DATA_ROOTDIR environment variable; the data root directory that is
compiled into SiLK (/data); and the directories
$SILK_PATH/share/silk/ and
$SILK_PATH/share/.
The rootdir parameter, if given, should be the path to
a SiLK data repository that has a configuration matching the SiLK site
configuration. If rootdir is omitted, the value specified in the
SILK_DATA_ROOTDIR environment variable will be used, or if that variable
is not set, the data root directory that is compiled into SiLK (/data).
The rootdir may be specified without a siteconf argument
by using rootdir as a keyword argument. I.e.,
init_site(rootdir="/data").
This function should not generally be called explicitly unless
one wishes to use a non-default site configuration file.
The init_site() function can only be
called successfully once. The return value of
init_site() will be True if the site configuration
was successful, or False if a site configuration file was not
found. If a siteconf parameter was specified but not found, or if
a site configuration file was found but did not parse properly, an
exception will be raised instead. Once
init_site() has been successfully invoked,
silk.site.have_site_config() will return
True, and subsequent invocations of
init_site() will raise a RuntimeError
exception.
Some silk.site methods and RWRec members require
information from the silk.conf file, and when these methods are
called or members accessed, the
silk.site.init_site() function is implicitly
invoked with no arguments if it has not yet been called successfully.
The functions, methods, and attributes that exhibit this
behavior include: silk.site.sensors(),
silk.site.classtypes(),
silk.site.classes(),
silk.site.types(),
silk.site.default_types(),
silk.site.default_class(),
silk.site.class_sensors(),
silk.site.sensor_id(),
silk.site.sensor_from_id(),
silk.site.classtype_id(),
silk.site.classtype_from_id(),
silk.site.set_data_rootdir(),
silk.site.repository_iter(),
silk.site.repository_silkfile_iter(),
silk.site.repository_full_iter(),
rwrec.as_dict(),
rwrec.classname,
rwrec.typename,
rwrec.classtype, and
rwrec.sensor.
- silk.site.have_site_config()
- Return True if silk.site.init_site() has been
called and was able to successfully find and load a SiLK configuration
file, False otherwise.
- silk.site.set_data_rootdir(rootdir)
- Change the current SiLK data root directory once the silk.conf file
has been loaded. This function can be used to change the directory used by
the silk.site iterator functions. To change the SiLK data root
directory before loading the silk.conf file, call
silk.site.init_site() with a rootdir argument.
set_data_rootdir() implicitly calls
silk.site.init_site() with no arguments before
changing the root directory if
silk.site.have_site_config() returns
False.
- silk.site.get_site_config()
- Return the current path to the SiLK site configuration file. Before
silk.site.init_site() is called successfully, this
returns the location where init_site(), called with no
arguments, will first look for a configuration file. After
init_site() has been successfully called, this
returns the path to the file that init_site()
loaded.
- silk.site.get_data_rootdir()
- Return the current SiLK data root directory.
- silk.site.sensors()
- Return a tuple of valid sensor names. Implicitly calls
silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
Returns an empty tuple if no site file is available.
- silk.site.classes()
- Return a tuple of valid class names. Implicitly calls
silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
Returns an empty tuple if no site file is available.
- silk.site.types(class)
- Return a tuple of valid type names for class class. Implicitly
calls silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
Throws KeyError if no site file is available or if class is
not a valid class.
- silk.site.classtypes()
- Return a tuple of valid (class name, type name) tuples. Implicitly calls
silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
Returns an empty tuple if no site file is available.
- silk.site.default_class()
- Return the default class name. Implicitly calls
silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
Returns None if no site file is available.
- silk.site.default_types(class)
- Return a tuple of default types associated with class class.
Implicitly calls silk.site.init_site() with no
arguments if silk.site.have_site_config() returns
False. Throws KeyError if no site file is available or if
class is not a valid class.
- silk.site.class_sensors(class)
- Return a tuple of sensors that are in class class. Implicitly calls
silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
Throws KeyError if no site file is available or if class is
not a valid class.
- silk.site.sensor_classes(sensor)
- Return a tuple of classes that are associated with sensor.
Implicitly calls silk.site.init_site() with no
arguments if silk.site.have_site_config() returns
False. Throws KeyError if no site file is available or if
sensor is not a valid sensor.
- silk.site.sensor_description(sensor)
- Return the sensor description as a string, or None if there is no
description. Implicitly calls silk.site.init_site()
with no arguments if silk.site.have_site_config()
returns False. Throws KeyError if no site file is available
or if sensor is not a valid sensor.
- silk.site.sensor_id(sensor)
- Return the numeric sensor ID associated with the string sensor.
Implicitly calls silk.site.init_site() with no
arguments if silk.site.have_site_config() returns
False. Throws KeyError if no site file is available or if
sensor is not a valid sensor.
- silk.site.sensor_from_id(id)
- Return the sensor name associated with the numeric sensor ID id.
Implicitly calls silk.site.init_site() with no
arguments if silk.site.have_site_config() returns
False. Throws KeyError if no site file is available or if
id is not a valid sensor identifier.
- silk.site.classtype_id((class, type))
- Return the numeric ID associated with the tuple (class,
type). Implicitly calls silk.site.init_site()
with no arguments if silk.site.have_site_config()
returns False. Throws KeyError if no site file is available,
if class is not a valid class, or if type is not a valid
type in class.
- silk.site.classtype_from_id(id)
- Return the (class, type) name pair associated with the
numeric ID id. Implicitly calls
silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
Throws KeyError if no site file is available or if id is not
a valid identifier.
- silk.site.repository_iter(start=None,
end=None, classname=None,
types=None, classtypes=None,
sensors=None)
- Return an iterator over file names in a SiLK repository. The repository is
assumed to be in the data root directory that is returned by
silk.site.get_data_rootdir() and to conform to the
format of the current site configuration. This function implicitly calls
silk.site.init_site() with no arguments if
silk.site.have_site_config() returns False.
See also silk.site.repository_full_iter() and
silk.site.repository_silkfile_iter().
The following types are accepted for start and
end:
- a datetime.datetime object, which is considered to be specified to
hour precision
- a datetime.date object, which is considered to be specified to day
precision
- a string in the SiLK date format
"YYYY/MM/DD[:HH]", where the timezone
depends on how SiLK was compiled; check the value of
silk.get_configuration("TIMEZONE_SUPPORT").
The rules for interpreting start and end are:
- When both start and end are specified to hour precision,
files from all hours within that time range are returned.
- When start is specified to day precision, the hour specified in
end (if any) is ignored, and files for all dates between midnight
at start and the end of the day represented by end are
returned.
- When end is not specified and start is specified to day
precision, files for that complete day are returned.
- When end is not specified and start is specified to hour
precision, files for that single hour are returned.
- When neither start nor end is specified, files for the
current day are returned.
- It is an error to specify end without start, or to give an
end that precedes start.
To specify classes and types, either use the classname and
types parameters or use the classtypes parameter. It is an
error to use classname or types when classtypes is
specified.
The classname parameter should be a named class that
appears in silk.site.classes(). If neither
classname nor classtypes is specified, classname will
default to that returned by
silk.site.default_class().
The types parameter should be either a named type that
appears in silk.site.types(classname) or a sequence of
said named types. If neither types nor classtypes is
specified, types will default to
silk.site.default_types(classname).
The classtypes parameter should be a sequence of
(classname, type) pairs. These pairs must be in the sequence
returned by silk.site.classtypes().
The sensors parameter should be either a sensor name or a
sequence of sensor names from the sequence returned by
silk.site.sensors(). If sensors is left
unspecified, it will default to the list of sensors supported by the given
class(es).
- silk.site.repository_silkfile_iter(start=None,
end=None, classname=None,
types=None, classtypes=None,
sensors=None)
- Works similarly to silk.site.repository_iter() except
the file names that repository_iter() would return
are opened as SilkFile objects and returned.
- silk.site.repository_full_iter(start=None,
end=None, classname=None,
types=None, classtypes=None,
sensors=None)
- Works similarly to silk.site.repository_iter().
Unlike repository_iter(), this iterator's output will
include the names of files that do not exist in the repository. The
iterator returns (filename, bool) pairs where the
bool value represents whether the given filename exists. For
more information, see the description of the --print-missing-files
switch in rwfglob(1).
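The rules given above for interpreting the start and end arguments of the repository iterators can be restated as a small function that maps the arguments to the first and last hour covered. This is a standard-library model, not PySiLK code: the name hour_span and the now parameter are hypothetical, only date and datetime arguments are handled (not the string form), and the combination of an hour-precision start with a day-precision end is left unimplemented because the rules do not describe it.

```python
import datetime as dt


def hour_span(start=None, end=None, now=None):
    """Return (first_hour, last_hour) implied by start and end (model only)."""
    now = now or dt.datetime.now()

    def is_hour(value):
        # datetime objects carry hour precision; date objects carry day precision.
        return isinstance(value, dt.datetime)

    def day_start(value):
        return dt.datetime(value.year, value.month, value.day, 0)

    def day_end(value):
        return dt.datetime(value.year, value.month, value.day, 23)

    def hour(value):
        return value.replace(minute=0, second=0, microsecond=0)

    if start is None:
        if end is not None:
            raise ValueError("end specified without start")
        return day_start(now), day_end(now)      # the current day
    if end is None:
        if is_hour(start):
            return hour(start), hour(start)      # that single hour
        return day_start(start), day_end(start)  # that complete day
    if not is_hour(start):
        # Day-precision start: any hour in end is ignored.
        first, last = day_start(start), day_end(end)
    elif is_hour(end):
        first, last = hour(start), hour(end)
    else:
        raise NotImplementedError("combination not covered by the rules")
    if last < first:
        raise ValueError("end precedes start")
    return first, last


whole_day = hour_span(dt.date(2005, 9, 22))
one_hour = hour_span(dt.datetime(2005, 9, 22, 13, 45))
day_to_day = hour_span(dt.date(2005, 9, 22), dt.datetime(2005, 9, 23, 5))
```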
silk.plugin is a module to support using PySiLK code as a plug-in to the
rwfilter(1), rwcut(1),
rwgroup(1), rwsort(1),
rwstats(1), and rwuniq(1)
applications. The module defines the following methods, which are described in
the silkpython(3) manual page:
- silk.plugin.register_switch(switch_name,
handler=handler, [arg=needs_arg],
[help=help_string])
- Define the command line switch --switch_name
that can be used by the PySiLK plug-in.
- silk.plugin.register_filter(filter,
[finalize=finalize],
[initialize=initialize])
- Register the callback function filter that can be used by
rwfilter to specify whether the flow record passes or fails.
- silk.plugin.register_field(field_name,
[add_rec_to_bin=add_rec_to_bin,]
[bin_compare=bin_compare,]
[bin_bytes=bin_bytes,]
[bin_merge=bin_merge,]
[bin_to_text=bin_to_text,]
[column_width=column_width,]
[description=description,]
[initial_value=initial_value,]
[initialize=initialize,]
[rec_to_bin=rec_to_bin,]
[rec_to_text=rec_to_text])
- Define the new key field or aggregate value field named field_name.
Key fields can be used in rwcut, rwgroup, rwsort,
rwstats, and rwuniq. Aggregate value fields can be used in
rwstats and rwuniq. Creating a field requires specifying one
or more callback functions; the functions required depend on the
application(s) where the field will be used. To simplify field creation
for common field types, the remaining functions can be used instead.
- silk.plugin.register_int_field(field_name,
int_function, min, max,
[width])
- Create the key field field_name whose value is an unsigned
integer.
- silk.plugin.register_ipv4_field(field_name,
ipv4_function, [width])
- Create the key field field_name whose value is an IPv4
address.
- silk.plugin.register_ip_field(field_name,
ip_function, [width])
- Create the key field field_name whose value is an IPv4 or IPv6
address.
- silk.plugin.register_enum_field(field_name,
enum_function, width,
[ordering])
- Create the key field field_name whose value is a Python object
(often a string).
- silk.plugin.register_int_sum_aggregator(agg_value_name,
int_function, [max_sum],
[width])
- Create the aggregate value field agg_value_name that maintains a
running sum as an unsigned integer.
- silk.plugin.register_int_max_aggregator(agg_value_name,
int_function, [max_max],
[width])
- Create the aggregate value field agg_value_name that maintains the
maximum unsigned integer value.
- silk.plugin.register_int_min_aggregator(agg_value_name,
int_function, [max_min],
[width])
- Create the aggregate value field agg_value_name that maintains the
minimum unsigned integer value.
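The sketch below shows how callbacks for some of these registration functions might be written and exercised outside a SiLK tool. The argument order of each register call follows the signatures listed above. The rest is assumption: that each callback receives a single record argument, that a true return value from the filter means the record passes, and the field names themselves. See silkpython(3) for the authoritative callback contracts. FakeRec is a hypothetical stand-in, not a real RWRec.

```python
from collections import namedtuple


def bytes_per_packet(rec):
    """Return the integer average bytes per packet for a record."""
    if rec.packets == 0:
        return 0
    return rec.bytes // rec.packets


def is_large_flow(rec):
    """Filter callback: true for records over 3 packets and 120 bytes."""
    return rec.packets > 3 and rec.bytes > 120


def register_all(plugin_module):
    """Register the callbacks; plugin_module is expected to be silk.plugin."""
    plugin_module.register_filter(is_large_flow)
    # Argument order: field_name, int_function, min, max.
    plugin_module.register_int_field("bytes_per_pkt", bytes_per_packet,
                                     0, 0xFFFFFFFF)
    # Argument order: agg_value_name, int_function.
    plugin_module.register_int_sum_aggregator("sum_bytes_per_pkt",
                                              bytes_per_packet)


# Exercise the callbacks with a stand-in record (not a real RWRec).
FakeRec = namedtuple("FakeRec", "packets bytes")
print(bytes_per_packet(FakeRec(packets=4, bytes=600)))
print(is_large_flow(FakeRec(packets=2, bytes=600)))
```

Keeping the callbacks as plain functions of a record lets them be tested without a SiLK installation; register_all() would only be called from a plug-in file loaded by one of the SiLK tools.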
The following is an example using the PySiLK bindings. The code is meant to show
some standard PySiLK techniques, but is not otherwise meant to be useful.
The code reads each record in a SiLK flow file, checks whether the
record's source port is 80/tcp or 8080/tcp and its volume is larger than 3
packets and 120 bytes, stores the destination IP of matching records in an
IPset, and writes the IPset to a destination file. In addition, it prints
the number of unique destination addresses and the addresses themselves to
the standard output. Additional explanations can be found in-line in the
comments.
#! /usr/bin/python
# Use print functions (Compatible with Python 3.0; Requires 2.6+)
from __future__ import print_function #Python2.6 or later required
# Import the PySiLK bindings
from silk import *
# Import sys for the command line arguments.
import sys

# Main function
def main():
    if len(sys.argv) != 3:
        print("Usage: %s infile outset" % sys.argv[0])
        sys.exit(1)
    # Open a silk flow file for reading
    infile = silkfile_open(sys.argv[1], READ)
    # Create an empty IPset
    destset = IPSet()
    # Loop over the records in the file
    for rec in infile:
        # Do comparisons based on rwrec field values
        if (rec.protocol == 6 and rec.sport in [80, 8080] and
                rec.packets > 3 and rec.bytes > 120):
            # Add the dest IP of the record to the IPset
            destset.add(rec.dip)
    # Save the IPset for future use
    try:
        destset.save(sys.argv[2])
    except:
        sys.exit("Unable to write to %s" % sys.argv[2])
    # count the items in the set
    count = 0
    for addr in destset:
        count = count + 1
    print("%d addresses" % count)
    # Another way to do the same
    print("%d addresses" % len(destset))
    # Print the ip blocks in the set
    for base_prefix in destset.cidr_iter():
        print("%s/%d" % base_prefix)

# Call the main() function when this program is started
if __name__ == '__main__':
    main()
Normally, SiLK flow records are stamped with a class as they are recorded
in the repository. However, if you are importing raw packet data or need to
correct records that inadvertently carry the wrong class/type, PySiLK makes
the fix easy.
The example below sets the class to "all" and assigns a
type of "in", "inweb", "out", or
"outweb" to each record in an input file. The direction (in or
out) is defined by an IPset that represents the internal network (traffic
that neither comes from nor goes to the internal network is discarded in
this example). Web/non-web flows are separated based on port.
#! /usr/bin/python
from __future__ import print_function #Python2.6 or later required
from silk import *
import silk.site
import sys # for command line args

webports = (80,443,8080)
inwebtype = ("all","inweb")
intype = ("all","in")
outwebtype = ("all","outweb")
outtype = ("all","out")

def main():
    if len(sys.argv) != 4:
        print("Usage: %s infile setfile outfile" % sys.argv[0])
        sys.exit(1)
    # open the SiLK file for reading
    infile = silkfile_open(sys.argv[1], READ)
    # load the IPset that represents my internal network
    setfile = IPSet.load(sys.argv[2])
    # open the modified output file
    outfile = silkfile_open(sys.argv[3], WRITE)
    # loop over the records in the file, set the class/type, and write
    # the updated record:
    for rec in infile:
        #
        # If the src ip is in the set, it's going out.
        # If the dst ip is in the set, it's coming in.
        # If neither IP is in the set, discard the record.
        #
        if (rec.sport in webports) or (rec.dport in webports):
            if rec.sip in setfile:
                rec.classtype = outwebtype
                outfile.write(rec)
            elif rec.dip in setfile:
                rec.classtype = inwebtype
                outfile.write(rec)
        else:
            if rec.sip in setfile:
                rec.classtype = outtype
                outfile.write(rec)
            elif rec.dip in setfile:
                rec.classtype = intype
                outfile.write(rec)
    # clean up
    outfile.close()
    infile.close()

if __name__ == '__main__':
    main()
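The type selection in this example can also be isolated into a small function that is testable without any SiLK data. This is only a sketch of the same decision rules; the function name choose_type and its boolean parameters are hypothetical, and the caller would still assign the resulting ("all", type) pair to rec.classtype.

```python
WEB_PORTS = (80, 443, 8080)


def choose_type(sport, dport, src_internal, dst_internal):
    """Return the type name for a flow, or None to discard the flow."""
    # The source address is checked first, as in the example script.
    if src_internal:
        direction = "out"
    elif dst_internal:
        direction = "in"
    else:
        return None
    is_web = sport in WEB_PORTS or dport in WEB_PORTS
    return direction + "web" if is_web else direction


print(choose_type(51234, 443, True, False))
print(choose_type(22, 51234, False, False))
```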
On occasion you may need to adjust every timestamp in a SiLK flow file. For
example, the flow file may have come from a packet capture that was collected
in a different time zone, so its records must be shifted by a number of hours.
Another case is when you determine that the recorded clock time was off and
the files must be corrected.
It is relatively simple to change the timestamps using PySiLK. The
sample code for changing data to another time zone is shown below; a minor
change would shift the data by seconds instead of hours.
#! /usr/bin/python
from __future__ import print_function #Python2.6 or later required
from silk import *
import sys # for command line args
from datetime import timedelta # for date math

def main():
    if len(sys.argv) != 4:
        print("Usage: %s infile offset-hours outfile" % sys.argv[0])
        sys.exit(1)
    # open the SiLK file for reading
    infile = silkfile_open(sys.argv[1], READ)
    # create the time offset object
    offset = timedelta(hours=int(sys.argv[2]))
    # open the modified output file
    outfile = silkfile_open(sys.argv[3], WRITE)
    # loop over the records in the file, shift time and write the update:
    for rec in infile:
        rec.stime = rec.stime + offset
        outfile.write(rec)
    # clean up
    outfile.close()
    infile.close()

if __name__ == '__main__':
    main()
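The minor change mentioned above amounts to building the timedelta with a different keyword. The sketch below illustrates this with a hypothetical helper, make_offset, and uses a plain datetime value as a stand-in for rec.stime.

```python
from datetime import datetime, timedelta


def make_offset(amount, unit="hours"):
    """Build a timedelta from an integer amount and a unit keyword."""
    if unit not in ("seconds", "minutes", "hours"):
        raise ValueError("unsupported unit: %s" % unit)
    return timedelta(**{unit: int(amount)})


# Stand-in for rec.stime; no SiLK file is read here.
stime = datetime(2020, 1, 1, 12, 0, 0)
print(stime + make_offset("-5"))
print(stime + make_offset("90", unit="seconds"))
```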
The following script attempts to group all flows representing one direction of
an FTP session and print them together. It takes as an argument the name of a
file containing raw SiLK records sorted by start time and port number
("rwsort --fields=stime,sport"). The script
extracts from the file all flows that potentially represent FTP traffic. We
define a possible FTP flow as any flow where:
- the source port is 21 (FTP control channel)
- the source port is 20 (FTP data transfer port)
- both the source port and destination port are ephemeral (data
transfer)
If a flow record has a source port of 21, the script adds the
source and destination address pair to the list of possible FTP groups. The
script categorizes each data transfer flow (source port 20 or ephemeral to
ephemeral) according to its source and destination IP address pair. If a
control channel flow with the same source and destination IP addresses
exists, the source and destination ports of the data flow are added to the
list of ports associated with that control channel interaction; otherwise the
script lists the data transfer as unclassified. After the entire file
is processed, all FTP sessions that have been grouped are displayed.
#! /usr/bin/python
from __future__ import print_function #Python2.6 or later required
# import the necessary modules
import silk
import sys

# Test that the argument number is correct
if (len(sys.argv) != 2):
    print("Must supply a SiLK data file.")
    sys.exit(1)

# open the SiLK file for reading
rawFile=silk.silkfile_open(sys.argv[1], silk.READ)

# Initialize the record structure
# "Unclassified" holds the ephemeral to ephemeral connections
# that don't appear to have a control channel
interactions = {"Unclassified":[]}

# Count of records processed
count = 0

# Process the input file
for rec in rawFile:
    count += 1
    key="%15s <--> %15s"%(rec.sip,rec.dip)
    if (rec.sport==21):
        if key not in interactions:
            interactions[key] = []
    else:
        if key in interactions:
            interactions[key].append("%5d <--> %5d"%(rec.sport,rec.dport))
        else:
            interactions["Unclassified"].append(
                "%15s:%5d <--> %15s:%5d"%(rec.sip,rec.sport,rec.dip,rec.dport))

# Print the count of all records
print(str(count) + " records processed")

# Print the groups of FTP flows
keyList = sorted(interactions.keys())
for key in keyList:
    print("\n" + key + " " + str(len(interactions[key])))
    if (key != "Unclassified"):
        for line in interactions[key]:
            print(" " + line)
Example output of the script:
184 records processed

xxx.xxx.xxx.236 <--> yyy.yyy.yyy.231 3
    20 <--> 56180
    20 <--> 56180
    20 <--> 58354

Unclassified 158
The following environment variables affect the tools in the SiLK tool suite.
- SILK_CONFIG_FILE
- This environment variable contains the location of the site configuration
file, silk.conf. This variable will be used by
silk.site.init_site() if no argument is passed to
that method.
- SILK_DATA_ROOTDIR
- This variable gives the root of the directory tree where the data store of
SiLK Flow files is maintained, overriding the location that is compiled
into the tools (/data). The FGlob constructor uses this variable unless an
explicit data_rootdir value is specified. In addition,
silk.site.init_site() may search for the site configuration file,
silk.conf, in this directory.
- SILK_COUNTRY_CODES
- This environment variable gives the location of the country code mapping
file that the silk.init_country_codes() function will
use when no name is given to that function. The value of this environment
variable may be a complete path or a file relative to the SILK_PATH. See
the "FILES" section for standard locations of this file.
- SILK_CLOBBER
- The SiLK tools normally refuse to overwrite existing files. Setting
SILK_CLOBBER to a non-empty value removes this restriction.
- SILK_PATH
- This environment variable gives the root of the install tree. When
searching for configuration files, PySiLK may use this environment
variable. See the "FILES" section for details.
- PYTHONPATH
- This is the search path that Python uses to find modules and extensions.
The SiLK Python extension described in this document may be installed
outside Python's installation tree; for example, in SiLK's installation
tree. It may be necessary to set or modify the PYTHONPATH environment
variable so Python can find the SiLK extension.
- PYTHONVERBOSE
- If the SiLK Python extension fails to load, setting this environment
variable to a non-empty string may help you debug the issue.
- SILK_PYTHON_TRACEBACK
- When set, Python plug-ins (see silkpython(3)) will
output trace back information regarding Python errors to the standard
error.
- PATH
- This is the standard search path for executable programs. The FGlob
constructor will invoke the rwfglob(1) program; the
directory containing rwfglob should be included in the PATH.
- TZ
- When a SiLK installation is built to use the local timezone (to determine
if this is the case, check the value of
silk.get_configuration("TIMEZONE_SUPPORT")), the value of
the TZ environment variable determines the timezone in which
silk.site.repository_iter() parses timestamp
strings. If the TZ environment variable is not set, the default timezone
is used. Setting TZ to 0 or the empty string causes timestamps to be
parsed as UTC. The value of the TZ environment variable is ignored when
the SiLK installation uses UTC. For system information on the TZ variable,
see tzset(3).
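When debugging a PySiLK script it can help to see which of these variables are actually set. The following is a minimal sketch that uses only the Python standard library; the helper name report_environment is hypothetical, and the variable names are the ones listed above.

```python
import os

SILK_ENV_VARS = (
    "SILK_CONFIG_FILE", "SILK_DATA_ROOTDIR", "SILK_COUNTRY_CODES",
    "SILK_CLOBBER", "SILK_PATH", "PYTHONPATH", "PYTHONVERBOSE",
    "SILK_PYTHON_TRACEBACK", "PATH", "TZ",
)


def report_environment(environ=os.environ):
    """Return a list of (name, value) pairs; value is None when unset."""
    return [(name, environ.get(name)) for name in SILK_ENV_VARS]


for name, value in report_environment():
    print("%-22s %s" % (name, "<unset>" if value is None else value))
```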
- ${SILK_CONFIG_FILE}
- ROOT_DIRECTORY/silk.conf
- ${SILK_PATH}/share/silk/silk.conf
- ${SILK_PATH}/share/silk.conf
- /usr/local/share/silk/silk.conf
- /usr/local/share/silk.conf
- Possible locations for the SiLK site configuration file which are checked
when no argument is passed to
silk.site.init_site().
- ${SILK_COUNTRY_CODES}
- ${SILK_PATH}/share/silk/country_codes.pmap
- ${SILK_PATH}/share/country_codes.pmap
- /usr/local/share/silk/country_codes.pmap
- /usr/local/share/country_codes.pmap
- Possible locations for the country code mapping file used by
silk.init_country_codes() when no name is given to
the function.
- ${SILK_DATA_ROOTDIR}/
- /data/
- Locations for the root directory of the data repository.
silk.site.init_site() may search for the site
configuration file, silk.conf, in this directory.
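The candidate locations for silk.conf can be sketched as a list built from the environment. This is an illustration of the list above, not PySiLK's actual lookup code. It assumes that the locations are tried in the order listed, that ROOT_DIRECTORY means the data root directory (SILK_DATA_ROOTDIR, or /data when unset), and that entries depending on an unset variable are skipped. The function name silk_conf_candidates is hypothetical.

```python
import os
import posixpath


def silk_conf_candidates(environ=os.environ):
    """List the documented silk.conf locations, in the documented order."""
    candidates = []
    if environ.get("SILK_CONFIG_FILE"):
        candidates.append(environ["SILK_CONFIG_FILE"])
    # Assumption: ROOT_DIRECTORY is the data root, defaulting to /data.
    root = environ.get("SILK_DATA_ROOTDIR") or "/data"
    candidates.append(posixpath.join(root, "silk.conf"))
    silk_path = environ.get("SILK_PATH")
    if silk_path:
        candidates.append(
            posixpath.join(silk_path, "share", "silk", "silk.conf"))
        candidates.append(posixpath.join(silk_path, "share", "silk.conf"))
    candidates.append("/usr/local/share/silk/silk.conf")
    candidates.append("/usr/local/share/silk.conf")
    return candidates


for path in silk_conf_candidates():
    print(path)
```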
silkpython(3), rwfglob(1),
rwfileinfo(1), rwfilter(1),
rwcut(1), rwpmapbuild(1),
rwset(1), rwsetbuild(1),
rwgroup(1), rwsort(1),
rwstats(1), rwuniq(1),
rwgeoip2ccmap(1), silk.conf(5),
sensor.conf(5), silk(7),
python(1), gzip(1),
yaf(1), tzset(3),
<http://docs.python.org/>