|
NAME
rwmatch - Match SiLK records from two streams into a common stream

SYNOPSIS
    rwmatch --relate=FIELD_PAIR [--relate=FIELD_PAIR ...]
          [--time-delta=DELTA] [--symmetric-delta]
          [{ --absolute-delta | --relative-delta | --infinite-delta }]
          [--unmatched={q|r|b}]
          [--note-add=TEXT] [--note-file-add=FILE]
          [--ipv6-policy={ignore,asv4,mix,force,only}]
          [--compression-method=COMP_METHOD]
          [--site-config-file=FILENAME]
          QUERY_FILE RESPONSE_FILE OUTPUT_FILE

    rwmatch --help

    rwmatch --help-relate

    rwmatch --version

DESCRIPTION
rwmatch provides a facility for relating (or matching) SiLK Flow records contained in two sorted input files, labeling those flow records, and writing the records to an output file.

The two input files are called QUERY_FILE and RESPONSE_FILE, respectively. The purpose of rwmatch is to find a record in QUERY_FILE that represents some network stimulus that caused a reply which is represented by a record in RESPONSE_FILE. When rwmatch discovers this relationship, it assigns a numeric ID to the match, searches both input files for additional records that are part of the same event, stores the numeric ID in each matching record's next hop IP field, and writes all records that are part of that event to OUTPUT_FILE.

When the --symmetric-delta switch is specified, rwmatch also checks for a stimulus in RESPONSE_FILE that triggered a reply in QUERY_FILE. This is useful when matching flows where either side may have initiated the conversation.

The input files must be sorted as described in "Sorting the input" below. To use the standard input in place of one of the input streams, specify "stdin" or "-" in its place.

The criteria for defining a match are given by one or more uses of the --relate switch and by the timestamps on the flow records:
Once rwmatch establishes a match between records in the two input files, it searches for additional records from both input files to add to the match. To do this, rwmatch denotes one of the records that comprise the initial match pair as a base record. When possible, the base record is the record with the earlier start time. In the case of a tie, the base is determined by ports for TCP and UDP, with the base being the record with the lower port if one port is above 1024 and the other is below 1024. If that also fails, the base record is the record read from QUERY_FILE. With millisecond time resolution, ties should be rare.

To determine whether a match exists between the base record and a candidate record, rwmatch uses the FIELD_PAIRs specified by --relate. When the base record and the candidate record were read from the same file, only one side of each FIELD_PAIR is used. In addition to the records having identical values for each field in FIELD_PAIRs, the candidate record must be within a time window determined by the --time-delta switch and the --absolute-delta, --relative-delta, and --infinite-delta switches.
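The base-record selection described above can be sketched in Python as follows. This is an illustration of the stated rules only, not rwmatch's actual code. The Flow class and the pick_base function are hypothetical names, and the sketch assumes that the "lower port" comparison is made on the source ports of the two records, which the text does not state.

```python
from dataclasses import dataclass

TCP, UDP = 6, 17  # IP protocol numbers


@dataclass
class Flow:
    """Hypothetical stand-in for a SiLK Flow record (only the fields needed here)."""
    stime_ms: int  # start time in milliseconds
    proto: int     # IP protocol number
    sport: int     # source port (assumed to be the port that is compared)


def pick_base(query: Flow, response: Flow) -> str:
    """Return "query" or "response" to name the base record of a match pair."""
    # Rule 1: the record with the earlier start time is the base.
    if query.stime_ms != response.stime_ms:
        return "query" if query.stime_ms < response.stime_ms else "response"
    # Rule 2: on a tie, for TCP and UDP, prefer the record with the lower
    # port when one port is above 1024 and the other is below 1024.
    if query.proto in (TCP, UDP) and response.proto in (TCP, UDP):
        low, high = sorted((query.sport, response.sport))
        if low < 1024 < high:
            return "query" if query.sport == low else "response"
    # Rule 3: otherwise the record read from QUERY_FILE is the base.
    return "query"
```

For example, with equal start times, a query record whose source port is 50000 and a response record whose source port is 80 would make the response record the base under these assumptions.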
Because long-lived sessions are often broken into multiple flows, rwmatch may discard records that are part of a long-lived session. The --relative-delta switch may compensate for this if the gap between flows is less than the time specified in the --time-delta switch. The --infinite-delta switch will compensate for arbitrarily long gaps, but it may add records to a match that are not part of a true session. DNS flows that use port 53/udp as both a service and reply port are an example.

When rwmatch establishes a match, it increments the match ID, with the first match having a match ID of 1. To label the records that comprise the match, rwmatch uses a 32-bit number where the lower 24 bits hold the match ID and the upper 8 bits are set to 0 or 255 to indicate whether the record was read from QUERY_FILE or RESPONSE_FILE, respectively. rwmatch stores this 32-bit number in the next hop IP field of the records. If the record is IPv6, rwmatch maps the number into the ::ffff:0:0/96 netblock before setting the next hop IP. Apart from the change to the next hop IP field, the query and response records are not modified.

By default, only matched records are written to OUTPUT_FILE, and any record that could not be determined to be part of a match is discarded. Specifying the --unmatched switch tells rwmatch to write unmatched query and/or response records to OUTPUT_FILE. The required parameter is one of "q", "r", or "b" to write the query records, the response records, or both to OUTPUT_FILE. Unmatched query records have their next hop IP set to 0.0.0.0, and unmatched response records have their next hop IP set to 255.0.0.0.

Sorting the input
As rwmatch reads QUERY_FILE and RESPONSE_FILE, it expects the SiLK Flow records to appear in a particular order that is best achieved by using rwsort(1). In particular:
When rwmatch processes the following command

    $ rwmatch --relate=1,2 --relate=2,1 --relate=5,5 Q.rw R.rw out.rw

it assumes the files Q.rw and R.rw were created by

    $ rwsort --fields=1,2,5,stime --output=Q.rw input1.rw ....
    $ rwsort --fields=2,1,5,stime --output=R.rw input2.rw ....

If the files source_ips.s.rw and dest_ips.s.rw are created by the following commands:

    $ rwsort --field=1,9 source_ips.rw > source_ips.s.rw
    $ rwsort --field=2,9 dest_ips.rw > dest_ips.s.rw

the following call to rwmatch works correctly:

    $ rwmatch --relate=1,2 source_ips.s.rw dest_ips.s.rw matched.rw

Note that the following command produces very few matches since source_ips.s.rw was sorted on field 1 and dest_ips.s.rw was sorted on field 2.

    $ rwmatch --relate=2,1 source_ips.s.rw dest_ips.s.rw stdout

The recommended sort ordering for TCP and UDP is shown below. This correctly handles multiple flows occurring during the same time interval which involve multiple ports:

    $ rwsort --fields=1,4,2,3,5,stime incoming.rw > incoming-query.rw
    $ rwsort --fields=2,3,1,4,5,stime outgoing.rw > outgoing-response.rw

The corresponding rwmatch command is:

    $ rwmatch --relate=1,2 --relate=4,3 --relate=2,1 --relate=3,4 \
          --relate=5,5 incoming-query.rw outgoing-response.rw matched.rw

OPTIONS
Option names may be abbreviated if the abbreviation is unique or is an exact match for an option. A parameter to an option may be specified as --arg=param or --arg param, though the first form is required for options that take optional parameters.
EXAMPLES
In the following examples, the dollar sign ("$") represents the shell prompt. The text after the dollar sign represents the command line. Lines have been wrapped for improved readability, and the backslash ("\") is used to indicate a wrapped line.

Matching TCP Flows
rwmatch is a generalized matching tool; the most basic function provided by rwmatch is the ability to match both sides of a TCP connection. Given incoming and outgoing web traffic in two files, web_in.rw and web_out.rw, the following sequence of commands will generate a file, web-sessions.rw, consisting of matched sessions for every complete web session in web_in.rw and web_out.rw:

    $ rwsort --field=1,2,3,4,stime web_in.rw > web_in-s.rw
    $ rwsort --field=2,1,4,3,stime web_out.rw > web_out-s.rw
    $ rwmatch --relate=1,2 --relate=2,1 --relate=3,4 --relate=4,3 \
          web_in-s.rw web_out-s.rw web-sessions.rw

Finding Responses to a Scan
Because rwmatch can match fields arbitrarily, you can also match records across different protocols. Suppose there are two SiLK Flow files, indata.rw and outdata.rw, that contain the incoming and outgoing data, respectively, for a particular time period.

To trace responses to a scan attempt, we start by identifying a specific horizontal scan. In this example, we use an SMTP scan on TCP port 25. Assume that we have an IPset file, smtp-scanners.set, that contains the external IP addresses that scanned us on port 25. (Perhaps this file was obtained by using rwscan(1) and rwscanquery(1).) First, use rwfilter(1) to find the flow records matching these scan attempts in the incoming data file.
Sort the output of rwfilter by source IP, source port, destination IP, destination port, and time, and store the results in smtp-scans.rw:

    $ rwfilter --proto=6 --sip-set=smtp-scanners.set --dport=25 \
          --pass=- indata.rw \
      | rwsort --field=sip,sport,dip,dport,stime > smtp-scans.rw

We can identify hosts that responded to the scan (we consider accepting the TCP connection as a response) by finding potential replies in the outgoing data file, sorting them, and storing the results in scan-response.rw. For this command on the outgoing data, note that we must swap source and destination from the values used for the incoming data:

    $ rwfilter --proto=6 --dip-set=smtp-scanners.set --sport=25 \
          --pass=- outdata.rw \
      | rwsort --field=dip,dport,sip,sport,stime > scan-response.rw

We can now match the flow records to produce the file matched-scans.rw:

    $ rwmatch --relate=1,2 --relate=3,4 --relate=2,1 --relate=4,3 \
          smtp-scans.rw scan-response.rw matched-scans.rw

The results file, matched-scans.rw, will contain all the exchanges between the scanning hosts and the responders on port 25. Examination of these flows may show evidence of buffer overflows, data exfiltration, or similar attacks.

Next, we want to identify responses to the scan that were produced by our routers, such as ICMP destination unreachable messages. Use rwfilter to find the ICMP messages going to the scanning hosts, sort the flow records, and store the results in icmp.rw:

    $ rwfilter --proto=1 --icmp-type=3 --pass=stdout outdata.rw \
      | rwsort --field=dip,stime > icmp.rw

Run rwmatch and match exclusively on the IP address.

    $ rwmatch --relate=2,1 icmp.rw smtp-scans.rw result.rw

The resulting file, result.rw, will consist of single packet flows (from smtp-scans.rw) with an ICMP response (from icmp.rw). Similar queries can be used to identify other multiple-protocol phenomena, such as the results of a traceroute.
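In every example on this page, the query file is sorted on the first element of each FIELD_PAIR and the response file on the second element, in the order the --relate switches are given, with stime as the final key. A minimal Python sketch of that pattern is shown below. The function names sort_fields and build_commands are hypothetical, and the sketch only assembles command strings; it does not run any SiLK tool or validate field names.

```python
def sort_fields(relate_pairs):
    """Derive the rwsort field lists implied by a list of (query, response) pairs.

    Returns a (query_fields, response_fields) tuple of comma-separated
    strings, each ending in "stime", mirroring the examples on this page.
    """
    query = [str(q) for q, _ in relate_pairs] + ["stime"]
    response = [str(r) for _, r in relate_pairs] + ["stime"]
    return ",".join(query), ",".join(response)


def build_commands(relate_pairs, query_in, response_in, output):
    """Assemble rwsort and rwmatch command lines as plain strings."""
    q_fields, r_fields = sort_fields(relate_pairs)
    q_sorted = query_in + ".sorted"        # hypothetical intermediate file names
    r_sorted = response_in + ".sorted"
    relate = " ".join("--relate={},{}".format(q, r) for q, r in relate_pairs)
    return [
        "rwsort --fields={} --output={} {}".format(q_fields, q_sorted, query_in),
        "rwsort --fields={} --output={} {}".format(r_fields, r_sorted, response_in),
        "rwmatch {} {} {} {}".format(relate, q_sorted, r_sorted, output),
    ]


if __name__ == "__main__":
    pairs = [(1, 2), (4, 3), (2, 1), (3, 4), (5, 5)]
    for command in build_commands(pairs, "incoming.rw", "outgoing.rw", "matched.rw"):
        print(command)
```

For the pairs 1,2 4,3 2,1 3,4 5,5 this yields the field lists 1,4,2,3,5,stime and 2,3,1,4,5,stime, which agree with the recommended TCP and UDP ordering given under "Sorting the input".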
Displaying the Results
These examples assume matched.rw is an output file produced by rwmatch.

When using rwcut(1) to display the records in matched.rw, you may specify the next hop IP field ("nhIP") to see the match identifier:

    $ rwcut --num-rec=8 --fields=sip,sport,dip,dport,type,nhip matched.rw
                sIP|sPort|            dIP|dPort|   type|           nhIP|
        10.4.52.235|29631|192.168.233.171|   80|  inweb|        0.0.0.1|
    192.168.233.171|   80|    10.4.52.235|29631| outweb|      255.0.0.1|
        10.9.77.117|29906| 192.168.184.65|   80|  inweb|        0.0.0.2|
     192.168.184.65|   80|    10.9.77.117|29906| outweb|      255.0.0.2|
      10.14.110.214|29989| 192.168.249.96|   80|  inweb|        0.0.0.3|
     192.168.249.96|   80|  10.14.110.214|29989| outweb|      255.0.0.3|
        10.18.66.79|29660| 192.168.254.69|   80|  inweb|        0.0.0.4|
     192.168.254.69|   80|    10.18.66.79|29660| outweb|      255.0.0.4|

The first record is a query from the external host 10.4.52.235 to the web server on the internal host 192.168.233.171, and the second record is the web server's response. The third and fourth records represent another query/response pair.

The cutmatch(3) plug-in is an alternate way to display the match parameter that rwmatch writes into the next hop IP field. The cutmatch plug-in defines a "match" field that displays the direction of the flow ("->" represents a query and "<-" a response) and the match ID. To use the plug-in, you must explicitly load it into rwcut by specifying the --plugin switch.
You can then add match to the list of --fields to print:

    $ rwcut --plugin=cutmatch.so --num-rec=8 \
          --fields=sip,sport,match,dip,dport,type matched.rw
                sIP|sPort| <->Match#|            dIP|dPort|   type|
        10.4.52.235|29631|->       1|192.168.233.171|   80|  inweb|
    192.168.233.171|   80|<-       1|    10.4.52.235|29631| outweb|
        10.9.77.117|29906|->       2| 192.168.184.65|   80|  inweb|
     192.168.184.65|   80|<-       2|    10.9.77.117|29906| outweb|
      10.14.110.214|29989|->       3| 192.168.249.96|   80|  inweb|
     192.168.249.96|   80|<-       3|  10.14.110.214|29989| outweb|
        10.18.66.79|29660|->       4| 192.168.254.69|   80|  inweb|
     192.168.254.69|   80|<-       4|    10.18.66.79|29660| outweb|

Using the "sIP" and "dIP" fields is confusing when the file you are examining contains both incoming and outgoing flow records. To make the output from rwmatch clearer, use the int-ext-fields(3) plug-in as well. That plug-in allows you to display the external IPs in one column and the internal IPs in another column. See its manual page for additional information.

    $ export INCOMING_FLOWTYPES=all/in,all/inweb
    $ export OUTGOING_FLOWTYPES=all/out,all/outweb
    $ rwcut --plugin=cutmatch.so --plugin=int-ext-fields.so --num-rec=8 \
          --fields=ext-ip,ext-port,match,int-ip,int-port,type matched.rw
             ext-ip|ext-p| <->Match#|         int-ip|int-p|   type|
        10.4.52.235|29631|->       1|192.168.233.171|   80|  inweb|
        10.4.52.235|29631|<-       1|192.168.233.171|   80| outweb|
        10.9.77.117|29906|->       2| 192.168.184.65|   80|  inweb|
        10.9.77.117|29906|<-       2| 192.168.184.65|   80| outweb|
      10.14.110.214|29989|->       3| 192.168.249.96|   80|  inweb|
      10.14.110.214|29989|<-       3| 192.168.249.96|   80| outweb|
        10.18.66.79|29660|->       4| 192.168.254.69|   80|  inweb|
        10.18.66.79|29660|<-       4| 192.168.254.69|   80| outweb|

ENVIRONMENT
FILES
SEE ALSO
rwfilter(1), rwsort(1), rwcut(1), rwfileinfo(1), rwscan(1), rwscanquery(1), cutmatch(3), int-ext-fields(3), sensor.conf(5), silk(7), zlib(3)

NOTES
SiLK 3.9.0 expanded the set of fields accepted by the --relate switch and added support for IPv6 flow records.
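As a note on reading the labels described in DESCRIPTION (lower 24 bits for the match ID, upper 8 bits set to 0 for QUERY_FILE or 255 for RESPONSE_FILE, and a match ID of 0 for unmatched records), the following Python sketch encodes and decodes the next hop IP values printed by rwcut. It uses only the standard ipaddress module. The function names encode_label and decode_label are hypothetical and are not part of SiLK.

```python
import ipaddress


def encode_label(match_id, from_response):
    """Build the dotted-quad next hop IP label for a match ID (IPv4 form)."""
    if not 0 <= match_id < (1 << 24):
        raise ValueError("match ID must fit in 24 bits")
    upper = 255 if from_response else 0
    return str(ipaddress.IPv4Address((upper << 24) | match_id))


def decode_label(nhip):
    """Split a next hop IP label into (side, match_id).

    side is "query" or "response"; a match_id of 0 denotes an unmatched
    record. IPv6 labels are expected inside ::ffff:0:0/96.
    """
    addr = ipaddress.ip_address(nhip)
    if addr.version == 6:
        mapped = addr.ipv4_mapped
        if mapped is None:
            raise ValueError("IPv6 label is not in ::ffff:0:0/96")
        addr = mapped
    value = int(addr)
    upper, match_id = value >> 24, value & 0xFFFFFF
    if upper not in (0, 255):
        raise ValueError("upper 8 bits are neither 0 nor 255")
    return ("response" if upper == 255 else "query"), match_id
```

Under these assumptions, decode_label("255.0.0.1") identifies the record as the response side of match 1, which agrees with the rwcut output shown in EXAMPLES.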
|