NAME

CarrierIn - a FlowScan module for reporting on carrier or ISP input traffic


SYNOPSIS

   $ flowscan CarrierIn

or in flowscan.cf:

   ReportClasses CarrierIn


DESCRIPTION

CarrierIn is a general flowscan report for reporting on flows of input traffic for a carrier or ISP. It does this by processing flows reported by one or more routers at the network border. The carrier is assumed to have its own Autonomous System (AS) and to run BGP on the NetFlow-exporting routers.

CarrierIn assumes that NetFlow is turned on at inbound interfaces only.

CarrierIn is based on CampusIO.pm written by Dave Plonka.

flowscan will run the CarrierIn report if you configure this in your flowscan.cf:

   ReportClasses CarrierIn

The differences from Dave Plonka's CampusIO.pm are as follows:


CONFIGURATION

CarrierIn's configuration file is CarrierIn.cf. This configuration file is located in the directory in which the flowscan script resides.

Configuration directives removed from CampusIO.pm:

NextHops
OutputIfIndexes
WebProxyIfIndex
LocalSubnetFiles
Napster*

New configuration directives are:

SamplingRatio
SubnetFiles
CutLongFlows
RRDetailedDays
RRHWPredict
ReportDirectoryPath
ReportIndexName
PostReportExec
WhoisURL
TopHistAlpha
TopHistFile
OriginASHistDir

The CarrierIn configuration directives include:

OutputDir
This directive is required. It is the directory in which RRD files will be written. E.g.:
   # OutputDir /var/local/flows/graphs
   OutputDir graphs

SamplingRatio
This directive is optional. You need to use it if you configure ip flow-sampling-mode packet-interval on your routers.

SubnetFiles
This directive is optional. It is a comma-separated list of files containing the definitions of the subnets on which you'd like to report. E.g.:
   # SubnetFiles our_subnets.boulder
   SubnetFiles bin/our_subnets.boulder

Each file contains network definitions in Boulder format. For each subnet you can specify an optional name and level. The name is used as the symbolic representation of this subnet and will be used for RRD database file names.

Example:

    SUBNET=195.2.0.0/16
    NAME=whole_195_2
    LEVEL=0
    =
    SUBNET=195.2.20.0/22
    NAME=my_favorite_customer
    LEVEL=1
    =
    SUBNET=192.2.120.0/21
    NAME=dialin_pool
    LEVEL=1
    =
    SUBNET=192.2.128.0/20
    NAME=dialin_pool
    LEVEL=1
    =

You need to specify levels if you want to collect statistics on nested subnets, as in the example above. Each level consists of a separate Patricia tree, thus allowing for nested counters. If the level is not specified, the subnet is put into Level 0.

Several subnets can share the same name. In that case, they share common counters. This is useful when you have non-contiguous address pools that serve a common purpose.
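The level and shared-name rules can be sketched in Python as follows. This is an illustration only, not the module's code: the module uses Patricia trees, while this sketch uses a plain list scan per level, and the function names, the name ``whole_net'' and the choice of which flow address is matched are assumptions made for the example.

```python
import ipaddress
from collections import defaultdict

# (subnet, name, level) triples mirroring the Boulder example above.
# The name "whole_net" is illustrative only.
SUBNETS = [
    ("195.2.0.0/16", "whole_net", 0),
    ("195.2.20.0/22", "my_favorite_customer", 1),
    ("192.2.120.0/21", "dialin_pool", 1),
    ("192.2.128.0/20", "dialin_pool", 1),
]

def build_levels(subnets):
    """Group networks by level; each level is searched independently."""
    levels = defaultdict(list)
    for cidr, name, level in subnets:
        levels[level].append((ipaddress.ip_network(cidr), name))
    return levels

def count_flow(levels, counters, ip, nbytes):
    """Add nbytes to the longest matching subnet name on every level."""
    addr = ipaddress.ip_address(ip)
    for nets in levels.values():
        matches = [(net.prefixlen, name) for net, name in nets if addr in net]
        if matches:
            counters[max(matches)[1]] += nbytes

levels = build_levels(SUBNETS)
counters = defaultdict(int)
count_flow(levels, counters, "195.2.21.7", 1000)   # counted on level 0 and level 1
count_flow(levels, counters, "192.2.121.1", 200)   # first dialin_pool block
count_flow(levels, counters, "192.2.130.1", 300)   # second dialin_pool block
```

Because each level is searched separately, the first flow is counted both for the /16 and for the nested /22, and the two dialin_pool blocks accumulate into a single counter.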

CutLongFlows yes|no
Default: yes. The active flow timeout is 30 minutes by default on Cisco routers. If you disable this option, the whole 30 minutes of data for such a flow is counted as a single 5-minute flow. Alternatively, you can set the active flow timeout to 5 minutes or less on your routers. The FlowScan installation page recommends setting the timeout to 1 minute.

However, if you need the collected data only for a rough traffic overview, this option might be useful for a multi-gigabit Internet upstream.
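To make the effect concrete, here is a small Python sketch of one way a long flow could be cut down to a single sample. It assumes that the byte count is pro-rated by the ratio of the 5-minute sample length to the flow duration; that rule and the function name are assumptions for illustration and are not taken from the module source.

```python
SAMPLE_SECONDS = 300  # FlowScan works with 5-minute samples

def cut_long_flow(nbytes, duration_seconds, sample_seconds=SAMPLE_SECONDS):
    """Return the share of a flow's bytes attributed to one sample.

    Assumption: flows longer than the sample are scaled linearly;
    shorter flows are counted in full.
    """
    if duration_seconds <= sample_seconds:
        return nbytes
    return nbytes * sample_seconds / duration_seconds

# A 30-minute flow of 1.8 MB contributes one sixth of its bytes.
share = cut_long_flow(1_800_000, 1800)
```

Without such scaling, all 1.8 MB would land in one 5-minute sample and show up as a spike in the graphs.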

RRDetailedDays
Default: 14. Specifies the size of the Round-Robin Archive to store detailed 5-minute data samples, in days. The half-hour aggregated data is stored for 3*RRDetailedDays days, and the daily aggregated data is stored for 2 years.
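The retention arithmetic can be worked through in Python. The sketch below only counts the rows each archive would need; it assumes a year of 365 days, and it is not a description of how the module actually calls RRDtool.

```python
def rra_rows(detailed_days=14):
    """Number of rows needed per archive for a given RRDetailedDays."""
    return {
        "5min": detailed_days * 24 * 12,      # 12 five-minute samples per hour
        "30min": 3 * detailed_days * 24 * 2,  # 2 half-hour samples per hour
        "daily": 2 * 365,                     # two years, assuming 365-day years
    }

rows = rra_rows(14)
# rows == {"5min": 4032, "30min": 2016, "daily": 730}
```

With the default of 14, the detailed archive covers two weeks and the half-hour archive covers six weeks.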

RRHWPredict yes|no
Default: no. This is an experimental option which causes the RRD databases to be created with Aberrant Behavior Detection with Holt-Winters Forecasting. At the time of writing, this feature is available in CVS snapshots only, and will be available in release 1.1.x of RRDtool.

ReportDirectoryPath
Optional. Default: same as OutputDir. This is the directory where all HTML reports are saved. If specified, it must be an absolute path.

ReportIndexName
Optional. Default: index.html. This file will contain the listing of all reports created. It is generated only if ReportPrefixFormat is not specified.

PostReportExec
Optional. You can specify a command which will be executed after all the reports are written. Make sure this command finishes quickly; otherwise flowscan will not have enough time to process other flows.

WhoisURL
Optional. Default: ``http://www.arin.net/cgi-bin/whois.pl?queryinput=AS+''. In TopN Origin and Path AS reports, the AS numbers can be looked up in a WHOIS database. For RIPE lookups, use ``http://www.ripe.net/perl/whois?AS''.

TopHistAlpha
Optional. If defined, TopN talkers are remembered from one report to the next by means of exponential smoothing with the given parameter.
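For readers unfamiliar with exponential smoothing, a minimal Python sketch follows. It assumes the common form s = alpha * x + (1 - alpha) * s_prev, applied per talker; whether the module weights the new sample or the history with alpha is not stated here, so treat the direction, the function name and the data layout as assumptions.

```python
def smooth(history, current, alpha):
    """Blend the current per-talker counts into the running averages.

    history and current map a talker (e.g. an IP address) to a number.
    Talkers missing from either side are treated as zero.
    """
    keys = set(history) | set(current)
    return {
        k: alpha * current.get(k, 0.0) + (1 - alpha) * history.get(k, 0.0)
        for k in keys
    }

averages = smooth({"10.0.0.1": 100.0}, {"10.0.0.1": 200.0, "10.0.0.2": 50.0}, 0.5)
# averages == {"10.0.0.1": 150.0, "10.0.0.2": 25.0}
```

In this form, a larger parameter makes the averages follow recent samples more closely, while a smaller one gives more weight to history.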

TopHistFile
Optional. Must be specified if the TopHistAlpha option is used. This is the file where all TopN history averages are stored.

OriginASHistDir
Optional. If specified, TopN*4 RRDtool databases are created in the specified directory. For each of the top N*2 origin and path ASNs, four weeks of hourly statistics are stored.

TCPServices
This directive is optional, but is required if you wish to produce the CarrierIn service graphs. It is a comma-separated list of TCP services by name or number. E.g., it is recommended that it contain at least the services shown here:
   # TCPServices ftp-data, ftp, smtp, nntp, http, 7070, 554
   TCPServices ftp-data, ftp, smtp, nntp, http, 7070, 554

UDPServices
This directive is optional. It is a comma-separated list of UDP services by name or number. E.g.:
   # UDPServices domain, snmp, snmp-trap

Protocols
This directive is optional, but is required if you wish to produce the CarrierIn protocol graphs. It is a comma-separated list of IP protocols by name. E.g.:
   # Protocols icmp, tcp, udp
   Protocols icmp, tcp, udp

ASPairs
This directive is optional, but is required if you wish to build any custom AS graphs. It is a list of source and destination AS pairs. E.g.:
   # source_AS:destination_AS, e.g.:
   # ASPairs 0:0
   ASPairs 0:0

Note that the effect of setting ASPairs will be different based on whether you specified ``peer-as'' or ``origin-as'' when you configured your Cisco. This option was intended to be used when ``peer-as'' is configured.

See the BGPDumpFile directive for other AS-related features.

Verbose
This directive is optional. If non-zero, it makes flowscan more verbose with respect to messages and warnings. Currently the values 1 and 2 are understood, the higher value causing more messages to be produced. E.g.:
   # Verbose (OPTIONAL, non-zero = true)
   Verbose 1

TopN
This directive is optional. Its use requires the HTML::Table perl module. TopN is the number of entries to show in the tables that will be generated in HTML top reports. E.g.:
   # TopN (OPTIONAL)
   TopN 10

If you'd prefer to see hostnames rather than IP addresses in your top reports, use the ip2hostname script. E.g.:

   $ ip2hostname -I *.*.*.*_*.html

ReportPrefixFormat
This directive is optional. It is used to specify the file name prefix for the HTML or text reports such as the ``originAS'', ``pathAS'', and ``Top Talkers'' reports. You should use strftime(3) format specifiers in the value, and it may also specify sub-directories. If not set, the prefix defaults to the null string, which means that, every five minutes, each new report will overwrite the previous one. E.g.:
   # Preserve one day of HTML reports using the time of day as the dir name:
   ReportPrefixFormat html/CarrierIn/%H:%M/

or:

   # Preserve one month by using the day of month in the dir name (like sar(1)):
   ReportPrefixFormat html/CarrierIn/%d/%H:%M_
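To see how the strftime(3) specifiers expand, the following Python sketch formats the second example value for a fixed timestamp. It uses UTC so that the result is reproducible; flowscan itself may well use local time, and the function name is made up for this illustration.

```python
import time

def report_prefix(fmt, epoch_seconds):
    """Expand strftime specifiers in a ReportPrefixFormat-style value (UTC)."""
    return time.strftime(fmt, time.gmtime(epoch_seconds))

# 1256700 seconds after the epoch is 1970-01-15 13:05:00 UTC.
prefix = report_prefix("html/CarrierIn/%d/%H:%M_", 1256700)
# prefix == "html/CarrierIn/15/13:05_"
```

Since %d repeats every month and %H:%M repeats every day, reports older than one month are overwritten by newer ones with the same prefix.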

BGPDumpFile
This directive is optional and is experimental. In combination with TopN and ASNFile it causes FlowScan to produce ``Top ASN'' reports which show the ``top'' Autonomous Systems with which your site exchanges traffic.

BGPDumpFile requires the ParseBGPDump perl module by Sean McCreary, which is supplied with CAIDA's CoralReef Package:

   http://www.caida.org/tools/measurement/coralreef/status.xml

Unfortunately, CoralReef is governed by a different license than FlowScan itself. The Copyright file says this:

   Permission to use, copy, modify and distribute any part of this
   CoralReef software package for educational, research and non-profit
   purposes, without fee, and without a written agreement is hereby
   granted, provided that the above copyright notice, this paragraph
   and the following paragraphs appear in all copies.
   [...]
   The CoralReef software package is developed by the CoralReef
   development team at the University of California, San Diego under
   the Cooperative Association for Internet Data Analysis (CAIDA)
   Program. Support for this effort is provided by the CAIDA grant
   NCR-9711092, and by CAIDA members.

After fetching the coral release from:

   http://www.caida.org/tools/measurement/coralreef/dists/coral-3.4.1-public.tar.gz

install ParseBGPDump.pm in FlowScan's perl include path, such as in the bin sub-directory:

   $ cd /tmp
   $ gunzip -c coral-3.4.1-public.tar.gz | tar xf - coral-3.4.1-public/./libsrc/misc-perl/ParseBGPDump.pm
   $ mv coral-3.4.1-public/./libsrc/misc-perl/ParseBGPDump.pm $PREFIX/bin/ParseBGPDump.pm

You must also specify TopN to be greater than zero, e.g. 10, and the HTML::Table perl module is required if you do so.

The BGPDumpFile value is the name of a file containing the output of show ip bgp from a Cisco router, ideally from the router that is exporting flows. If this option is used, and the specified file exists, it will cause the ``originAS'' and ``pathAS'' reports to be generated. E.g.:

   TopN 10
   BGPDumpFile etc/router.our.domain.bgp

One way to create the file itself is to set up rsh access to your Cisco, e.g.:

   ip rcmd rsh-enable
   ip rcmd remote-host username 10.10.42.69 username

Then do something like this:

   $ cd $PREFIX
   $ mkdir etc
   $ echo show ip bgp >etc/router.our.domain.bgp # required by ParseBGPDump.pm
   $ time rsh router.our.domain "show ip bgp" >>etc/router.our.domain.bgp
      65.65s real     0.01s user     0.05s system
   $ wc -l etc/router.our.domain.bgp
    197883 etc/router.our.domain.bgp

Once flowscan is up and running with BGPDumpFile configured, it will reload that file if its timestamp indicates that it has been modified. This allows you to ``freshen'' the image of the routing table without having to restart flowscan itself.
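The timestamp check described above follows a common pattern, sketched here in Python. The class and method names are hypothetical, and the sketch simply re-reads the lines of the file; it does not parse BGP output or reproduce the module's own logic.

```python
import os

class ReloadingFile:
    """Keep the contents of a file in memory and reload them when it changes."""

    def __init__(self, path):
        self.path = path
        self.mtime = None   # modification time seen at the last load
        self.lines = []

    def refresh(self):
        """Re-read the file if its mtime changed; return True if reloaded."""
        mtime = os.path.getmtime(self.path)
        if mtime == self.mtime:
            return False
        with open(self.path) as fh:
            self.lines = fh.read().splitlines()
        self.mtime = mtime
        return True
```

Calling refresh() once per processing cycle is cheap, since an unchanged file costs only a stat call. When replacing the dump, writing to a temporary file and renaming it into place avoids the reader seeing a half-written table.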

Using the BGPDumpFile option causes FlowScan to use much more memory than usual. This memory is used to store a Net::Patricia trie containing a node for every prefix in the BGP routing table. For instance, on my system it caused the FlowScan process to grow to over 50MB, compared to less than 10MB without BGPDumpFile configured.

ASNFile
This directive is optional and is only useful in conjunction with BGPDumpFile. If specified, this directive will cause the AS names rather than just their numbers to appear in the Top ASN HTML reports. Its value should be the path to a file having the format of the file downloaded from this URL:
   ftp://ftp.arin.net/netinfo/asn.txt

E.g.:

   TopN 10
   BGPDumpFile etc/router.our.domain.bgp
   ASNFile etc/asn.txt

Once flowscan is up and running with ASNFile configured, it will reload the file if its timestamp indicates that it has been modified.


METHODS

This module provides no public methods. It is a report module meant only for use by flowscan. Please see the FlowScan module documentation for information on how to write a FlowScan report module.


SEE ALSO

perl(1), FlowScan, CampusIO, SubNetIO, flowscan(1), Net::Patricia.


BUGS

See CampusIO.pm bugs.


AUTHOR


Dave Plonka <plonka@doit.wisc.edu>
Stanislav Sinyagin <ssinyagin@yahoo.com>

Copyright (C) 1998-2001  Dave Plonka.
Copyright (C) 2002 Cablecom GmbH

This program source is based on CampusIO.pm. It was developed on behalf of Cablecom GmbH (www.cablecom.ch).

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.


VERSION

The version number is the module file RCS revision number ($Revision: 1.6 $) with the minor number printed right justified with leading zeroes to 3 decimal places. For instance, RCS revision 1.1 would yield a package version number of 1.001.

This is so that revision 1.10 (which is version 1.010), for example, will test greater than revision 1.2 (which is version 1.002) when you want to require a minimum version of this module.
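The mapping from RCS revision to package version can be illustrated with a short Python sketch. The module itself is written in Perl; the function name here is hypothetical and the code only mirrors the rule described above.

```python
import re

def rcs_to_version(revision_keyword):
    """Turn an RCS revision such as '$Revision: 1.6 $' into '1.006'."""
    match = re.search(r"(\d+)\.(\d+)", revision_keyword)
    if match is None:
        raise ValueError("no major.minor revision found")
    major, minor = int(match.group(1)), int(match.group(2))
    # Zero-pad the minor number to three digits.
    return "%d.%03d" % (major, minor)

version = rcs_to_version("$Revision: 1.6 $")
# version == "1.006"
```

Compared as numbers, 1.010 (revision 1.10) is greater than 1.002 (revision 1.2), which a naive comparison of 1.10 and 1.2 would get wrong.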