Distributed Monitoring and Data Analysis Solution

An overview of our high-performance traffic monitoring system

Blockmon

Blockmon provides a set of units called blocks, each of which performs a certain discrete processing action, for instance parsing a DNS response, or counting the number of distinct VoIP users on a link. The blocks communicate with each other by passing messages via gates; one block's output gates are connected to the input gates of other blocks, which allows runtime indirection of messages. A set of inter-connected blocks implementing a measurement application is called a composition. A generic composition is shown in the figure below.

Figure 1: A composition showing Blockmon blocks interconnected by means of input and output (I/O) gates.

Compositions are defined using an XML format which lists the blocks and their configuration parameters as well as the connections among their gates. The XML corresponding to the generic example composition is shown in figure 2. The Blockmon core and the blocks themselves are implemented in C++, and the system is controlled at runtime using a simple, Python-based command-line interface (CLI). Blockmon also comes with a GUI that allows users to draw, generate and install compositions.

<composition id="example">
  <block id="source" type="SourceBlock">
    <params>
      <source type="live" name="eth0"/>
    </params>
  </block>

  <block id="a" type="BlockA"/>
  <block id="b" type="BlockB"/>
  <block id="c" type="BlockC"/>

  <block id="export" type="ExportBlock">
    <params>
      <dst ip="192.0.2.20" port="4739"/>
    </params>
  </block>

  <connection src_block="source" src_gate="pkt_out"
              dst_block="a" dst_gate="in"/>

  <connection src_block="source" src_gate="pkt_out"
              dst_block="b" dst_gate="in"/>

  <connection src_block="a" src_gate="out"
              dst_block="c" dst_gate="in"/>

  <connection src_block="b" src_gate="out"
              dst_block="c" dst_gate="in"/>

  <connection src_block="c" src_gate="out"
              dst_block="export" dst_gate="rec_in"/>
</composition>
Figure 2: XML corresponding to the composition in figure 1.
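
To make the structure of a composition file more tangible, the following minimal sketch reads a composition like the one in figure 2 with Python's standard XML parser and lists its blocks and connections. It is not part of Blockmon: the helper names load_composition and undeclared_blocks are hypothetical, and the connection attribute names (src_block, src_gate, dst_block, dst_gate) are an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Composition modelled on figure 2; the attribute names are assumed.
COMPOSITION_XML = """
<composition id="example">
  <block id="source" type="SourceBlock">
    <params>
      <source type="live" name="eth0"/>
    </params>
  </block>
  <block id="a" type="BlockA"/>
  <block id="b" type="BlockB"/>
  <block id="c" type="BlockC"/>
  <block id="export" type="ExportBlock">
    <params>
      <dst ip="192.0.2.20" port="4739"/>
    </params>
  </block>
  <connection src_block="source" src_gate="pkt_out" dst_block="a" dst_gate="in"/>
  <connection src_block="source" src_gate="pkt_out" dst_block="b" dst_gate="in"/>
  <connection src_block="a" src_gate="out" dst_block="c" dst_gate="in"/>
  <connection src_block="b" src_gate="out" dst_block="c" dst_gate="in"/>
  <connection src_block="c" src_gate="out" dst_block="export" dst_gate="rec_in"/>
</composition>
"""


def load_composition(xml_text):
    """Return ({block id: block type}, [(src, src_gate, dst, dst_gate), ...])."""
    root = ET.fromstring(xml_text.strip())
    blocks = {b.get("id"): b.get("type") for b in root.findall("block")}
    connections = [
        (c.get("src_block"), c.get("src_gate"), c.get("dst_block"), c.get("dst_gate"))
        for c in root.findall("connection")
    ]
    return blocks, connections


def undeclared_blocks(blocks, connections):
    """Return ids that appear in a connection but are never declared as a block."""
    used = {c[0] for c in connections} | {c[2] for c in connections}
    return sorted(used - set(blocks))


blocks, connections = load_composition(COMPOSITION_XML)
for src, src_gate, dst, dst_gate in connections:
    print(f"{src}.{src_gate} -> {dst}.{dst_gate}")
```

A check such as undeclared_blocks can catch a misspelled block id before a composition is installed.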


Blocks

A block performs a discrete processing action. Blocks can implement a wide range of functionality, including packet capture and filtering, monitoring, anomaly detection algorithms and export capabilities. A sample of the blocks available in the base Blockmon distribution as of this writing is described in the table below; the list continues to grow.

Block Name        Description
PcapSource        Captures packets from a local interface or pcap trace files.
PFRingSource      Captures packets from PF_RING sockets. Supports multi-queue NICs.
PFQSource         Captures packets using PFQ. Supports multi-queue NICs.
ComboSZE2Source   Captures packets from an INVEA-TECH COMBO card.
IPFIXExporter     Transcodes messages to IPFIX records, and exports them.
IPFIXCollector    Collects data via IPFIX and generates messages for appropriate records.
PacketFilter      Filters packets based on packet header fields.
PacketPrinter     Prints packets for debugging purposes.
PacketCounter     Counts received packets for debugging purposes.
IPAnon            Anonymizes the source and destination IP addresses of a packet.
FlowMeter         Assembles packets into flows with natural lifetime export.
PeriodFlowMeter   Assembles packets into flows with periodic export.
Table 1: Examples of available Blockmon blocks.

All blocks are derived from a common superclass. New blocks simply inherit from this class and implement at least two methods: configure, which receives XML representing the block's configuration parameters, and receive_msg, which is called when a message arrives at the block. Blocks can also be invoked on periodic or one-shot timers via the handle_timer method, and can perform high-frequency but non-periodic asynchronous work in the do_async method. This last method is mainly provided for source blocks (e.g., packet capture or message import via IPFIX), which send messages but do not receive them.
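
The shape of this interface can be illustrated with a small Python model. This is only a sketch: real Blockmon blocks are written in C++, and everything here other than the four method names taken from the text (the class names, the downstream list, the send helper, and the proto parameter) is hypothetical.

```python
import xml.etree.ElementTree as ET


class Block:
    """Toy stand-in for the common block superclass (the real one is C++)."""

    def __init__(self, name):
        self.name = name
        self.downstream = []  # hypothetical wiring helper, not Blockmon's gate API

    def configure(self, params):
        """Receive the block's configuration as an XML element. Must be overridden."""
        raise NotImplementedError

    def receive_msg(self, msg):
        """Called when a message arrives at the block. Must be overridden."""
        raise NotImplementedError

    def handle_timer(self):
        """Optional: called on periodic or one-shot timers."""

    def do_async(self):
        """Optional: asynchronous work, mainly for source blocks."""

    def send(self, msg):
        # Hypothetical helper: hand the message to every connected block.
        for block in self.downstream:
            block.receive_msg(msg)


class ProtoFilter(Block):
    def configure(self, params):
        # Assumed parameter layout: <params><proto name="udp"/></params>
        self.proto = params.find("proto").get("name")

    def receive_msg(self, msg):
        if msg["proto"] == self.proto:
            self.send(msg)


class Counter(Block):
    def configure(self, params):
        self.count = 0

    def receive_msg(self, msg):
        self.count += 1


udp_filter = ProtoFilter("filter")
udp_filter.configure(ET.fromstring('<params><proto name="udp"/></params>'))
counter = Counter("counter")
counter.configure(ET.fromstring("<params/>"))
udp_filter.downstream.append(counter)

# Messages are plain dicts in this model; only the two UDP ones pass the filter.
for proto in ("udp", "tcp", "udp"):
    udp_filter.receive_msg({"proto": proto})
```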

Gates and Scheduling

A gate is essentially a named point on a block that allows connections between blocks at configuration time: compositions are built by defining connections between a specific gate on one block and a specific gate on another. Blocks send messages via output gates, and receive messages via input gates.

There are in essence two types of input gates, which lend Blockmon its scheduling flexibility: Blockmon supports direct and indirect message passing. In the former, the sending block directly calls the receiving block's receive_msg method; the input gate is in this case essentially a function call. This is fast but inflexible: the receiving block runs in the sending block's thread, which stays busy with the receiving block until it finishes. This head-of-line blocking can propagate all the way up a chain of directly invoked blocks, so long chains of direct invocation should be broken up with indirect message passing.

Indirect message passing is mediated by a novel wait-free, rotating queue. With indirect message passing, each block is separately scheduled in different thread pools on different CPU cores; this allows truly parallel processing on multi-core systems without blocking or locking overhead, key to Blockmon's performance. These queues can also buffer messages to avoid packet loss during peak load.

The two message passing models can be mixed within a composition to maximize performance. Blocks which benefit from parallelization (whether they implement parallelizable problems or perform CPU-intensive work) should generally be indirectly invoked, while thin, stateless "filter" blocks may benefit from direct invocation. Source blocks (such as packet capturers) must always be indirectly invoked via do_async. For implementation reasons, all input gates on a given block must be of the same type.
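
The difference between the two gate types can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Blockmon's implementation: the class names DirectGate, IndirectGate and Recorder are hypothetical, and a standard locking queue.Queue stands in for the wait-free rotating queue, whose internals are not described here.

```python
import queue
import threading


class Recorder:
    """Hypothetical receiving block that notes which thread ran receive_msg."""

    def __init__(self):
        self.seen = []

    def receive_msg(self, msg):
        self.seen.append((msg, threading.current_thread().name))


class DirectGate:
    """Direct passing: delivery is a plain call in the sender's thread."""

    def __init__(self, block):
        self.block = block

    def deliver(self, msg):
        self.block.receive_msg(msg)


class IndirectGate:
    """Indirect passing: delivery only enqueues; the receiver drains later."""

    def __init__(self, block):
        self.block = block
        self.pending = queue.Queue()  # stand-in for the wait-free rotating queue

    def deliver(self, msg):
        self.pending.put(msg)  # the sender returns immediately

    def drain(self):
        """Run by a thread of the receiving block's pool; returns messages handled."""
        handled = 0
        while True:
            try:
                msg = self.pending.get_nowait()
            except queue.Empty:
                return handled
            self.block.receive_msg(msg)
            handled += 1


sender_thread = threading.current_thread().name
direct_sink, indirect_sink = Recorder(), Recorder()
direct_gate, indirect_gate = DirectGate(direct_sink), IndirectGate(indirect_sink)

direct_gate.deliver("pkt-1")    # processed right away, in the sender's thread
indirect_gate.deliver("pkt-1")  # only queued at this point
seen_before_drain = list(indirect_sink.seen)

worker = threading.Thread(target=indirect_gate.drain, name="pool-2")
worker.start()
worker.join()
```

In this model the direct receiver has already run by the time deliver returns, while the indirect receiver runs only once a thread from its own pool drains the queue.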

Figure 3: example composition showing block invocation.

To make things more concrete, figure 3 shows an example of how the different block and scheduling types are used in a simple composition. In this case, PcapSource is a source block and therefore indirectly scheduled, capturing packets from a network interface in its do_async method, and sending Packet messages directly to UDPFilter. This block filters for UDP packets, and sends results directly to StatsTable, which keeps statistics for received packets. All three of these blocks run in the same thread pool; that is, when a packet is received, PcapSource creates a Packet message and invokes receive_msg on UDPFilter, which invokes receive_msg on StatsTable if the packet is a UDP packet.

StatsTable registers a periodic timer on configuration; Blockmon's scheduler then periodically calls handle_timer on StatsTable, sending a message containing statistics to IPFIXExporter. This block exports the results. It is indirectly scheduled, and runs in a separate thread pool from the source/counter in order to isolate network export from packet capture and counting. Periodically, the scheduler will dequeue the pending messages from the rotating queue associated with the input gate on IPFIXExporter, and invoke receive_msg for each.
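
The walk-through above can be replayed with a single-threaded Python model in which each scheduler action is an explicit function call. This is a sketch only: the class names mirror the blocks in figure 3 but are hypothetical stand-ins, packets are plain dicts with assumed proto and length fields, and a deque replaces the rotating queue.

```python
from collections import deque


class IPFIXExporterSketch:
    def __init__(self):
        self.inbox = deque()  # stand-in for the rotating queue on the input gate
        self.exported = []

    def receive_msg(self, msg):
        self.exported.append(msg)


class StatsTableSketch:
    def __init__(self, exporter):
        self.exporter = exporter
        self.packets = 0
        self.bytes = 0

    def receive_msg(self, pkt):
        self.packets += 1
        self.bytes += pkt["length"]

    def handle_timer(self):
        # Indirect send: the statistics message is queued, not delivered.
        self.exporter.inbox.append({"packets": self.packets, "bytes": self.bytes})


class UDPFilterSketch:
    def __init__(self, next_block):
        self.next_block = next_block

    def receive_msg(self, pkt):
        if pkt["proto"] == "udp":
            self.next_block.receive_msg(pkt)  # direct invocation


class PcapSourceSketch:
    def __init__(self, next_block, captured):
        self.next_block = next_block
        self.captured = deque(captured)

    def do_async(self):
        # Emit one "captured" packet, if any, by direct invocation.
        if self.captured:
            self.next_block.receive_msg(self.captured.popleft())


def run_exporter_pool(exporter):
    """Model of the scheduler dequeuing pending messages for the exporter."""
    while exporter.inbox:
        exporter.receive_msg(exporter.inbox.popleft())


exporter = IPFIXExporterSketch()
stats = StatsTableSketch(exporter)
source = PcapSourceSketch(
    UDPFilterSketch(stats),
    [
        {"proto": "udp", "length": 100},
        {"proto": "tcp", "length": 60},
        {"proto": "udp", "length": 40},
    ],
)

for _ in range(3):
    source.do_async()        # capture pool: source -> filter -> stats in one call chain
stats.handle_timer()         # periodic timer fires on StatsTable
queued_before_export = len(exporter.inbox)
run_exporter_pool(exporter)  # export pool drains the queue
```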

Messages

The actual communication between blocks is in terms of messages. These, like blocks, are derived from a common superclass; pointers to messages are passed via the gates. The Message class provides a basic interface for identifying message types, and for supporting import and export of messages in order to connect compositions across nodes. Messages are constant, which ensures that they can be shared among multiple blocks concurrently without contention. They also provide a tagging mechanism that allows blocks to add small bits of data to a message in a thread-safe manner without incurring too large a performance overhead.
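
The combination of constant messages and thread-safe tags can be modelled in Python as follows. This is an illustrative sketch, not Blockmon's Message class: the PacketMessage type, its fields, and the set_tag and get_tag methods are hypothetical, and a simple lock is used here as a stand-in because the actual tagging mechanism is not described in this text.

```python
import threading
from dataclasses import FrozenInstanceError, dataclass, field


@dataclass(frozen=True)
class PacketMessage:
    """Hypothetical message: payload fields are fixed once constructed."""

    src: str
    dst: str
    length: int
    # Side storage for tags; excluded from the constructor, comparison and repr.
    _tags: dict = field(default_factory=dict, init=False, compare=False, repr=False)
    _lock: object = field(
        default_factory=threading.Lock, init=False, compare=False, repr=False
    )

    def set_tag(self, key, value):
        # Assumption: a lock guards the tags; the real mechanism may differ.
        with self._lock:
            self._tags[key] = value

    def get_tag(self, key, default=None):
        with self._lock:
            return self._tags.get(key, default)


msg = PacketMessage(src="192.0.2.1", dst="192.0.2.20", length=64)

# The payload cannot be changed after construction.
try:
    msg.length = 1500
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True

# Several "blocks" (threads) tag the same shared message concurrently.
workers = [
    threading.Thread(target=msg.set_tag, args=(f"block-{i}", i)) for i in range(4)
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```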

Messages within Blockmon dealing with timing have access to a packet clock derived from captured packet timestamps, to ensure that pe