Prototype your IP flow analysis right now!

An automated framework for rapid prototyping of IP flow analysis based on world-leading technologies for real-time data processing, network traffic monitoring, and visualization.

Download available on GitHub

Framework Features

Full Stack Solution

The framework provides a full-stack solution for IP flow analysis prototyping. It can be connected to the majority of IP flow network probes. The framework integrates tools for data collection, processing, manipulation, storage, and presentation.

High Performance

Thanks to its scalability, the framework is suited to processing network traffic in a wide range of networks, from small company networks to large-scale, high-speed ISP networks. Its distributed nature enables computationally intensive analyses.

Easy Deployment

Deployment of the framework in the cloud is fully automated using cutting-edge software orchestration technologies. The deployment comes with example prototype applications and initial tests to further ease prototype development.

Real-time Analysis

The stream-based approach provides the results of an IP flow analysis prototype with a delay of only a few seconds. The results can be explored in various ways in a user interface in real time. The IP flow analysis prototype can then be improved immediately based on the results.

Architecture

The basis of the Stream4Flow framework is formed by the IPFIXcol collector, the Kafka messaging system, Apache Spark, and Elastic Stack. IPFIXcol transforms incoming IP flow records into the JSON format and passes them to the Kafka messaging system. Kafka was selected for its scalability and partitioning capabilities, which provide sufficient data throughput. Apache Spark was selected as the data stream processing framework for its high IP flow data throughput, its choice of programming languages (Scala, Java, or Python), and its MapReduce programming model. The analysis results are stored in Elastic Stack, consisting of Logstash, Elasticsearch, and Kibana, which enable storing, querying, and visualizing the results. The Stream4Flow framework also includes an additional web interface that makes administration easier and visualizes complex analysis results.
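To make the data path above more concrete, the following is a minimal sketch in plain Python of the kind of MapReduce-style aggregation a prototype might apply to JSON flow records. It is not Stream4Flow code and does not use Kafka or Spark: the sample messages stand in for one batch of records read from a message queue, and the field names (sourceIPv4Address, octetDeltaCount, packetDeltaCount) are borrowed from IPFIX information element names as an assumption about what the JSON could look like. The function names are hypothetical.

```python
import json

# Hypothetical JSON flow records, standing in for one batch of messages
# that a collector might publish. Field names are assumptions.
SAMPLE_MESSAGES = [
    '{"sourceIPv4Address": "10.0.0.1", "octetDeltaCount": 1200, "packetDeltaCount": 3}',
    '{"sourceIPv4Address": "10.0.0.2", "octetDeltaCount": 400, "packetDeltaCount": 1}',
    '{"sourceIPv4Address": "10.0.0.1", "octetDeltaCount": 800, "packetDeltaCount": 2}',
]


def map_flow(message):
    """Map step: parse one JSON record into a (key, value) pair."""
    flow = json.loads(message)
    key = flow["sourceIPv4Address"]
    value = (flow["octetDeltaCount"], flow["packetDeltaCount"])
    return key, value


def reduce_by_key(pairs):
    """Reduce step: sum octets and packets for each source address."""
    totals = {}
    for key, (octets, packets) in pairs:
        prev_octets, prev_packets = totals.get(key, (0, 0))
        totals[key] = (prev_octets + octets, prev_packets + packets)
    return totals


def aggregate_window(messages):
    """Aggregate one batch of messages into per-source traffic totals."""
    return reduce_by_key(map_flow(m) for m in messages)


if __name__ == "__main__":
    for address, (octets, packets) in sorted(aggregate_window(SAMPLE_MESSAGES).items()):
        print(address, octets, packets)
```

In a distributed stream processing framework, the map and reduce steps would run in parallel over partitions of the stream and repeat for each time window; the sketch only shows the shape of the computation on a single batch.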

More on stream-based IP flow analysis is described in our paper Toward Stream-Based IP Flow Analysis.

IPFIXcol

A framework for complex processing of IP flow information from multiple different sources. IPFIXcol contains a number of tools for offline data processing and can be used as an advanced substitute for nfdump.

Explore

Apache Kafka

Kafka is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies.

Explore

Apache Spark

Apache Spark is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs.

Explore

Elastic Stack

The open source Elastic Stack (that’s Elasticsearch, Kibana, Logstash) helps you take data from any source, any format and search, analyze, and visualize it in real time.

Explore

Our Team

We are part of the security team of Masaryk University (CSIRT-MU), which is responsible for developing and maintaining proper ICT security at the university. The team has seven years of experience in security incident handling and network monitoring, and deals with thousands of security incidents a year.

Tomas Jirsik

Host Monitoring

Milan Cermak

Anomaly Detection

Daniel Tovarnak

Distributed Computing

Contact

Our address:

CSIRT-MU
Institute of Computer Science, Masaryk University
Botanicka 68a, 602 00 Brno, Czech Republic

E-mail:

stream4flowfoo@ics.muni.cz

Our Website:

https://csirt.muni.cz

Partnerships