
Flow 2 Kafka (f2k)

Netflow to JSON/Kafka collector.

Setup

To use it, you only need the typical ./configure && make && make install
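
For instance, a build from a source checkout might look like this (--enable-zookeeper, described below, is only needed for sharing templates via zookeeper):

./configure        # optionally: ./configure --enable-zookeeper
make
make install       # usually needs root privileges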

Usage

Basic usage

The most important configuration parameters are listed below; a combined example follows the list:

  • Output parameters:

    • --kafka=127.0.0.1@rb_flow, broker@topic to send the netflow JSON to
  • Input parameters: can be either a UDP port or a Kafka topic

    • --collector-port=2055, UDP collector port to listen for netflow on
    • --kafka-netflow-consumer=kafka@rb_flow_pre, Kafka host/topic to listen for netflow on
  • Configuration

    • --rb-config=/opt/rb/etc/f2k/config.json, file with sensors config (see Sensors config)
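
Putting these together, a possible invocation that listens for netflow on UDP port 2055 and produces JSON to the rb_flow topic might be (assuming the installed binary is named f2k):

f2k --kafka=127.0.0.1@rb_flow \
    --collector-port=2055 \
    --rb-config=/opt/rb/etc/f2k/config.json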

Sensors config

You need to specify each sensor you want to read netflow from in a JSON file:

{
	"sensors_networks": {
		"4.3.2.1":{
			"observations_id": {
				"1":{
					"enrichment":{
						"sensor_ip":"4.3.2.1",
						"sensor_name":"flow_test",
						"observation_id":1
					}
				}
			}
		}
	}
}

With this file, f2k will listen for netflow coming from 4.3.2.1 (this could also be a network, e.g. 4.3.2.0/24), and the JSON output will be sent with the given sensor_ip, sensor_name and observation_id keys.
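
As a rough sketch of the result (the bytes and pkts values here are invented, and the exact flow fields depend on the netflow received), each output message carries those enrichment keys alongside the flow data:

{
	"sensor_ip": "4.3.2.1",
	"sensor_name": "flow_test",
	"observation_id": 1,
	"bytes": 2048,
	"pkts": 12
}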

Other configuration parameters

Multi-thread

--num-threads=N can be used to specify the number of netflow processing threads.

Long flow separation

Use --separate-long-flows if you want to split flows with duration > 60 s into one-minute intervals. For example, if the flow duration is 1m30s, f2k will send one message containing 2/3 of the bytes and pkts for the first minute, and another with 1/3 of the bytes and pkts for the last 30 seconds, as if two different flows had been received.

(see Test 0017 for more information about how flows are divided)
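
As a made-up numeric illustration of that proportional split, a 90-second flow with 3000 bytes and 90 pkts would be reported as two messages:

original flow:          duration 90s, bytes=3000, pkts=90
message 1 (first 60s):  bytes=2000, pkts=60
message 2 (last 30s):   bytes=1000, pkts=30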

Geo information

f2k can add geographic information if you specify Maxmind GeoLite Databases location using:

  • --as-list=/opt/rb/share/GeoIP/asn.dat
  • --country-list=/opt/rb/share/GeoIP/country.dat

Names resolution

You can include more flow information, like many object names, with the option --hosts-path=/opt/rb/etc/objects/. This folder must contain files with the expected names so f2k can read them.
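
For reference, using the file names mentioned in the sections below (the exact names expected may vary), the contents of that folder could look like:

$ ls /opt/rb/etc/objects/
applications  engines  hosts  http_domains  macs  vlans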

Mac vendor information (mac_vendor)

With --mac-vendor-list=mac_vendors, f2k can translate flow source and destination MACs, and they will be sent in the JSON output as in_src_mac_name, out_src_mac_name, and so on.

The file mac_vendors should be like:

FCF152|Sony Corporation
FCF1CD|OPTEX-FA CO.,LTD.
FCF528|ZyXEL Communications Corporation

You can generate it using make manuf, which will obtain it automatically from the IEEE Registration Authority.
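
So, with the sample list above, a flow whose source MAC falls under the FCF152 prefix would be enriched roughly like this (the key name is taken from the list above; the sketch is illustrative):

"in_src_mac_name": "Sony Corporation"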

Applications/engine ID (applications, engines)

f2k can translate application and engine IDs if you provide lists for them, for example:

  • <hosts-path>/engines

    None            0
    IANA-L3         1
    PANA-L3         2
    IANA-L4         3
    PANA-L4         4
    ...
    
  • <hosts-path>/applications

    3com-amp3                 50332277
    3com-tsmux                50331754
    3pc                       16777250
    914c/g                    50331859
    ...
    

Hosts, domains, vlan (hosts, http_domains, vlans)

You can include more information about the flow source and destination (src_name and dst_name) using a hosts list, in the same format as /etc/hosts. The same format can be used with the vlan, domains and macs files.
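
As a sketch, such a hosts file follows the usual /etc/hosts layout (the addresses and names here are made up):

10.13.30.1      gateway
10.13.30.25     web-server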

Netflow probe nets

You can specify home nets per netflow probe; they will be taken into account when resolving the client/target IP. Specify them using home_nets:

"sensors_networks": { "4.3.2.0/24":{ "2055":{
	"sensor_name":"test1",
	"sensor_ip":"",
	"home_nets": [
	        {"network":"10.13.30.0/16", "network_name":"users" },
	        {"network":"2001:0428:ce00:0000:0000:0000:0000:0000/48",
	        				"network_name":"users6"}
	],
}}}

DNS

f2k can perform reverse DNS lookups in order to obtain some host names. To enable them, you must use the following options (a usage sketch follows the list):

  • enable-ptr-dns, general enable
  • dns-cache-size-mb, DNS cache size (in MB), used to avoid repeating PTR queries
  • dns-cache-timeout-s, cache entry timeout (in seconds)
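
Assuming these are passed on the command line with the same -- prefix as the other options (the exact spelling may differ, the cache values are arbitrary, and the rest of the command line is elided), enabling reverse DNS could look like:

f2k ... --enable-ptr-dns --dns-cache-size-mb=64 --dns-cache-timeout-s=3600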

Template cache

Using folder

You can specify a folder to save/load templates using --template-cache=/opt/rb/var/f2k/templates.

Using zookeeper

If you want to use zookeeper to share templates between f2k instances, you can specify the zookeeper host using --zk-host=127.0.0.1:2181 and a proper timeout to read them with --zk-timeout=30. Note that you need to compile f2k with --enable-zookeeper.
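
A sketch of both mechanisms (the rest of the command line is elided; host and paths are illustrative):

# save/load templates from a local folder
f2k ... --template-cache=/opt/rb/var/f2k/templates

# share templates between instances through zookeeper (built with --enable-zookeeper)
f2k ... --zk-host=127.0.0.1:2181 --zk-timeout=30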

librdkafka options

All librdkafka options can be used via the -X (flow producer), -Y (flow consumer), or -Z (flow discarder) parameters. The argument will be passed directly to the librdkafka config, so you can use whatever config you need.

Example:

--kafka-discarder=kafka@rb_flow_discard     # Define host and topic
--kafka-netflow-consumer=kafka@rb_flow_pre  # Define host and topic
-X=socket.max.fails=3                       # Define configuration for flow producer
-X=retry.backoff.ms=100                     # Define configuration for flow producer
-Y=group.id=f2k                             # Define configuration for consumer
-Z=group.id=f2k                             # Define configuration for discard producer

Recommended options are (see the example below):

  • socket.max.fails=3
  • delivery.report.only.error=true
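
Following the -X syntax shown above, these would be passed to the flow producer as (-Y and -Z accept the same key=value form for the consumer and discarder):

-X=socket.max.fails=3
-X=delivery.report.only.error=true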
