Bagheera

Version: 0.9-SNAPSHOT

Bagheera is a REST service for Mozilla Metrics. It currently uses Apache Kafka as its backing data store and provides a few Kafka consumer implementations that pull messages and persist them to various data sinks.

Version Compatibility

This code is built with the following assumptions. You may get mixed results if you deviate from these versions.

Prerequisites

  • Protocol Buffers
  • Zookeeper (for Kafka)
  • Kafka
  • Hadoop (if using HDFS based consumer)
  • HBase (if using HBase based consumer)

Building

To make a jar you can do:

mvn package

The jar file is then located under target.

Running an instance

Make sure your Kafka and Zookeeper servers are running first (see the Kafka documentation).

In order to run bagheera on another machine you will probably want to use the dist assembly like so:

mvn assembly:assembly

The zip file now under the target directory should be deployed to BAGHEERA_HOME on the remote server.

To run Bagheera you can use bin/bagheera or copy the init.d script by the same name from bin/init.d to /etc/init.d. The init script assumes an installation of bagheera at /usr/lib/bagheera, but this can be modified by changing the BAGHEERA_HOME variable near the top of that script. Here is an example of using the regular bagheera script:

bin/bagheera 8080

REST Request Format

URI

/submit/namespace/id | /1.0/submit/namespace/id

POST/PUT

  • The namespace is required and is only accepted if it is in the configured white-list.
  • The id is optional. If you provide one, it must be a valid UUID unless id validation is disabled on the namespace.
  • The payload content length must be less than the configured maximum.

DELETE

  • The namespace is required and is only accepted if it is in the configured white-list.
  • The id is required and must be a valid UUID unless id validation is disabled on the namespace.
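The request rules above can be sketched from the client side. This is a minimal illustration in Python, not part of Bagheera itself: the helper name `build_request`, the host `localhost:8080`, and the choice to generate a UUID client-side when none is given are all assumptions made for the example. The request is only built, not sent.

```python
import uuid
import urllib.request


def build_request(base_url, namespace, method="POST", doc_id=None, payload=None):
    """Build (but do not send) a request for the /1.0/submit endpoint.

    Hypothetical helper for illustration only.
    """
    if method not in ("POST", "PUT", "DELETE"):
        raise ValueError("method must be POST, PUT or DELETE")
    if doc_id is None:
        if method == "DELETE":
            # DELETE requires an id.
            raise ValueError("DELETE requires an id")
        # The id is optional for POST/PUT; this sketch generates one locally.
        doc_id = str(uuid.uuid4())
    else:
        # Raises ValueError if doc_id is not a valid UUID string.
        doc_id = str(uuid.UUID(doc_id))
    url = f"{base_url.rstrip('/')}/1.0/submit/{namespace}/{doc_id}"
    return urllib.request.Request(url, data=payload, method=method)


req = build_request(
    "http://localhost:8080",
    "mynamespace",
    doc_id="123e4567-e89b-12d3-a456-426614174000",
    payload=b'{"example": true}',
)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it. Validating the UUID before sending avoids a round trip that would fail validation on a namespace where id validation is enabled.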

Here's the list of HTTP response codes that Bagheera could send back:

  • 201 Created - Returns the id submitted/generated. (default)
  • 403 Forbidden - Violated access restrictions. Most likely because of the method used.
  • 413 Request Entity Too Large - Request payload was larger than the configured maximum.
  • 400 Bad Request - Returned if the POST/PUT failed validation in some manner.
  • 404 Not Found - Returned if the URI path doesn't exist or if the URI was not in the proper format.
  • 500 Internal Server Error - General server error. Someone with access should look at the logs for more details.
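A client might turn the codes listed above into a simple success flag plus a readable message. The sketch below is illustrative Python. The names `RESPONSE_MEANINGS` and `interpret_status` are hypothetical, and treating 201 as the only success code is an assumption drawn from the list, which marks it as the default.

```python
# Meanings paraphrased from the response-code list above.
RESPONSE_MEANINGS = {
    201: "created; the submitted or generated id is returned",
    400: "the POST/PUT failed validation",
    403: "access restrictions violated, most likely because of the method used",
    404: "the URI path does not exist or is not in the proper format",
    413: "the payload was larger than the configured maximum",
    500: "general server error; check the server logs",
}


def interpret_status(code):
    """Return (succeeded, meaning) for an HTTP status code."""
    meaning = RESPONSE_MEANINGS.get(code, "status not in the documented list")
    # Assumption: only 201 counts as success.
    return code == 201, meaning


for code in (201, 413, 302):
    print(code, interpret_status(code))
```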

Example Bagheera Configuration (conf/bagheera.properties)

# valid namespaces (whitelist only, comma separated)
valid.namespaces=mynamespace,othernamespace
max.content.length=1048576
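To show how these two settings relate to the request rules, here is a small Python sketch that parses the properties above and applies the whitelist and size checks. It is not Bagheera's actual implementation (the service is a Java project). `parse_properties` and `check_submission` are hypothetical names, and rejecting a payload whose length equals the maximum follows the "must be less than" wording of the request rules.

```python
CONFIG_TEXT = """\
# valid namespaces (whitelist only, comma separated)
valid.namespaces=mynamespace,othernamespace
max.content.length=1048576
"""


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


props = parse_properties(CONFIG_TEXT)
valid_namespaces = {ns.strip() for ns in props["valid.namespaces"].split(",")}
max_content_length = int(props["max.content.length"])


def check_submission(namespace, content_length):
    """Return 'ok' or a short reason the submission would be rejected."""
    if namespace not in valid_namespaces:
        return "namespace not whitelisted"
    # Assumption: the length must be strictly less than the maximum.
    if content_length >= max_content_length:
        return "payload too large"
    return "ok"


print(check_submission("mynamespace", 512))
print(check_submission("unknown", 512))
print(check_submission("mynamespace", 2 * 1048576))
```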

Example Kafka Producer Configuration (conf/kafka.producer.properties)

# comma delimited list of ZK servers
zk.connect=127.0.0.1:2181
# use bagheera message encoder
serializer.class=com.mozilla.bagheera.serializer.BagheeraEncoder
# asynchronous producer
producer.type=async
# compression.codec (0=uncompressed,1=gzip,2=snappy)
compression.codec=2
# batch size (one of many knobs to turn in kafka depending on expected data size and request rate)
batch.size=100

Example Kafka Consumer Configuration (conf/kafka.consumer.properties)

# kafka consumer properties
zk.connect=127.0.0.1:2181
fetch.size=1048576
#serializer.class=com.mozilla.bagheera.serializer.BagheeraDecoder
# bagheera specific kafka consumer properties
consumer.threads=2

Notes on consumers

We currently use the consumers implemented here, but it may also be of interest to look at systems such as Storm to process the messages. Storm contains a Kafka spout (consumer) and there are at least a couple of HBase bolts (processing/sink) already out there.

License

All aspects of this software are distributed under the Apache Software License 2.0. See the LICENSE file for the full license text.

Contributors

  • xstevens
  • mreid-moz
  • rnewman
  • x1b
  • deinspanjer
  • meyarivan
  • anuragphadke
