sparta's Introduction

At Stratio, we have implemented several real-time analytics projects based on Apache Spark, Kafka, Flume, Cassandra, ElasticSearch and MongoDB. These technologies were always a perfect fit, but we soon found ourselves writing the same pieces of integration code over and over again. Stratio Sparta is the easiest way to use Apache Spark Streaming and its whole ecosystem. Choose your inputs, operations and outputs, and start extracting insights from your data in real time.

Strata Twitter Analytics with Kibana

Main Features

  • Pure Spark
  • No coding required, only declarative analytical workflows
  • Data continuously streamed in & processed in near real-time
  • Ready to use out-of-the-box
  • Plug & play: flexible workflows (inputs, outputs, transformations, etc…)
  • High performance and Fault Tolerance
  • Scalable and highly available
  • Big Data OLAP on real-time to small data
  • ETLs
  • Triggers over streaming data
  • Spark SQL language with streaming and batch data
  • Kerberos and CAS compatible

Architecture

Send a workflow as JSON to the Sparta API and run your own real-time plugins on a Spark cluster.
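As a rough illustration of what "a workflow as JSON" could look like, the sketch below assembles a declarative description with one input, a list of transformations and several outputs, then serializes it. The field names (`name`, `input`, `transformations`, `outputs`), the `parse-json` transformation and the helper `build_workflow` are assumptions made for this example, not the actual Sparta schema or API.

```python
import json


def build_workflow(name, input_type, transformations, output_types):
    """Assemble a declarative workflow description as a plain dict.

    The layout is hypothetical: it only mirrors the input ->
    transformations -> outputs idea described in this README.
    """
    if not output_types:
        raise ValueError("a workflow needs at least one output")
    return {
        "name": name,
        "input": {"type": input_type},
        "transformations": [{"type": t} for t in transformations],
        "outputs": [{"type": o} for o in output_types],
    }


workflow = build_workflow(
    name="twitter-analytics",
    input_type="Kafka",
    transformations=["parse-json"],  # hypothetical transformation name
    output_types=["MongoDB", "ElasticSearch"],
)

# This string is what a client would send as the request body.
payload = json.dumps(workflow, indent=2)
print(payload)
```

The point of the declarative form is that the whole pipeline is data: it can be stored, versioned and validated before anything runs on the cluster.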

Sparta as a Job Manager

Submit multiple streaming jobs to the Spark cluster and manage them from a simple UI.

Run workflows on Mesos, YARN or Spark Standalone.

Sparta as an SDK

Modular components, extensible through a simple SDK

  • You can extend several points of the platform to fulfill your needs, such as adding new inputs, outputs, operators and transformations.
  • Add new functions to Kite SDK in order to extend the data cleaning, enrichment and normalization capabilities.

Components

Each workflow can define multiple components, but for now all workflows share the same component architecture.

Core components

The Stratio Sparta team has implemented several plugins.

Trigger component

With Sparta it is possible to run queries over streaming data and to execute ETL, aggregations and simple event processing that mix streaming data with batch data in the trigger process.
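The idea of a trigger, a SQL query evaluated on each micro-batch and joined against batch data, can be sketched without a cluster. The example below uses Python's built-in `sqlite3` module purely as a stand-in for Spark SQL; the table names (`stream`, `users`), the threshold and the helper `run_trigger` are assumptions for illustration and do not reflect how Sparta implements triggers.

```python
import sqlite3


def run_trigger(conn, micro_batch, query):
    """Replace the contents of the 'stream' table with one micro-batch
    and evaluate the trigger query against it."""
    conn.execute("DELETE FROM stream")
    conn.executemany(
        "INSERT INTO stream (user_id, amount) VALUES (?, ?)", micro_batch
    )
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")

# Batch (static) reference data.
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, country TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "ES"), (2, "FR")])

# Table that holds the current micro-batch of streaming events.
conn.execute("CREATE TABLE stream (user_id INTEGER, amount REAL)")

# Trigger: per-country totals in this micro-batch that exceed a threshold.
query = """
SELECT u.country, SUM(s.amount) AS total
FROM stream AS s
JOIN users AS u ON u.user_id = s.user_id
GROUP BY u.country
HAVING SUM(s.amount) > 100
ORDER BY u.country
"""

batch = [(1, 80.0), (1, 50.0), (2, 30.0)]
alerts = run_trigger(conn, batch, query)
print(alerts)
```

Only the country whose total in the micro-batch passes the threshold is returned, which is the "simple event processing" pattern: the query itself decides what is worth emitting.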

Aggregation component

The aggregation process in Sparta is very powerful because it can generate efficient OLAP processes over streaming data.

Advanced features have been implemented to optimize stateful operations over Spark Streaming.
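To make "OLAP over streaming data" more concrete, here is a minimal sketch of a stateful cube: each micro-batch is folded into running totals for every subset of the chosen dimensions, so roll-ups are available at any time without rescanning old events. It is plain Python, not Spark, and the event fields (`country`, `device`, `clicks`) and the helper `update_cube` are assumptions for this example rather than Sparta's implementation.

```python
from collections import defaultdict
from itertools import combinations


def update_cube(state, micro_batch, dimensions, measure):
    """Fold one micro-batch into running totals, keyed by
    (dimension names, dimension values) for every subset of dimensions."""
    for event in micro_batch:
        for size in range(len(dimensions) + 1):
            for dims in combinations(dimensions, size):
                key = (dims, tuple(event[d] for d in dims))
                state[key] += event[measure]
    return state


# State that survives across micro-batches.
state = defaultdict(float)
dimensions = ("country", "device")

batch_1 = [
    {"country": "ES", "device": "mobile", "clicks": 3},
    {"country": "ES", "device": "web", "clicks": 2},
]
batch_2 = [{"country": "FR", "device": "mobile", "clicks": 5}]

for batch in (batch_1, batch_2):
    update_cube(state, batch, dimensions, "clicks")

print(state[((), ())])                       # grand total
print(state[(("country",), ("ES",))])        # roll-up by country
print(state[(("device",), ("mobile",))])     # roll-up by device
```

The number of keys grows with the number of dimension subsets, which is one reason a real streaming engine needs optimized stateful operations for this kind of workload.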

Inputs

  • Twitter
  • Kafka
  • Flume
  • RabbitMQ
  • Socket
  • WebSocket
  • HDFS/S3

Outputs

  • MongoDB
  • Cassandra
  • ElasticSearch
  • Redis
  • JDBC
  • CSV
  • Parquet
  • Http
  • Kafka
  • HDFS/S3
  • Http Rest
  • Avro

Key technologies

Advantages

Sparta provides several advantages to end users.

Build

You can generate rpm and deb packages by running:

mvn clean package -Ppackage

Note: the following programs must be installed in order to build these packages:

On a Debian distribution:

  • fakeroot
  • dpkg-dev
  • rpm

On a CentOS distribution:

  • fakeroot
  • dpkg-dev
  • rpmdevtools

Documentation

License

Licensed to STRATIO (C) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The STRATIO (C) licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

sparta's People

Contributors

compae, emgaitan-stratio, alexrchies, mariostratio, aalfonso-stratio, danielcsant, anistal, gschiavon, dcarroza-stratio, sgomezg, ajnavarro, witokondoria, smola, becaresss, eambrosio, mtelloz, ahvargas, gasparms, stratiocommit, fjsc, roclas, gjimenez-stratio, tomasperezv, alvsanand, dvallejo, zhongl, pedrogutierrezstratio
