
kafka-iot-data-processor

Kafka project
Architecture

  • A pipeline architecture consisting of well-known open source tools to integrate Internet of Things (IoT) data streams.
  • The code demonstrates how to handle IoT stream data generated by devices equipped with sensing components (IoT devices), using open source tools and frameworks for data integration and scalable stream processing.
  • The architecture chains several tools into a pipeline. Apache Kafka, a suitable intermediary for data integration, is fed by a scheduled task that sends out a value every second, mimicking an IoT device, as the entry point for published messages.
  • These messages are forwarded to a Kafka broker running locally to enable fault-tolerant stream processing and parallel topic consumption.
  • The Kafka consumer is the service responsible for reading messages and processing them according to your business logic; in our case, storing them in the database.
  • For big data processing, Hadoop is a framework that can process large data sets across clusters, Spark is a unified analytics engine for large-scale data processing, and Apache Storm is also available for big data analytics; these can be combined, but they are outside the scope of this project.
  • For data storage, Spring Data JPA with an in-memory H2 database is used; the consumer persists the messages it reads from the Kafka broker. The produced IoT data is thus stored in the DB for further processing.
  • The user can query the readings of specific sensor groups.
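The shape of this pipeline can be sketched in plain Java. This is an illustration only, not the project's actual Spring Kafka code: a `BlockingQueue` stands in for a Kafka topic and a `List` stands in for the H2 database.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Minimal plain-Java sketch of the pipeline: a scheduled "device" publishes
 * readings to an intermediary, and a consumer drains them into a store.
 */
public class PipelineSketch {

    /** Produce n readings through the "topic" and collect them in the "DB". */
    static List<Integer> run(int n) {
        BlockingQueue<Integer> topic = new ArrayBlockingQueue<>(16);        // stand-in for a Kafka topic
        List<Integer> db = Collections.synchronizedList(new ArrayList<>()); // stand-in for the H2 store

        // Producer: mimics a scheduled IoT device emitting one value per tick.
        Thread producer = new Thread(() -> {
            for (int reading = 1; reading <= n; reading++) {
                try { topic.put(reading); } catch (InterruptedException e) { return; }
            }
        });

        // Consumer: reads each published message and persists it.
        Thread consumer = new Thread(() -> {
            for (int i = 0; i < n; i++) {
                try { db.add(topic.take()); } catch (InterruptedException e) { return; }
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return db;
    }

    public static void main(String[] args) {
        System.out.println(run(5)); // all five readings reach the store, in order
    }
}
```

In the real project, Kafka replaces the in-process queue, which is what buys fault tolerance and parallel topic consumption across processes.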

Tech stack

  • Apache Kafka 2+
  • Java 11
  • Spring Boot 2+
  • Spring security
  • JPA
  • In Memory DB
  • Maven build

Running Instructions Locally

Prerequisites:

  • Apache Kafka installed locally (the steps below assume C:\kafka)
  • Java 11 and Maven

Running the project

  • From a command prompt, go to C:\kafka.
  • First, start ZooKeeper: .\bin\windows\zookeeper-server-start.bat .\config\zookeeper.properties
  • Next, start the Kafka server: .\bin\windows\kafka-server-start.bat .\config\server.properties
  • If the Kafka server fails to start, open config/server.properties and add listeners=PLAINTEXT://localhost:9092; this is what the broker uses to create server sockets.
  • If you see an error such as org.apache.kafka.clients.NetworkClient connection failures, add advertised.listeners=PLAINTEXT://localhost:9092 to config/server.properties; this is what clients use to connect to the broker.
  • Next, run the Spring Boot project from an IDE.
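The two broker settings mentioned above can be added together; a minimal config/server.properties fragment, assuming a single local broker on the default port:

```properties
# Socket the broker binds for incoming connections
listeners=PLAINTEXT://localhost:9092
# Address the broker advertises to producers and consumers
advertised.listeners=PLAINTEXT://localhost:9092
```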

Approach

  • An API call to the '/start' endpoint triggers scheduling, which enables 3 IoT devices to send out a value every second.
  • The IoT devices keep sending out a value every second until the '/stop' API is called.
  • The IoT device data is processed in parallel and stored in the DB.
  • The user can query readings (e.g. average/median/max/min values) of specific IoT sensors.
  • The endpoint for querying readings is secured with basic authentication and requires passing credentials to get readings data.
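The aggregation step behind the query endpoint can be sketched in plain Java. This is a hypothetical illustration: the enum values match the API documentation later in this README, but the `query` helper is not taken from the project source, and rounding the average to an int is an assumption based on the documented response type.

```java
import java.util.List;

/** Illustrative sketch of the aggregation behind the /iotdata query. */
public class IotQuery {

    enum QueryType { AVERAGE, MAX, MIN }

    /** Apply the requested aggregation to a device's stored readings. */
    static int query(List<Integer> readings, QueryType type) {
        switch (type) {
            case MAX:
                return readings.stream().mapToInt(Integer::intValue).max().orElseThrow();
            case MIN:
                return readings.stream().mapToInt(Integer::intValue).min().orElseThrow();
            default: // AVERAGE, rounded to an int (assumed from the documented response)
                return (int) Math.round(
                        readings.stream().mapToInt(Integer::intValue).average().orElseThrow());
        }
    }

    public static void main(String[] args) {
        List<Integer> heartReadings = List.of(62, 75, 88); // mock HEART_METER values
        System.out.println(query(heartReadings, QueryType.MIN));     // 62
        System.out.println(query(heartReadings, QueryType.AVERAGE)); // 75
    }
}
```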

Limitations

  • At this moment, the user cannot query the readings (e.g. average/median/max/min values) of specific sensors or groups of sensors for a specific timeframe.
  • Additional features can be added.

API Documentation

Base URL: http://localhost:8080/processor
Operations:

| No | Operation | Endpoint | Method |
|----|-----------|----------|--------|
| 1 | Start scheduling | /start | POST |
| 2 | Stop scheduling | /stop | POST |
| 3 | Get query readings | /iotdata (protected) | GET |

1. start scheduling

  • URI: /start
  • Method: POST

Request Body : None
Response Body : Started Scheduling

2. stop scheduling

  • URI: /stop
  • Method: POST

Request Body : None
Response Body : Stopped Scheduling

3. get query readings (protected: Requires authentication)

  • URI: /iotdata
  • Method: GET
  • Authentication type: Basic
  • username: admin
  • password: password

Request Body
| Attribute | Type | Validation | Required |
|-----------|------|------------|----------|
| deviceType | ENUM | THERMOSTAT_METER / HEART_METER / CARFUEL_METER | yes |
| queryType | ENUM | AVERAGE / MAX / MIN | yes |

{
    "deviceType": "HEART_METER",
    "queryType": "MIN"
}

Response

| Attribute | Type |
|-----------|------|
| deviceType | ENUM |
| queryType | ENUM |
| queryValue | int |

{
    "queryType": "MIN",
    "deviceType": "HEART_METER",
    "queryValue": 62
}
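A client call to this protected endpoint might be built as follows with the JDK 11 HttpClient. This is a sketch assuming the service runs locally with the documented demo credentials; note the documented operation sends a JSON body with GET, which `HttpRequest.newBuilder().method(...)` permits.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Base64;

/** Sketch of building a basic-auth GET request for /iotdata. */
public class IotDataRequest {

    static HttpRequest build() {
        // Documented demo credentials (admin/password), encoded for Basic auth.
        String credentials = Base64.getEncoder()
                .encodeToString("admin:password".getBytes());
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8080/processor/iotdata"))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/json")
                // GET with a JSON body, as the API documentation specifies.
                .method("GET", HttpRequest.BodyPublishers.ofString(
                        "{\"deviceType\":\"HEART_METER\",\"queryType\":\"MIN\"}"))
                .build();
    }

    public static void main(String[] args) {
        System.out.println(build().uri());
    }
}
```

Actually sending the request (with `HttpClient.newHttpClient().send(...)`) requires the service to be running.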

Notes:

  • The project is a prototype for demo purposes.
  • It requires further standardization.

Useful commands:

  • Create topics
    .\bin\windows\kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic THERMOSTAT_METER

.\bin\windows\kafka-topics.bat --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic HEART_METER

  • List the topics
    .\bin\windows\kafka-topics.bat --list --zookeeper localhost:2181

  • Produce data to a topic
    .\bin\windows\kafka-console-producer.bat --broker-list localhost:9092 --topic THERMOSTAT_METER

  • Consume data from a topic
    .\bin\windows\kafka-console-consumer.bat --bootstrap-server localhost:9092 --topic THERMOSTAT_METER --from-beginning

Contributors

  • vishnuvuyyur1
