
realtime-analytics

Infra Setup

Create namespaces

# Create namespaces
kubectl create namespace database
kubectl create namespace processing
kubectl create namespace datastore

Deploy Flink Operator

cd _infra/flink

# Deploy cert-manager
kubectl create -f https://github.com/jetstack/cert-manager/releases/download/v1.8.2/cert-manager.yaml

# Deploy Flink Helm chart
helm repo add flink-operator-repo https://downloads.apache.org/flink/flink-kubernetes-operator-1.7.0
helm install \
      --namespace "processing" \
      --debug \
      --wait=false  \
      "flink" -f values.yaml .

# Submit a testing job
kubectl create -f https://raw.githubusercontent.com/apache/flink-kubernetes-operator/release-1.8/examples/basic.yaml

Deploy MySQL data generator database

# Deploy MySQL helm chart
cd _infra/mysql
helm install \
      --namespace "database" \
      --debug \
      --wait=false  \
      "mysql" -f values.yaml .

# Retrieve the MySQL root password
MYSQL_ROOT_PASSWORD=$(kubectl get secret --namespace database mysql -o jsonpath="{.data.mysql-root-password}" | base64 -d)
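Kubernetes stores secret values base64-encoded, which is why the jsonpath output above is piped through `base64 -d`. A minimal sketch of just the decode step, using a dummy value rather than the real secret:

```shell
# Secrets come back base64-encoded; decode before use.
# 's3cr3t' is a dummy value for illustration only.
encoded=$(printf 's3cr3t' | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"   # s3cr3t
```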

# Port-forward
kubectl port-forward svc/mysql 3306:3306 -n database

Install Apache Pinot

# Deploy Apache Pinot
cd _infra/pinot
helm install \
      --namespace "datastore" \
      --debug \
      --wait=false  \
      "pinot" -f values.yaml .

Install Strimzi Operator and Kafka

# Deploy Strimzi Operator 
cd _infra/strimzi
helm install \
      --namespace "processing" \
      --debug \
      --wait=false  \
      "kafka" -f values.yaml .

# Install Kafka Broker
kubectl apply -f strimzi/kafka/broker.yaml -n processing

# Install KafkaConnect
kubectl apply -f strimzi/kafka/kafka-connect.yaml -n processing

# Install Schema Registry
helm upgrade --install \
      --namespace "processing" \
      --debug \
      --wait=false  \
      "schema-registry" -f values.yaml .

# Add as many KafkaConnectors as you need
kubectl apply -f strimzi/kafka/connectors/mysql-connector.yaml
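The repository's `mysql-connector.yaml` is not shown here; for orientation, a hypothetical sketch of what a Debezium MySQL source connector CR for Strimzi typically looks like. Every name, host, and value below is an assumption, not the repo's actual file:

```yaml
# Hypothetical Debezium MySQL KafkaConnector sketch -- all names/hosts are assumptions
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnector
metadata:
  name: mysql-connector
  namespace: processing
  labels:
    strimzi.io/cluster: kafka-connect   # must match the KafkaConnect CR name
spec:
  class: io.debezium.connector.mysql.MySqlConnector
  tasksMax: 1
  config:
    database.hostname: mysql.database.svc.cluster.local
    database.port: 3306
    database.user: root
    database.server.id: 184054          # any unique numeric id in the MySQL cluster
    database.include.list: retail
    topic.prefix: mysql_retail
```

Credentials are usually referenced from a Secret rather than inlined; consult the Strimzi and Debezium documentation for the exact mechanism.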

Validate Kafka Topics

# Run Kafka CLI commands inside the broker pod
kubectl exec -it stream-kafka-0 -n processing -- /bin/bash
bin/kafka-topics.sh --list --bootstrap-server stream-kafka-bootstrap:9092

# List Topics
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-topics.sh --list --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092
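The raw CDC topics all share the `mysql_` prefix, so they are easy to isolate from the full listing. A small sketch with a hypothetical sample of what `--list` might print for this setup:

```shell
# Hypothetical sample of `kafka-topics.sh --list` output for this setup
topics='enriched_orders
mysql_retail_orders
mysql_retail_customers
__consumer_offsets'

# Keep only the raw CDC topics
printf '%s\n' "$topics" | grep '^mysql_' | sort
```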

# Read Data from Raw Topics
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-console-consumer.sh --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092 --topic mysql_retail_addresses --from-beginning
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-console-consumer.sh --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092 --topic mysql_retail_customers --from-beginning
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-console-consumer.sh --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092 --topic mysql_retail_order_items --from-beginning
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-console-consumer.sh --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092 --topic mysql_retail_orders --from-beginning
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-console-consumer.sh --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092 --topic mysql_retail_products --from-beginning

# Read Data from Enriched Topics
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-console-consumer.sh --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092 --topic enriched_orders --from-beginning

# Delete Topics
kubectl exec -it stream-kafka-0 -n processing -- bin/kafka-topics.sh --delete --bootstrap-server stream-kafka-bootstrap.processing.svc.cluster.local:9092 --topic enriched_orders

# Count topic records: sum of final offsets minus sum of initial offsets
# Final Offsets
kubectl exec -it stream-kafka-0 -n processing -- \
  bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list stream-kafka-bootstrap.processing.svc.cluster.local:9092 \
  --topic enriched_orders \
  --time -1

# Initial Offsets
kubectl exec -it stream-kafka-0 -n processing -- \
  bin/kafka-run-class.sh kafka.tools.GetOffsetShell \
  --broker-list stream-kafka-bootstrap.processing.svc.cluster.local:9092 \
  --topic enriched_orders \
  --time -2
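`GetOffsetShell` prints one `topic:partition:offset` line per partition; the record count is the difference between the summed final (`--time -1`) and initial (`--time -2`) offsets. A minimal sketch of that arithmetic over hypothetical sample output:

```shell
# Hypothetical GetOffsetShell output for a 3-partition topic
final='enriched_orders:0:120
enriched_orders:1:95
enriched_orders:2:85'
initial='enriched_orders:0:0
enriched_orders:1:0
enriched_orders:2:10'

# Sum the offset field (everything after the last colon) of each line
sum_offsets() {
  total=0
  for line in $1; do
    total=$((total + ${line##*:}))
  done
  echo "$total"
}

count=$(( $(sum_offsets "$final") - $(sum_offsets "$initial") ))
echo "$count"   # 290
```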

Contributors

fabianofpena
