Kafka Data Processing with Go

Welcome to the Kafka Data Processing with Go project! This project showcases how to use Apache Kafka in combination with the Go programming language to build a data processing application. By following this example, you'll learn how to produce and consume data using Kafka topics, allowing you to develop scalable and efficient data processing pipelines.

In this project, we'll create a data processing application using the Go programming language and Apache Kafka. Imagine you're building a system that processes user activity data from a website and performs real-time analytics on it. Kafka will serve as the backbone for data streaming, enabling the efficient transfer of data between different components.

Prerequisites

Before you begin, make sure you have the following prerequisites:

  • Go programming language (Installation guide: Getting Started with Go)
  • Apache Kafka (Installation guide: Kafka Quickstart)
  • Git

Setup

Clone this repository:

git clone https://github.com/basemax/KafkaDataProcessingGo.git
cd KafkaDataProcessingGo

Start the Kafka server and create the necessary topics (assuming you've already installed Kafka):

# Start the ZooKeeper server (if not already started)
bin/zookeeper-server-start.sh config/zookeeper.properties

# Start the Kafka server
bin/kafka-server-start.sh config/server.properties

# Create the required topics
bin/kafka-topics.sh --create --topic activities --bootstrap-server localhost:9092 --partitions 3 --replication-factor 1
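
If you prefer to create the topic from Go rather than the CLI, here is a rough sketch using the segmentio/kafka-go client (an assumption for illustration; the project itself does not necessarily depend on this library):

package main

import (
	"log"
	"net"
	"strconv"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Dial any broker, then connect to the cluster controller,
	// which is the broker responsible for topic creation.
	conn, err := kafka.Dial("tcp", "localhost:9092")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	controller, err := conn.Controller()
	if err != nil {
		log.Fatal(err)
	}

	ctrlConn, err := kafka.Dial("tcp", net.JoinHostPort(controller.Host, strconv.Itoa(controller.Port)))
	if err != nil {
		log.Fatal(err)
	}
	defer ctrlConn.Close()

	err = ctrlConn.CreateTopics(kafka.TopicConfig{
		Topic:             "activities",
		NumPartitions:     3,
		ReplicationFactor: 1,
	})
	if err != nil {
		log.Fatal(err)
	}
}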

Start the Prometheus server with a scrape configuration like this:

scrape_configs:
  - job_name: KafkaProcessing
    metrics_path: /metrics
    honor_labels: false
    honor_timestamps: true
    scheme: http
    scrape_interval: 1s
    follow_redirects: true
    body_size_limit: 0
    sample_limit: 0
    label_limit: 0
    label_name_length_limit: 0
    label_value_length_limit: 0
    target_limit: 0
    static_configs:
      - targets:
          - "HOSTNAME:8000"

You can optionally use Grafana; its configuration files are in the grafana/ directory.

Usage

This project consists of two main components: Producer and Consumer.

Run via Docker

Docker setup

You can adjust the configuration in the .env file.

Start app

Start the application and its dependencies using Docker Compose:

make run

or

docker-compose up -d

You can check the following endpoints to verify that the system is healthy:

  • localhost:8000/metrics - Consumer application
  • localhost:29092 - Kafka server
  • localhost:9090 - Prometheus server
  • localhost:3000 - Grafana server

Config Grafana

Import the data source configuration using the following command:

make grafana_import_ds

Import the Grafana dashboards manually: copy the JSON configurations from the grafana/dashboards/ directory and import them into Grafana.

NOTE: If you need to build the application image multiple times, run go mod vendor to keep the dependencies inside the repository; this speeds up subsequent builds.

Producer

The producer generates mock user activity data and sends it to the Kafka topic. To run the producer:

go run . producer

Use the -f option to introduce an artificial delay when publishing Kafka messages.

The producer will continuously generate and send user activity data to the Kafka topic.
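
For orientation, here is a minimal sketch of what publishing one activity event could look like with the segmentio/kafka-go client; the repository's actual producer may use a different client and message format:

package main

import (
	"context"
	"encoding/json"
	"log"
	"time"

	"github.com/segmentio/kafka-go"
)

// UserActivity mirrors the sample dataset schema shown later in this README.
type UserActivity struct {
	UserID    string    `json:"user_id"`
	Timestamp time.Time `json:"timestamp"`
	Action    string    `json:"action"`
	ProductID string    `json:"product_id"`
}

func main() {
	w := &kafka.Writer{
		Addr:  kafka.TCP("localhost:9092"),
		Topic: "activities",
	}
	defer w.Close()

	payload, err := json.Marshal(UserActivity{
		UserID:    "user123",
		Timestamp: time.Now().UTC(),
		Action:    "view",
		ProductID: "prod456",
	})
	if err != nil {
		log.Fatal(err)
	}

	if err := w.WriteMessages(context.Background(), kafka.Message{Value: payload}); err != nil {
		log.Fatal("failed to write message: ", err)
	}
}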

Consumer

The consumer subscribes to the Kafka topic, processes the user activity data, and performs analytics. To run the consumer:

go run . consumer

The consumer will listen for incoming user activity data and process it accordingly.
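
As another hedged sketch with the segmentio/kafka-go client (the real consumer may differ), subscribing to the activities topic could look like this; the consumer-group name is made up for illustration:

package main

import (
	"context"
	"encoding/json"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"localhost:9092"},
		Topic:   "activities",
		GroupID: "analytics", // hypothetical consumer-group name
	})
	defer r.Close()

	for {
		msg, err := r.ReadMessage(context.Background())
		if err != nil {
			log.Fatal("failed to read message: ", err)
		}

		var activity map[string]interface{}
		if err := json.Unmarshal(msg.Value, &activity); err != nil {
			log.Println("skipping malformed message:", err)
			continue
		}
		log.Printf("user %v performed %v on product %v",
			activity["user_id"], activity["action"], activity["product_id"])
	}
}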

Faker

The faker generates a sample dataset for the producer. To run the faker:

go run . faker

Sample Dataset

This application deals with tracking user activities on an e-commerce website. Here's a simple example of a JSON-based user activity dataset:

[
  {
    "user_id": "user123",
    "timestamp": "2023-08-18T10:00:00Z",
    "action": "view",
    "product_id": "prod456"
  },
  {
    "user_id": "user456",
    "timestamp": "2023-08-18T11:30:00Z",
    "action": "add_to_cart",
    "product_id": "prod123"
  },
  {
    "user_id": "user789",
    "timestamp": "2023-08-18T12:15:00Z",
    "action": "purchase",
    "product_id": "prod789"
  },
  // More entries...
]

You can create a sample dataset file like sample_data.json in the root directory of your project with multiple such entries.

Please note that this is just a basic representation, and you can extend it with additional fields and more complex data as needed for your application.

To generate a larger dataset, you can combine a library like Faker (for realistic fake data) with Go's built-in JSON handling. Here's a rough example:

package main

import (
	"encoding/json"
	"fmt"
	"math/rand"
	"os"
	"time"

	"github.com/bxcodec/faker/v3"
)

type UserActivity struct {
	UserID    string    `json:"user_id"`
	Timestamp time.Time `json:"timestamp"`
	Action    string    `json:"action"`
	ProductID string    `json:"product_id"`
}

func main() {
	actions := []string{"view", "add_to_cart", "purchase"}
	activities := make([]UserActivity, 0, 1000)

	for i := 0; i < 1000; i++ {
		activities = append(activities, UserActivity{
			UserID:    faker.UUIDHyphenated(),
			Timestamp: time.Unix(faker.UnixTime(), 0).UTC(),
			Action:    actions[rand.Intn(len(actions))], // pick a random action
			ProductID: faker.UUIDHyphenated(),
		})
	}

	file, err := os.Create("sample_data.json")
	if err != nil {
		fmt.Println("Error creating file:", err)
		return
	}
	defer file.Close()

	encoder := json.NewEncoder(file)
	encoder.SetIndent("", "  ")
	if err := encoder.Encode(activities); err != nil {
		fmt.Println("Error encoding JSON:", err)
		return
	}

	fmt.Println("Sample data generated and saved to sample_data.json")
}

Remember that this is just a basic example to get you started. Depending on your application's needs, you might want to generate more complex data with a wider range of possible actions, timestamps, and user profiles.

Contributing

Contributions are welcome! If you encounter any issues or want to add new features, feel free to open a pull request. For significant changes, please open an issue first to discuss your proposed changes.

License

This project is licensed under the GPL-3.0 License - see the LICENSE file for details.

Copyright 2023, Max Base
