Git Product home page Git Product logo

streaming-pipeline-on-gcp's Introduction

PubSub Streaming Architecture

image

publish.py

publish.py script publishes messages to a Google Cloud Pub/Sub topic

  • This code publishes messages to a Google Cloud Pub/Sub topic.
  • It imports necessary modules, sets up project and topic information, and authenticates using a service account.
  • It reads input data from a file and publishes each line of the file as a message to the Pub/Sub topic.
  • The publishing process is controlled by a loop, and a 1-second delay is added between each message publication.

process.py

process.py script uses Apache Beam to process data from a Google Cloud Pub/Sub subscription and write the processed data to another Pub/Sub topic. Then comes the Beam pipeline, it is reading data from PubSub using input_subscription and writing it to output_topic

  • This code uses Apache Beam to process data from a Google Cloud Pub/Sub subscription and write the processed data to another Pub/Sub topic.
  • It imports necessary modules from Apache Beam, sets up pipeline options and authentication using a service account.
  • It defines a pipeline that reads data from a Pub/Sub subscription, applies any necessary transformations, and writes the processed data to a Pub/Sub topic.
  • The pipeline is executed, and the code waits until the pipeline execution is finished.

subscribe.py

subscribe.py script creates a Google Cloud Pub/Sub subscriber and receives messages from a specified subscription

  • This code creates a Pub/Sub subscriber that continuously listens for messages from a specified subscription and processes them when received.
  • It imports necessary modules, sets up authentication using a service account, and specifies the subscription to listen to.
  • It defines a callback function that is called whenever a message is received. In this case, the function simply prints the received message.
  • The subscriber is set up to continuously listen for messages from the subscription using the callback function.
  • The script enters an infinite loop with a 60-second delay between iterations to keep the subscriber listening for new messages.

Summary

In summary, the project involves publishing messages to a Pub/Sub topic, processing those messages using Apache Beam, and subscribing to a different topic to consume and perform actions on the processed messages.

streaming-pipeline-on-gcp's People

Contributors

janaom avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.