CD-Stream

V1.0

CD-Stream is a cross-database replication tool driven by change data capture (CDC). It currently supports replication from MySQL to Postgres.

The Reason Why:

  • Timed data extraction (straightforward ETL) using SELECT queries on a production database can be costly and resource-intensive.
  • Cron jobs might have to be scheduled, and what if they fail too?

What's New?

The current version supports replication from MySQL and loading the data into Postgres. The loading jobs are queued in Redis and processed automatically by rq workers.
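The queue-then-load pattern described above can be illustrated with a minimal sketch. This is not CD-Stream's actual code: the event shape, the function names build_insert and drain, and the use of an in-process queue.Queue as a stand-in for Redis and rq are all assumptions made for illustration.

```python
import queue


def build_insert(table, row):
    """Build a parameterized INSERT statement for one changed row.

    Hypothetical helper: identifiers are double-quoted as-is, so table and
    column names are assumed to come from a trusted source.
    """
    columns = sorted(row)  # stable column order
    col_list = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    sql = f'INSERT INTO "{table}" ({col_list}) VALUES ({placeholders})'
    return sql, tuple(row[c] for c in columns)


def drain(jobs, execute):
    """Worker loop: pull queued events and apply each via execute(sql, params)."""
    applied = 0
    while True:
        try:
            event = jobs.get_nowait()
        except queue.Empty:
            return applied
        sql, params = build_insert(event["table"], event["row"])
        execute(sql, params)
        applied += 1


# Producer side: a change captured from the source database is queued.
jobs = queue.Queue()  # stand-in for the Redis-backed queue
jobs.put({"table": "users", "row": {"id": 1, "name": "ada"}})

# Consumer side: statements are collected here instead of being sent to Postgres.
statements = []
applied = drain(jobs, lambda sql, params: statements.append((sql, params)))
```

Decoupling capture from loading this way lets the reader of the source log keep up even when the target database is slow, since pending work waits in the queue.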

Prerequisite:

Check if binary logging is enabled in your source database. Issue the following command in your source database to verify:

MySQL:

select variable_value as "BINARY LOGGING STATUS (log-bin) :: " from information_schema.global_variables where variable_name='log_bin';

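On MySQL 5.7 and later, where information_schema.global_variables is deprecated or removed, SHOW VARIABLES LIKE 'log_bin'; reports the same setting.

If the above command returns "OFF", add the following lines to the [mysqld] section of the MySQL server configuration under /etc/mysql/mysql.conf.d and restart the MySQL service: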

log_bin                 = mysql-bin
expire_logs_days        = 10
max_binlog_size         = 100M
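As a quick check before restarting, the settings above can be read back from a configuration file's text. The following is a minimal sketch using only the Python standard library; the function name binlog_settings is hypothetical, and the sketch assumes the options live in a [mysqld] section and that the file has no inline comments.

```python
import configparser

BINLOG_OPTIONS = ("log_bin", "expire_logs_days", "max_binlog_size")


def binlog_settings(cnf_text):
    """Return the binary-log options found in the [mysqld] section of cnf_text."""
    # Drop "!include"/"!includedir" directives, which configparser cannot parse.
    lines = [ln for ln in cnf_text.splitlines() if not ln.lstrip().startswith("!")]
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read_string("\n".join(lines))
    if not parser.has_section("mysqld"):
        return {}
    found = {}
    for key, value in parser.items("mysqld"):
        # MySQL accepts dashes and underscores interchangeably in option names.
        name = key.replace("-", "_")
        if name in BINLOG_OPTIONS:
            found[name] = value
    return found


sample = """\
[mysqld]
log_bin                 = mysql-bin
expire_logs_days        = 10
max_binlog_size         = 100M
"""
settings = binlog_settings(sample)
binlog_enabled = "log_bin" in settings
```

This only inspects the file; the running server still has to be restarted and re-queried to confirm that binary logging is on.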

All Set.. Time to Wrangle!!

Safety first - put your hard hats on!

  1. Clone the project and initialize a virtual environment:
$ git clone https://github.com/datawrangl3r/cd-stream.git
$ cd cd-stream
$ python3 -m venv .
$ source bin/activate
$ pip install -r requirements.txt
  2. Configure streamsql.yml and tailor it to your needs:
EXTRACTION:
    ENGINE: mysql
    HOST: localhost
    PORT: 3306
    USER: root
    PASS: password
    DB: SOURCEDB
COMMIT:
    ENGINE: postgres
    HOST: localhost
    PORT: 5432
    USER: postgres
    PASS: password
    DB: TARGETDB
QUEUE:
    ENGINE: REDIS
    HOST: localhost
  3. Initialize rq workers in the background:
$ rq worker &
  4. Start replication and data load (use Supervisor if needed):
$ python main.py &
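
Before starting the replicator, the streamsql.yml layout shown above can be sanity-checked. The sketch below is an illustration, not part of CD-Stream: the names REQUIRED_KEYS, missing_keys, and load_config are hypothetical, the required keys are simply those that appear in the sample file, and load_config assumes PyYAML is installed.

```python
REQUIRED_KEYS = {
    "EXTRACTION": ("ENGINE", "HOST", "PORT", "USER", "PASS", "DB"),
    "COMMIT": ("ENGINE", "HOST", "PORT", "USER", "PASS", "DB"),
    "QUEUE": ("ENGINE", "HOST"),
}


def missing_keys(config):
    """Return a sorted list of 'SECTION.KEY' entries absent from a parsed config."""
    missing = []
    for section, keys in REQUIRED_KEYS.items():
        block = config.get(section) or {}
        for key in keys:
            if key not in block:
                missing.append(f"{section}.{key}")
    return sorted(missing)


def load_config(path="streamsql.yml"):
    """Parse the YAML file into a dict (PyYAML is imported lazily; assumed installed)."""
    import yaml

    with open(path) as fh:
        return yaml.safe_load(fh) or {}


# In-memory example mirroring the sample file, with COMMIT.DB left out on purpose.
example = {
    "EXTRACTION": {"ENGINE": "mysql", "HOST": "localhost", "PORT": 3306,
                   "USER": "root", "PASS": "password", "DB": "SOURCEDB"},
    "COMMIT": {"ENGINE": "postgres", "HOST": "localhost", "PORT": 5432,
               "USER": "postgres", "PASS": "password"},
    "QUEUE": {"ENGINE": "REDIS", "HOST": "localhost"},
}
problems = missing_keys(example)
```

Against a real file, missing_keys(load_config()) returns an empty list when every key from the sample is present.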
