MySQL Replicator

Replicates data changes from the MySQL binlog to HBase or Kafka. When replicating to HBase, it preserves the previous versions of the data. HBase storage is intended for auditing and analysis of historical data. In addition, special daily-changes tables can be maintained in HBase, which are convenient for fast and cheap imports from HBase to Hive. Replication to Kafka is intended for easy real-time access to a stream of data changes.

Documentation

This readme file provides some basic documentation on how to get started. For more details, refer to the official documentation at mysql-time-machine.

Building required Docker images

  1. Run mvn package -P uberjar from the root of the replicator repository to build the MySQL Replicator jar that will be used later;
  2. Rename the built jar to mysql-replicator.jar and copy it to the images/002_replicator_runner/input/replicator/ directory inside the docker repository;
  3. Run the container_build.sh script from the images/002_replicator_runner/ directory inside the docker repository;
  4. Run docker images to verify that the replicator-runner image has been built successfully (docker ps lists running containers, not images).
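
The build steps can be sketched as a small shell function. This is only an illustration under stated assumptions: the REPLICATOR_REPO, DOCKER_REPO, BUILT_JAR and DRY_RUN variables and the run and build_runner_image helpers are hypothetical names, and the default checkout locations are guesses. The name of the jar produced by Maven depends on the project version, so it has to be supplied through BUILT_JAR.

```shell
# Hypothetical locations; adjust to your own checkouts.
REPLICATOR_REPO="${REPLICATOR_REPO:-$HOME/replicator}"  # replicator repository
DOCKER_REPO="${DOCKER_REPO:-$HOME/docker}"              # docker repository
BUILT_JAR="${BUILT_JAR:-}"   # absolute path of the jar produced by Maven
DRY_RUN="${DRY_RUN:-0}"      # set to 1 to print commands without running them

# Print a command, then execute it unless DRY_RUN is 1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "1" ]; then return 0; fi
  "$@"
}

# Runs in a subshell so the caller's working directory is left untouched.
build_runner_image() (
  runner_dir="$DOCKER_REPO/images/002_replicator_runner"
  [ -n "$BUILT_JAR" ] || { echo "BUILT_JAR is not set" >&2; exit 1; }
  run cd "$REPLICATOR_REPO" &&
    run mvn package -P uberjar &&
    run cp "$BUILT_JAR" "$runner_dir/input/replicator/mysql-replicator.jar" &&
    run cd "$runner_dir" &&
    run ./container_build.sh &&
    run docker images   # lists local images; replicator-runner should appear
)
```

In bash, calling DRY_RUN=1 BUILT_JAR=/path/to/built.jar build_runner_image prints the commands without executing them, which is a cheap way to check the paths first.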

Getting Started with MySQL Replicator

Replicator assumes that there is a pre-installed environment in which it can run. This environment consists of:

  • MySQL Instance
  • Zookeeper Instance
  • Graphite Instance
  • Target Store Instance (Kafka, HBase, or none in case of STDOUT)

The easiest way to test-drive the replicator is to use docker to create the required environment locally. In addition to docker, you will need docker-compose installed locally.

git clone https://github.com/mysql-time-machine/docker.git
cd docker/docker-compose/replicator_kafka

Start all containers (mysql, kafka, graphite, replicator, zookeeper)

  ./run_all

Now, in another terminal, you can connect to the replicator container

 ./attach_to_replicator
 cd /replicator

This folder contains the replicator jar, the replicator configuration file, the log configuration and some utility scripts. Now we can insert some random data into MySQL:

 ./random_mysql_ops
 ...
 ('TwIPn','4216871','313785','NIrnXGEpqJI gGDstvhs'),
 ('AwqgI','4831311','930233','IHwkTOuEnOqGdEWNzJtq'),
 ('WIJCB','1516599','487420','rPnOHfZlIvEEvFFEIGiW'),
 ...

This data has been inserted into the pre-created table 'sometable' in the pre-created database 'test'. The provided MySQL instance is configured to use row-based replication (RBR), and binlogs are active.

  mysql --host=mysql --user=root --pass=mysqlPass
  
  mysql> use test;
  mysql> show tables;
  +----------------+
  | Tables_in_test |
  +----------------+
  | sometable      |
  +----------------+
  1 row in set (0.00 sec)
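
One way to confirm that RBR and binary logging are active is to inspect the log_bin and binlog_format server variables. The sketch below is an illustration, not part of the provided utility scripts: binlog_ready is a hypothetical helper that reads tab-separated "name value" lines, as printed by the mysql client in batch mode, and succeeds only when log_bin is ON and binlog_format is ROW.

```shell
# Hypothetical helper: reads "name<TAB>value" lines on stdin and exits 0
# only if log_bin=ON and binlog_format=ROW were both seen.
binlog_ready() {
  awk -F'\t' '
    $1 == "log_bin"       && $2 == "ON"  { bin = 1 }
    $1 == "binlog_format" && $2 == "ROW" { fmt = 1 }
    END { exit !(bin && fmt) }
  '
}

# Example use against the MySQL container from this setup:
#   mysql --host=mysql --user=root --password=mysqlPass \
#         --batch --skip-column-names \
#         -e "SHOW VARIABLES WHERE Variable_name IN ('log_bin', 'binlog_format')" \
#     | binlog_ready && echo "binlog is enabled and row-based"
```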

Now we can replicate the binlog content to Kafka.

 ./run_kafka

And read the data from Kafka

 ./read_kafka

In this example we wrote rows to MySQL, then replicated the binlogs to Kafka, and then read from Kafka, all sequentially. However, these processes can run in parallel, as they would in a real-life setup.
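
A minimal sketch of running the same steps concurrently is shown here. It assumes it is run from /replicator inside the replicator container, where the three utility scripts live; replicate_in_parallel is a hypothetical name. Whether each script exits on its own is not documented here, so the background replicator is stopped explicitly at the end.

```shell
# Hypothetical helper: replicate in the background while rows are being
# written, then read the replicated changes from Kafka.
replicate_in_parallel() {
  ./run_kafka &                  # start replication in the background
  replicator_pid=$!
  ./random_mysql_ops             # write rows while the replicator runs
  ./read_kafka                   # consume the replicated changes
  # Stop the background replicator; ignore errors if it already exited.
  kill "$replicator_pid" 2>/dev/null || true
  wait "$replicator_pid" 2>/dev/null || true
}
```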

While the replication is running, you can observe the replication statistics on the Graphite dashboard: http://localhost/dashboard/

PACKAGING

Packaging the project is really simple with Maven

mvn clean package

This generates a light jar containing only the project itself. You can use it as a dependency in your own projects directly, provided that you use Maven to resolve all the transitive dependencies.

If you need an uberjar, so that you don't have to worry about dependencies, run:

mvn clean package -P uberjar

The generated jar will contain all the dependencies included in a single jar file.

DEPLOY

To deploy a new version to Maven Central, run:

mvn clean deploy -P release

If the previous step didn't work, it is probably because you don't have a Sonatype account or a published GPG key. Follow these steps:

  1. Create a Sonatype Account
  2. Create a PGP Signature

You should now be able to deploy the project.

AUTHOR

Bosko Devetak [email protected]

CONTRIBUTORS

Carlos Tasada [ctasada]

Dmitrii Tcyganov [dtcyganov]

Evgeny Dmitriev [dmitrieveu]

Greg Franklin [gregf1]

Islam Hassan [ishassan]

Mikhail Dutikov [mikhaildutikov]

Muhammad Abbady [muhammad-abbady]

Philippe Bruhat (BooK) [book]

Pavel Salimov [chcat]

Pedro Silva [pedros]

Raynald Chung [raynald]

Rares Mirica [mrares]

ACKNOWLEDGMENT

Replicator was originally developed for Booking.com. With approval from Booking.com, the code and specification were generalized and published as Open Source on GitHub, for which the author would like to express his gratitude.

COPYRIGHT AND LICENSE

Copyright (C) 2015, 2016, 2017, 2018 by Author and Contributors

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
