Git Product home page Git Product logo

osalvador / replicadb Goto Github PK

View Code? Open in Web Editor NEW
381.0 25.0 95.0 38.42 MB

ReplicaDB is open source tool for database replication, designed for efficiently transferring bulk data between relational and non-relational databases

Home Page: https://osalvador.github.io/ReplicaDB/

License: Apache License 2.0

Java 98.83% Shell 0.36% Batchfile 0.22% Dockerfile 0.06% PLSQL 0.53%
java fast nosql-databases elt database-replication nosql cross-platform postgresql sql database

replicadb's Introduction

Dasper

Dasper is a Jekyll theme inspired by Ghost's default theme Casper and also Jasper & Kasper

Live demo

ozgrozer.com

Includes

  • Pagination
  • 404 page
  • Fastclick(@ftlabs)
  • Syntax highlighting(@PrismJS)
  • Sitemap generator(@kinnetica)
  • RSS generator(@agelber)
  • HTML compressor(a simple plugin written by me)
  • Addthis sharing buttons
  • Disquss comments
  • Google analytics tracking

Copyright & License

Copyright (c) 2016 - Released under the MIT License.

replicadb's People

Contributors

alexandra-zaharia avatar dependabot[bot] avatar francescotekapp avatar francescozanti avatar ilkerhalil avatar jako81624 avatar osalvador avatar qrollin avatar stalincalderon avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

replicadb's Issues

sync multiple tables

suggest:
--source-table support multiple tables separate with comma
--source-table support * or regular expressions like [a-z].*
--source-table support another option file, which contains table names separate with comma like table1,table2,table3

Oracle to postgres data sync issue

We are trying to do a data dump from oracle to postgres. On some larger tables(>120GB) we are seeing some errors.
ERROR ReplicaTask:72 ERROR in TaskId-0 inserting data to sink table: ORA-01555: snapshot too old: rollback segment number 20 with name "_SYSSMU20_231399234$" too small
We tried a few different things like, do it when the data is not changing constantly, smaller fetch size, less number of jobs running but, nothing seemed to work so far.
Did you ever see anything similar happening?

Transfer Speed

Hello.

Is this a bug or it's the way it works.

I have setup ReplicaDB to transfer data between Oracle -> MySQL.

In the configuration file i have

jobs=1

for the fetch.size the following things occur:

When i setup

fetch.size=100 -> Transfer completes in 1 minute
fetch.size=1000 -> Transfer completes in 3 minutes
fetch.size=10000 -> Transfer completes in 26 minutes

Isn't supposed to by having bigger fetch.size, the time to transfer data gets smaller and not bigger like this?

Strange Oracle Date Bug

Describe the bug
Hello. I found a strange bug while trying to replicate a table that has a DATE column with specific values.
I tried it on two different oracle databases, versions. 9 & 11.

To Reproduce

Source table DDL:

CREATE TABLE MIKE_TEST_1
(
  NUM      NUMBER(20)                           NOT NULL,
  TRXTIME  DATE
)
TABLESPACE USERS
PCTUSED    0
PCTFREE    10
INITRANS   1
MAXTRANS   255
STORAGE    (
            INITIAL          128K
            NEXT             128K
            MINEXTENTS       1
            MAXEXTENTS       UNLIMITED
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOLOGGING 
NOCOMPRESS 
NOCACHE
NOPARALLEL
MONITORING;

Sink table DDL:

CREATE TABLE MIKE_TEST_2
(
  NUM      NUMBER(20)                           NOT NULL,
  TRXTIME  DATE
)
TABLESPACE USERS
PCTUSED    0
PCTFREE    10
INITRANS   1
MAXTRANS   255
STORAGE    (
            INITIAL          128K
            NEXT             128K
            MINEXTENTS       1
            MAXEXTENTS       UNLIMITED
            PCTINCREASE      0
            BUFFER_POOL      DEFAULT
           )
NOLOGGING 
NOCOMPRESS 
NOCACHE
NOPARALLEL
MONITORING;

ReplicaDB configuration options-file. :

######################## ReplicadB General Options ########################
mode=complete
jobs=1
fetch.size=100
verbose=true
############################# Soruce Options ##############################
source.connect=jdbc:oracle:thin:@192.168.1.10:1521:mike
source.user=username
source.password=password
source.query=SELECT * FROM MIKE_TEST_1
############################# Sink Options ################################
sink.connect=jdbc:oracle:thin:@192.168.1.10:1521:mike
sink.user=username
sink.password=password
sink.table=MIKE_TEST_2
sink.disable.truncate=false

Insert this on source table MIKE_TEST_1:

INSERT INTO MIKE_TEST_1 (NUM, TRXTIME)
VALUES (1, TO_DATE('1-1-1900 01:40:00', 'DD/MM/YYYY HH:MI:ss'));

Run the replicadb with the above config.

Check sink table MIKE_TEST_2, instead of having values

NUM	TRXTIME
1	1-1-1900 01:40:00

The values are

NUM	TRXTIME
1	1-1-1900 02:05:08

Expected behavior
Values at table MIKE_TEST_2 must be

NUM	TRXTIME
1	1-1-1900 01:40:00

Additional context

The strange thing is that if you insert at MIKE_TEST_1

INSERT INTO MIKE_TEST_1 (NUM, TRXTIME)
VALUES (1, TO_DATE('1-1-1901 01:40:00', 'DD/MM/YYYY HH:MI:ss'));

The values at MIKE_TEST_2 sink table after running replicadb are ok

MIKE_TEST_1

NUM	TRXTIME
1	1-1-1901 01:40:00

MIKE_TEST_2

NUM	TRXTIME
1	1-1-1901 01:40:00

So far i traced the following scenarios

Source Table: MIKE_TEST_1

NUM	TRXTIME
1	1-1-1900 01:45:00
2	1-1-1900 01:40:00
3	1-1-1900 01:38:00
4	1-1-1900 01:35:00
5	1-1-1900 01:30:00
6	1-1-1901 01:40:00

Sink Table: MIKE_TEST_2

NUM	TRXTIME
1	1-1-1900 02:10:08
2	1-1-1900 02:05:08
3	1-1-1900 02:03:08
4	1-1-1900 02:00:08
5	1-1-1900 01:30:00
6	1-1-1901 01:40:00

Oracle Replication Fails

Hi ,I am trying to replicate Oracle DB (source and destination ) both Oracle .Replication fails with below error .Not sure if this has something to do with class path ?

[root@B-Wallet-Lab ~]# replicadb --mode=complete -j=1 --source-connect=jdbc:oracle:thin:@10.0.0.137:1521:BPLORA --source-user=btapp --source-password=btapp --source-table=POSTPAID_MAST --sink-connect=jdbc:oracle:thin:@10.0.0.42:1521:BPLORA --source-user=mfs --source-password=mfs --sink-table=POSTPAID_MAST
2020-01-15 20:28:42,650 INFO ReplicaDB:43 Running ReplicaDB version: 0.6.0
2020-01-15 20:28:42,662 ERROR ReplicaDB:83 Got exception running ReplicaDB:
java.lang.RuntimeException: Could not load db driver class: oracle.jdbc.OracleDriver
at org.replicadb.manager.SqlManager.makeSinkConnection(SqlManager.java:237) ~[ReplicaDB-0.6.0.jar:0.6.0]
at org.replicadb.manager.SqlManager.getConnection(SqlManager.java:97) ~[ReplicaDB-0.6.0.jar:0.6.0]
at org.replicadb.manager.SqlManager.truncateTable(SqlManager.java:375) ~[ReplicaDB-0.6.0.jar:0.6.0]
at org.replicadb.manager.SqlManager.preSinkTasks(SqlManager.java:437) ~[ReplicaDB-0.6.0.jar:0.6.0]
at org.replicadb.ReplicaDB.main(ReplicaDB.java:60) [ReplicaDB-0.6.0.jar:0.6.0]
2020-01-15 20:28:42,666 INFO ReplicaDB:102 Total process time: 34ms

Oracle to Postgres for geom column fails

I am having issues dumping data from oracle to postgresql for geom columns. Here is the error:

ERROR ReplicaDB:83 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: java.lang.NoClassDefFoundError: oracle/xdb/XMLType
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_232]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_232]
at org.replicadb.ReplicaDB.main(ReplicaDB.java:74) [ReplicaDB-0.5.0.jar:0.5.0]
Caused by: java.lang.NoClassDefFoundError: oracle/xdb/XMLType
at oracle.jdbc.driver.NamedTypeAccessor.getString(NamedTypeAccessor.java:351) ~[ojdbc8.jar:19.3.0.0.0]
at oracle.jdbc.driver.GeneratedStatement.getString(GeneratedStatement.java:289) ~[ojdbc8.jar:19.3.0.0.0]
at oracle.jdbc.driver.GeneratedScrollableResultSet.getString(GeneratedScrollableResultSet.java:376) ~[ojdbc8.jar:19.3.0.0.0]
at org.replicadb.manager.PostgresqlManager.insertDataToTable(PostgresqlManager.java:90) ~[ReplicaDB-0.5.0.jar:0.5.0]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:70) ~[ReplicaDB-0.5.0.jar:0.5.0]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:14) ~[ReplicaDB-0.5.0.jar:0.5.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_232]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_232]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_232]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_232]

I tried including the necessary libs but nothing worked. Thoughts?

Issue about Postgres uppercase

Hi,

The source (Oracle) table columns area: Id, Username, Email,...
In sink source (Postgres) are the same structure with table source.

Error:

INFO  PostgresqlManager:163 Copying data with this command: COPY users (Id,Fullname,Email,Phone,Address,...) FROM STDIN WITH DELIMITER e'\x1f' ENCODING 'UTF-8' 

ERROR ReplicaTask:74 ERROR in TaskId-0 inserting data to sink table: ERROR: column "id" of relation "users" does not exist
ERROR ReplicaDB:136 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: org.postgresql.util.PSQLException: ERROR: column "id" of relation "users" does not exist

How I can ignore uppercase columns name ?

Substr of certain columns from source database

Is there a way to copy the first 5 letters from column OAAA of table OTEST in an Oracle database, to column PAAA of table PTEST in a PostgreSQL database?
I've tried to complete it in conf without success:
............
source.query="SELECT SUBSTR(OAAA,1,5) FROM OTEST.OAAA"
............

p.s. Great tool, by the way. Really awesome.

Way to select all tables and replicate all to sink?

Pardon me if this already exists, and I haven't found a way to do it. But we have a large MSSQL database with many tables. I am able to do a query for all the tables, but how do you specify to replicate all the tables into the sink?
Thanks in advance.

Problem to execute replicadb

Hi guys, somebody already faced this problem?

C:\Program Files\ReplicaDB-0.10.5\bin>replicadb.cmd --help
The system cannot find the path specified.
Error: Environment variable bin not defined.
This is generally because this script should not be invoked directly. Use replicadb instead.
'-cp' is not recognized as an internal or external command,
operable program or batch file.

I tried use 'replicadb --help' and i got the same problem.

syntax error during COPY

Hello,

I'm using ReplicaDB to migrate SQL Server database to PostgreSQL.

I receive the following error:

2022-03-10 11:08:13,902 ERROR ReplicaTask:74 ERROR in TaskId-0 inserting data to sink table: ERROR: syntax error at or near "-"
Position: 45
2022-03-10 11:08:13,902 ERROR ReplicaDB:136 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: org.postgresql.util.PSQLException: ERROR: syntax error at or near "-"
Position: 45
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:?]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:?]
at org.replicadb.ReplicaDB.processReplica(ReplicaDB.java:117) [ReplicaDB-0.11.8.jar:0.11.8]
at org.replicadb.ReplicaDB.main(ReplicaDB.java:47) [ReplicaDB-0.11.8.jar:0.11.8]
Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near "-"
Position: 45
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675) ~[postgresql-42.3.3.jar:42.3.3]
at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1263) ~[postgresql-42.3.3.jar:42.3.3]
at org.postgresql.core.v3.QueryExecutorImpl.startCopy(QueryExecutorImpl.java:945) ~[postgresql-42.3.3.jar:42.3.3]
at org.postgresql.copy.CopyManager.copyIn(CopyManager.java:45) ~[postgresql-42.3.3.jar:42.3.3]
at org.replicadb.manager.PostgresqlManager.insertDataToTable(PostgresqlManager.java:65) ~[ReplicaDB-0.11.8.jar:0.11.8]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:70) ~[ReplicaDB-0.11.8.jar:0.11.8]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:14) ~[ReplicaDB-0.11.8.jar:0.11.8]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]

Do you think it's because some rows contains the - character in the source tables ?

Thanks in advance

Replication not working for geometric type column

I have SDO_GEOMETRY column type that I am trying to replicate between oracle schemas and I see the following error.
ERROR ReplicaTask:72 ERROR in TaskId-3 inserting data to sink table: ORA-00932: inconsistent datatypes: expected MDSYS.SDO_GEOMETRY got CHAR

ERROR ReplicaDB:101 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: java.sql.BatchUpdateException: ORA-00932: inconsistent datatypes: expected MDSYS.SDO_GEOMETRY got CHAR
I checked data and all of them are SDO_GEOMETRY type. I even tried with 1 row and its the same. Have you seen this before?

Support for MySQL Bit-Value Type - BIT

When Replicating a table between MySQL servers the value for the Bit type gets converted from a zero to a one.
Server1
mysql radman -e "select id,name,description,cast(sensitive_data as UNSIGNED) as sensitive_data from radcheck_attribute"
+----+-----------+-------------+----------------+
| id | name | description | sensitive_data |
+----+-----------+-------------+----------------+
| 2 | Auth-Type | | 0 |
+----+-----------+-------------+----------------+

Server2
mysql radman -e "select id,name,description,cast(sensitive_data as UNSIGNED) as sensitive_data from radcheck_attribute"
+----+-----------+-------------+----------------+
| id | name | description | sensitive_data |
+----+-----------+-------------+----------------+
| 2 | Auth-Type | | 1 |
+----+-----------+-------------+----------------+

CREATE TABLE radcheck_attribute (
id bigint(20) NOT NULL AUTO_INCREMENT,
name varchar(64) NOT NULL,
description varchar(255) DEFAULT NULL,
sensitive_data bit(1) NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY uk_radcheck_attribute_name (name)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=latin1

Oracle 9i Problem ORA_HASH Function

Hello.
We want to sync tables between Oracle 9i and MySQL.

Although the connection is happening, there is an error for the Oracle 9i

2022-02-24 14:48:16,902 ERROR ReplicaTask:65 ERROR in TaskId-3 reading source table: ORA-00904: "ORA_HASH": invalid identifier

I think Oracle 9i does not have ORA_HASH function.

https://www.ibm.com/support/pages/using-rowid-hash-method-oracle-partitioned-read-failed-due-ora-00904

But ORA_HASH is a new Oracle 10g function, if the Oracle database is 8i or 9i, it will report ORA-00904 error because it did not exist. Here are some links from Oracle website:

http://asktom.oracle.com/pls/asktom/f?p=100:11:0::::P11_QUESTION_ID:26043356526642
http://docs.oracle.com/cd/B13789_01/server.101/b10759/functions097.htm

Is there a way to bypass this hashing function and just select the data as is without setting up Fetching Size? We really want to make it work.

Thank you.

MySql Verbose - 0 rows processed

Describe the bug
Hello.

I noticed when i use replicadb from source mysql to sink oracle, csv etc, with verbose enabled, the message about the rows processed is always 0 although the replication works just fine. If i try from source oracle to sink mysql, csv, etc, the info message shows exactly how many rows processed.

From Mysql:

INFO  ReplicaTask:72 A total of 0 rows processed by task 0

From Oracle:

INFO  ReplicaTask:72 A total of 292 rows processed by task 0

To Reproduce
Steps to reproduce the behavior:

  1. Source Mysql Table
  2. Sink Oracle, CSV etc.

Expected behavior
The INFO message should show how many rows have been processed when the source is MySql

Additional context
Source MySql version 8.0.29

Multiple Tables - staging table error

Hi,

I'm trying to sync multiple tables in incremental mode. It returns a error in creating and dropping the staging table. I'm working on a test setup where I have 2 PostgreSQL databases running in docker with replicaDB.
There are no issues when it runs for single table. The replication fails only for multiple tables.

The configuration file

######################## ReplicadB General Options ########################
mode=incremental
jobs=1
fetch.size=100
verbose=true
############################# Soruce Options ##############################
source.connect=jdbc:postgresql://postgres1:5433/testing
source.user=pguser
source.password=pgpass
source.table=public.orders, public.categories

############################# Sink Options ################################
sink.connect=jdbc:postgresql://postgres2:5434/testing
sink.user=pguser
sink.password=pgpass
sink.table=public.orders, public.categories

Error Log

replicadb_1  | 2020-11-29 19:49:39,280 INFO  ReplicaDB:42 Running ReplicaDB version: 0.8.9
replicadb_1  | 2020-11-29 19:49:39,282 INFO  ReplicaDB:46 Setting verbose mode
replicadb_1  | 2020-11-29 19:49:39,282 DEBUG ReplicaDB:47 ToolOptions{
replicadb_1  |  sourceConnect='jdbc:postgresql://postgres1:5433/testing',
replicadb_1  |  sourceUser='pguser',
replicadb_1  |  sourcePassword='****',
replicadb_1  |  sourceTable='public.orders,public.categories',
replicadb_1  |  sourceColumns='null',
replicadb_1  |  sourceWhere='null',
replicadb_1  |  sourceQuery='null',
replicadb_1  |  sinkConnect='jdbc:postgresql://postgres2:5434/testing',
replicadb_1  |  sinkUser='pguser',
replicadb_1  |  sinkPassword='****',
replicadb_1  |  sinkTable='public.orders,public.categories',
replicadb_1  |  sinkStagingTable='null',
replicadb_1  |  sinkStagingSchema='null',
replicadb_1  |  sinkStagingTableAlias='null',
replicadb_1  |  sinkColumns='null',
replicadb_1  |  sinkDisableEscape=false,
replicadb_1  |  sinkDisableIndex=false,
replicadb_1  |  sinkDisableTruncate=false,
replicadb_1  |  sinkAnalyze=false,
replicadb_1  |  jobs=1,
replicadb_1  |  bandwidthThrottling=0,
replicadb_1  |  quotedIdentifiers=false,
replicadb_1  |  fetchSize=100,
replicadb_1  |  help=false,
replicadb_1  |  version=false,
replicadb_1  |  verbose=true,
replicadb_1  |  optionsFile='/home/replicadb/conf/replicadb.conf',
replicadb_1  |  mode='incremental',
replicadb_1  |  sourceConnectionParams={},
replicadb_1  |  sinkConnectionParams={}}
replicadb_1  | 2020-11-29 19:49:39,288 DEBUG ManagerFactory:41 Trying with scheme: jdbc:postgresql:
replicadb_1  | 2020-11-29 19:49:39,291 DEBUG ManagerFactory:41 Trying with scheme: jdbc:postgresql:
replicadb_1  | 2020-11-29 19:49:39,291 WARN  SqlManager:495 No staging schema is defined, setting it as PUBLIC     
replicadb_1  | 2020-11-29 19:49:39,356 DEBUG SqlManager:269 No connection parameters specified. Using regular API for making connection.
replicadb_1  | 2020-11-29 19:49:39,476 INFO  PostgresqlManager:217 Creating staging table with this command: CREATE UNLOGGED TABLE IF NOT EXISTS public.categoriesrepdb3810 ( LIKE public.orders,public.categories INCLUDING DEFAULTS 
INCLUDING CONSTRAINTS ) WITH (autovacuum_enabled=false)
postgres2_1  | 2020-11-29 19:49:39.478 UTC [224] ERROR:  syntax error at or near "." at character 91
postgres2_1  | 2020-11-29 19:49:39.478 UTC [224] STATEMENT:  CREATE UNLOGGED TABLE IF NOT EXISTS public.categoriesrepdb3810 ( LIKE public.orders,public.categories INCLUDING DEFAULTS INCLUDING CONSTRAINTS ) WITH (autovacuum_enabled=false)
replicadb_1  | 2020-11-29 19:49:39,482 ERROR ReplicaDB:101 Got exception running ReplicaDB:
replicadb_1  | org.postgresql.util.PSQLException: ERROR: syntax error at or near "."
replicadb_1  |   Position: 91
replicadb_1  |  at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2422) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2167) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) ~[postgresql-42.2.1.jar:42.2.1]   
replicadb_1  |  at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:307) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:293) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:270) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.postgresql.jdbc.PgStatement.executeUpdate(PgStatement.java:244) ~[postgresql-42.2.1.jar:42.2.1]
replicadb_1  |  at org.replicadb.manager.PostgresqlManager.createStagingTable(PostgresqlManager.java:218) ~[ReplicaDB-0.8.9.jar:0.8.9]
replicadb_1  |  at org.replicadb.manager.SqlManager.preSinkTasks(SqlManager.java:498) ~[ReplicaDB-0.8.9.jar:0.8.9] 
replicadb_1  |  at org.replicadb.ReplicaDB.main(ReplicaDB.java:62) [ReplicaDB-0.8.9.jar:0.8.9]
replicadb_1  | 2020-11-29 19:49:39,486 INFO  SqlManager:475 Dropping staging table with this command: DROP TABLE public.categoriesrepdb3810
postgres2_1  | 2020-11-29 19:49:39.486 UTC [224] ERROR:  table "categoriesrepdb3810" does not exist
postgres2_1  | 2020-11-29 19:49:39.486 UTC [224] STATEMENT:  DROP TABLE public.categoriesrepdb3810
replicadb_1  | 2020-11-29 19:49:39,487 ERROR ReplicaDB:119 org.postgresql.util.PSQLException: ERROR: table "categoriesrepdb3810" does not exist
replicadb_1  | 2020-11-29 19:49:39,488 INFO  ReplicaDB:124 Total process time: 220ms

Oracle Greek Characters Problem

Hello.

We try to migrate data from an Oracle table to MySQL.

In Oracle the NLS_CHARACTERSET is EL8ISO8859P7

select * from nls_database_parameters where parameter='NLS_CHARACTERSET';

The problem is, the connections and the data transfer happens BUT in the MySQL table the data are empty, only fields with numbers are showing.

I think it has something to do with the character encoding.

Is there a way to force the jdbc:oracle from Oracle source to use EL8ISO8859P7 NLS_CHARACTERSET?

I tried something like this

source.connect=jdbc:oracle:thin:@::?characterEncoding=EL8ISO8859P7 OR ?characterEncoding=utf8

but the same result. Blank fields.

Cannot connect to MySQL on docker

Connection to MySQL database is always refused.

After the running docker compose up starting up MySQL and replicadb defined in the compose file I then run the command below in another terminal


$ read -r -d '' REPLICADB_CONFIGURATION << EOM
source.connect=jdbc:mysql://mysqldb/osalvador
source.username=root
source.password=password
sink.connect=jdbc:mysql://mysqldb_replica/osalvador
sink.username=root
sink.password=password
mode=complete
EOM

$ docker run \
    -e "REPLICADB_CONFIGURATION=${REPLICADB_CONFIGURATION}" \
    osalvador/replicadb

....but connection is always refused.

Image for Container

Hi @osalvador ,

I really appreciate your work regarding ReplicaDB.

Could I suggest (and eventually help to mantain) an image for container (Podman / Docker / etc) of your application?

I think It may be a nice feature easy to implement, but It coudl help people to deploy fast your solution.

What do you think?

Replicate mediumblob fields from MySQL

My source mysql database have a couple of mediumblob fileds. How can replicate them to my sink (also Mysql) database?
As I'm getting this error:

2022-01-12 15:30:36,940 ERROR ReplicaTask:74 ERROR in TaskId-0 inserting data to sink table: (conn=77) Invalid utf8 character string: 'GSI���n%G"/�eHC�9����TQ�sC�����p?'
2022-01-12 15:30:36,941 ERROR ReplicaDB:136 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: java.sql.SQLTransientConnectionException: (conn=77) Invalid utf8 character string: 'GSI���n%G"/�eHC�9����TQ�sC�����p?'
        at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_212]
        at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_212]
        at org.replicadb.ReplicaDB.processReplica(ReplicaDB.java:117) [ReplicaDB-0.11.4.jar:0.11.4]
        at org.replicadb.ReplicaDB.main(ReplicaDB.java:47) [ReplicaDB-0.11.4.jar:0.11.4]
Caused by: java.sql.SQLTransientConnectionException: (conn=77) Invalid utf8 character string: 'GSI���n%G"/�eHC�9����TQ�sC�����p?'
        at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.createException(ExceptionFactory.java:79) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.util.exceptions.ExceptionFactory.create(ExceptionFactory.java:155) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.MariaDbStatement.executeExceptionEpilogue(MariaDbStatement.java:274) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:363) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.MariaDbStatement.executeUpdate(MariaDbStatement.java:627) ~[mariadb-java-client-2.7.3.jar:?]
        at org.replicadb.manager.MySQLManager.insertDataToTable(MySQLManager.java:154) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at org.replicadb.ReplicaTask.call(ReplicaTask.java:70) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at org.replicadb.ReplicaTask.call(ReplicaTask.java:14) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_212]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_212]
Caused by: org.mariadb.jdbc.internal.util.exceptions.MariaDbSqlException: Invalid utf8 character string: 'GSI���n%G"/�eHC�9����TQ�sC�����p?'
        at org.mariadb.jdbc.internal.util.exceptions.MariaDbSqlException.of(MariaDbSqlException.java:34) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.exceptionWithQuery(AbstractQueryProtocol.java:194) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:262) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:357) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.MariaDbStatement.executeUpdate(MariaDbStatement.java:627) ~[mariadb-java-client-2.7.3.jar:?]
        at org.replicadb.manager.MySQLManager.insertDataToTable(MySQLManager.java:154) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at org.replicadb.ReplicaTask.call(ReplicaTask.java:70) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at org.replicadb.ReplicaTask.call(ReplicaTask.java:14) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_212]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_212]
Caused by: java.sql.SQLException: Invalid utf8 character string: 'GSI���n%G"/�eHC�9����TQ�sC�����p?'
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readErrorPacket(AbstractQueryProtocol.java:1694) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readPacket(AbstractQueryProtocol.java:1556) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:1519) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readLocalInfilePacket(AbstractQueryProtocol.java:1798) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.readPacket(AbstractQueryProtocol.java:1562) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.getResult(AbstractQueryProtocol.java:1519) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.internal.protocol.AbstractQueryProtocol.executeQuery(AbstractQueryProtocol.java:256) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.MariaDbStatement.executeInternal(MariaDbStatement.java:357) ~[mariadb-java-client-2.7.3.jar:?]
        at org.mariadb.jdbc.MariaDbStatement.executeUpdate(MariaDbStatement.java:627) ~[mariadb-java-client-2.7.3.jar:?]
        at org.replicadb.manager.MySQLManager.insertDataToTable(MySQLManager.java:154) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at org.replicadb.ReplicaTask.call(ReplicaTask.java:70) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at org.replicadb.ReplicaTask.call(ReplicaTask.java:14) ~[ReplicaDB-0.11.4.jar:0.11.4]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_212]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_212]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_212]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_212]

Originally posted by @pablotrin in #55

exception loading into sqlserver

2022-08-18 00:21:16,246 INFO ReplicaDB:63 Running ReplicaDB version: 0.12.1
2022-08-18 00:21:16,251 INFO ReplicaDB:67 Setting verbose mode
2022-08-18 00:21:16,251 DEBUG ReplicaDB:68 ToolOptions{
sourceConnect='file://home/zhaolih/mcs/mcs_uat.csv',
sourceUser='null',
sourcePassword='null',
sourceTable='null',
sourceColumns='dwelling_id, dwelling_type, civic_no, civic_suffix, unit_id, road_name, road_name, road_type, road_type_key, road_dir, road_dir_key, municipality_name, municipality_name_key, postal_code, prov_abvn, prov_code, csd_uid, csd_municipality_name, block_no, cu_id, is_civic_address',
sourceWhere='null',
sourceQuery='null',
sinkConnect='jdbc:sqlserver://itars-as-dev01:1433;database=MCS;encrypt=false',
sinkUser='mcs',
sinkPassword='****',
sinkTable='GEOSEARCH.VW_ADDRESS_DATA',
sinkStagingTable='null',
sinkStagingSchema='null',
sinkStagingTableAlias='null',
sinkColumns='dwelling_id, dwelling_type, civic_no, civic_suffix, unit_id , road_name, road_name_key, road_type, road_type_key, road_dir, road_dir_key, municipality_name, municipality_name_key, postal_code , prov_abvn, prov_code, csd_uid, csd_municipality_name, block_no, cu_id, is_civic_address',
sinkDisableEscape=false,
sinkDisableIndex=false,
sinkDisableTruncate=false,
sinkAnalyze=false,
jobs=1,
BandwidthThrottling=0,
quotedIdentifiers=false,
fetchSize=100,
help=false,
version=false,
verbose=true,
optionsFile='csv_sqlserver.conf',
mode='complete',
sentryDsn='null',
sourceConnectionParams={columns.types=NTEGER, SMALLINT, INTEGER, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, VARCHAR, INTEGER, VARCHAR, VARCHAR, INTEGER, TINYINT, format.delimiter=:, format=DEFAULT},
sinkConnectionParams={domain=STC},
sourceFileFormat='null',
sinkFileformat='null'}
2022-08-18 00:21:16,337 WARN FileManagerFactory:39 The file format is not defined, setting CSV as the default file format.
2022-08-18 00:21:16,428 INFO SqlManager:370 Truncating sink table with this command: TRUNCATE TABLE GEOSEARCH.VW_ADDRESS_DATA
2022-08-18 00:21:17,184 INFO ReplicaTask:36 Starting TaskId-0
2022-08-18 00:21:17,184 WARN FileManagerFactory:39 The file format is not defined, setting CSV as the default file format.
2022-08-18 00:21:17,230 DEBUG CsvFileManager:99 Setting the initial format to DEFAULT
2022-08-18 00:21:17,230 INFO CsvFileManager:198 The final CSVFormat is: Delimiter=<:> QuoteChar=<"> RecordSeparator=<

EmptyLines:ignored SkipHeaderRecord:false
2022-08-18 00:21:17,287 ERROR ReplicaTask:74 ERROR in TaskId-0 inserting data to sink table: The argument sourceColumn is not valid.
2022-08-18 00:21:17,288 ERROR ReplicaDB:136 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: com.microsoft.sqlserver.jdbc.SQLServerException: The argument sourceColumn is not valid.
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:?]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:?]
at org.replicadb.ReplicaDB.processReplica(ReplicaDB.java:117) [ReplicaDB-0.12.1.jar:0.12.1]
at org.replicadb.ReplicaDB.main(ReplicaDB.java:47) [ReplicaDB-0.12.1.jar:0.12.1]
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: The argument sourceColumn is not valid.
at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDriverError(SQLServerException.java:237) ~[mssql-jdbc-11.2.0.jre11.jar:?]
at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.throwInvalidArgument(SQLServerBulkCopy.java:1638) ~[mssql-jdbc-11.2.0.jre11.jar:?]
at com.microsoft.sqlserver.jdbc.SQLServerBulkCopy.addColumnMapping(SQLServerBulkCopy.java:429) ~[mssql-jdbc-11.2.0.jre11.jar:?]
at org.replicadb.manager.SQLServerManager.insertDataToTable(SQLServerManager.java:83) ~[ReplicaDB-0.12.1.jar:0.12.1]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:70) ~[ReplicaDB-0.12.1.jar:0.12.1]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:14) ~[ReplicaDB-0.12.1.jar:0.12.1]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
csv_sqlserver.conf.zip

Empty `WHEN MATCHED THEN UPDATE SET` clause during `MERGE INTO`

Describe the bug
I am trying to replicate in incremental mode an association table where the PRIMARY KEY is a combination of all the columns of the table (only 2). In that case, the generated MERGE INTO query use an empty WHEN MATCHED THEN UPDATE SET clause

Logs :

2022-09-14 11:07:11,435 INFO  SqlManager:315 Getting PKs for schema: null and table: trips_delivery_numbers. Found.
2022-09-14 11:07:11,436 INFO  SQLServerManager:194 Merging staging table and sink table with this command: MERGE INTO trips_delivery_numbers trg USING (SELECT trip_id,delivery_number FROM dbo.trips_delivery_numbersrepdb8710 ) src ON  (src.trip_id= trg.trip_id AND src.delivery_number= trg.delivery_number ) WHEN MATCHED THEN UPDATE SET WHEN NOT MATCHED THEN INSERT ( trip_id,delivery_number ) VALUES ( src.trip_id , src.delivery_number  ); 
2022-09-14 11:07:11,482 ERROR ReplicaDB:136 Got exception running ReplicaDB:
com.microsoft.sqlserver.jdbc.SQLServerException: Incorrect syntax near the keyword 'WHEN'.
        at com.microsoft.sqlserver.jdbc.SQLServerException.makeFromDatabaseError(SQLServerException.java:262) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.getNextResult(SQLServerStatement.java:1621) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:868) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:768) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7194) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2935) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:248) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:223) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeUpdate(SQLServerStatement.java:711) ~[mssql-jdbc-7.2.2.jre8.jar:?]
        at org.replicadb.manager.SQLServerManager.mergeStagingTable(SQLServerManager.java:195) ~[ReplicaDB-0.12.3.jar:0.12.3]
        at org.replicadb.manager.SqlManager.postSinkTasks(SqlManager.java:505) ~[ReplicaDB-0.12.3.jar:0.12.3]
        at org.replicadb.manager.SQLServerManager.postSinkTasks(SQLServerManager.java:261) ~[ReplicaDB-0.12.3.jar:0.12.3]
        at org.replicadb.ReplicaDB.processReplica(ReplicaDB.java:128) [ReplicaDB-0.12.3.jar:0.12.3]
        at org.replicadb.ReplicaDB.main(ReplicaDB.java:47) [ReplicaDB-0.12.3.jar:0.12.3]

To Reproduce
Options-file :

############################# Common Options ##############################
mode=incremental
jobs=4
fetch.size=500
bandwidth.throttling=0
verbose=false
############################# source Options ##############################
source.connect=**********
source.user=**********
source.password=**********
source.table=trips_delivery_numbers
source.where=trip_id IN (SELECT id FROM trips WHERE recorded_at >= '2022-08-01 00:00:00' AND recorded_at < '2022-08-08 00:00:00') AND
############################# sink Options ##############################
sink.connect=**********
sink.user=**********
sink.password=**********
sink.staging.schema=dbo

Expected behavior
When there is no column to update, there should be no WHEN MATCHED THEN UPDATE

Additional context
Both the source and sink databases are MSSQL

Report an understandable error when the DB is not supported

I actually went to read the code to find that mysql is not supported, the:

java.lang.NullPointerException: null
	at org.replicadb.ReplicaDB.main(ReplicaDB.java:61) [ReplicaDB-0.8.9.jar:0.8.9]

is not useful. Note that I did check in advance that the mysql connector was NOT on the lib directory within the docker image and mounted it myself.

At the end it is my own fault for not reading the README.md introduction properly where it states clearly what is actually supported.

Originally posted by @aaronps in #10 (comment)

Out of memory error

Hello.

I try to run a replicadb sync with fetchSize=100000 and get a java error

ERROR ReplicaDB:136 Got exception running ReplicaDB: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Java heap space at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_332] at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_332] at org.replicadb.ReplicaDB.processReplica(ReplicaDB.java:117) [ReplicaDB-0.11.8.jar:0.11.8] at org.replicadb.ReplicaDB.main(ReplicaDB.java:47) [ReplicaDB-0.11.8.jar:0.11.8] Caused by: java.lang.OutOfMemoryError: Java heap space at java.lang.StringCoding.encode(StringCoding.java:350) ~[?:1.8.0_332] at java.lang.String.getBytes(String.java:941) ~[?:1.8.0_332] at org.replicadb.manager.MySQLManager.insertDataToTable(MySQLManager.java:139) ~[ReplicaDB-0.11.8.jar:0.11.8] at org.replicadb.ReplicaTask.call(ReplicaTask.java:70) ~[ReplicaDB-0.11.8.jar:0.11.8] at org.replicadb.ReplicaTask.call(ReplicaTask.java:14) ~[ReplicaDB-0.11.8.jar:0.11.8] at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_332] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_332] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_332] at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_332] 2022-07-04 16:15:10,397 INFO ReplicaDB:54 Total process time: 338720ms

From the htop i saw the -Xms and -Xmx of the jre was 256mb

How can i increase the java heap size while running the replicadb?

Replication on Azure SQL Database

I am trying to replicate data across azure SQL databases (source and sink = Azure SQL). Getting below error while trying to run the command (mode - incremental ). created a staging schema named staging and is part of the execution command
logs.
Source table name = location schema = core
Sink table name = location schema =core_replica

2020-09-25 11:38:17,259 INFO ReplicaDB:42 Running ReplicaDB version: 0.8.7
2020-09-25 11:38:18,477 INFO SQLServerManager:129 Creating staging table with this command: SELECT * INTO staging.locationrepdb3310 FROM core_replica.location WHERE 0 = 1
2020-09-25 11:38:18,566 INFO SqlManager:386 Truncating sink table with this command: TRUNCATE TABLE staging.locationrepdb3310
2020-09-25 11:38:18,650 INFO ReplicaTask:36 Starting TaskId-0

2020-09-25 11:38:19,793 INFO SQLServerManager:94 Perfoming BulkCopy into staging.locationrepdb3310
2020-09-25 11:38:20,037 ERROR ReplicaTask:72 ERROR in TaskId-0 inserting data to sink table: The service has encountered an error processing your request. Please try again. Error code 4815.
2020-09-25 11:38:20,038 ERROR ReplicaDB:101 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: com.microsoft.sqlserver.jdbc.SQLServerException: The service has encountered an error processing your request. Please try again. Error code 4815.
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:?]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:?]

  1. Is Azure SQL supported for replication . I guess it should be as the drivers for SQL server and Azure SQL is the same
  2. Is there a problem specifically due to bulk copy?

deleted records aren't getting deleted from the sink tables

deleted records aren't getting deleted from the sink tables. Oracle to MS SQL Server. output is normal without and warning or error. Update is working though.

Version - 0.8.9

sourceConnect='jdbc:oracle:thin:@localhost:1521:TEST',
sourceUser='sys as sysdba',
sourcePassword='',
sourceTable='sys.employee',
sourceColumns='empname,empid',
sourceWhere='null',
sourceQuery='null',
sinkConnect='jdbc:sqlserver://check.database.windows.net:1433;database=rep',
sinkUser='dbuser',
sinkPassword='',
sinkTable='dbo.employee',
sinkStagingTable='null',
sinkStagingSchema='dbo',
sinkStagingTableAlias='null',
sinkColumns='empname,empid',
sinkDisableEscape=false,
sinkDisableIndex=false,
sinkDisableTruncate=false,
sinkAnalyze=false,
jobs=1,
bandwidthThrottling=0,
quotedIdentifiers=false,
fetchSize=100,
help=false,
version=false,
verbose=true,
optionsFile='employee.conf',
mode='incremental',
sourceConnectionParams={source.connect.parameter.defaultRowPrefetch=5000, source.connect.parameter.oracle.net.tns_admin=${TNS_ADMIN}, source.connect.parameter.oracle.net.networkCompression=on},
sinkConnectionParams={}}

NULL values get converted to zero when replicating mysql table

I'm having an issue in which it appears that NULL values are not being copied properly when replicating a table. This is my replicadb.conf:

######################## ReplicadB General Options ########################
mode=complete
jobs=1
fetch.size=100
verbose=true
############################# Source Options ##############################
source.connect=jdbc:mysql://source.example.com:3306/test
source.user=
source.password=
source.table=nodes
source.columns=id,label,deleted_at,is_required
############################# Sink Options ################################
sink.connect=jdbc:mysql://sink.example.com:3306/test
sink.user=
sink.password=
sink.table=nodes
sink.columns=id,label,deleted_at,is_required

The source and sink are both MySQL 8 servers with identical database structure. Only the data differs. Perhaps of interest, sql_mode is set to "".

This is the database structure:

CREATE TABLE nodes (
  id int(10) UNSIGNED NOT NULL,
  label varchar(255) CHARACTER SET utf8 COLLATE utf8_unicode_ci DEFAULT NULL,
  deleted_at int(11) DEFAULT NULL,
  is_required tinyint(1) NOT NULL DEFAULT '0'
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

ALTER TABLE nodes
  ADD PRIMARY KEY (id),
  ADD KEY deleted_at (deleted_at);

ALTER TABLE nodes
  MODIFY id int(10) UNSIGNED NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=11;

This is the data present at the source table:

INSERT INTO nodes (id, label, deleted_at, is_required) VALUES
(1, 'Was the banana straight?', NULL, 0),
(2, 'Are we living in a simulation?', NULL, 1),
(3, 'Was the product there on entry?', NULL, 0),
(4, 'Was the clown wearing his pinstripe suit?', NULL, 0),
(5, 'Was the walmart greeter wearing a top hat?', 1, 0),
(6, 'Is this a test  (yes)?', 1, 1),
(7, 'Is everything awesome?', NULL, 0),
(8, 'Is the product there on entry?', 3, 0),
(9, 'Is the product a thing?', 3, 0),
(10, 'Is the price tag there?', 3, 0);

After replication, the data present at the sink table is as follows:

INSERT INTO nodes (id, label, deleted_at, is_required) VALUES
(1, 'Was the banana straight?', 0, 0),
(2, 'Are we living in a simulation?', 0, 1),
(3, 'Was the product there on entry?', 0, 0),
(4, 'Was the clown wearing his pinstripe suit?', 0, 0),
(5, 'Was the walmart greeter wearing a top hat?', 1, 0),
(6, 'Is this a test  (yes)?', 1, 1),
(7, 'Is everything awesome?', 0, 0),
(8, 'Is the product there on entry?', 3, 0),
(9, 'Is the product a thing?', 3, 0),
(10, 'Is the price tag there?', 3, 0);

As you can see, all NULL values are turned to zeros.

I have played around with various nullString and quoteMode values with no luck. Was wondering if I am simply doing something wrong, or that this is a known issue?

Convert MSSQL geometry type before bulkCopy

I am trying to replicate a table containing a column using geometry type.
I know that this type is not handled by bulkCopy so I am trying to convert this type like shown in the documentation here

But in order to avoid data loss, I would like to split this column into 2, one VARBINARY(255) column for the data, one Integer column for the SRID. To do so, I am using a syntax like

source.columns=id, date, location.STAsBinary() as location_binary, location.STSrid as location_srid
sink.columns=id, date, location_binary, location_srid

My understanding being each source column needs to match a sink column, those being use directly as named target during the insert. But it does not work, and the error I get is not even relevant the conversion :

Column date in invalid. Check your columns mapping.

Is there a way to do it ?

Postgres COPY command parameters

Hi,
We faced problem when copying data because COPY command is always with parametres "FROM STDIN WITH DELIMITER e'\x1f' NULL '' ENCODING 'UTF-8'". It is interpreting ''(empty string) as NULL and fails on not-null constraint.

See COPY documentation: https://www.postgresql.org/docs/10/sql-copy.html
"NULL - Specifies the string that represents a null value. The default is \N (backslash-N) in text format, and an unquoted empty string in CSV format. You might prefer an empty string even in text format for cases where you don't want to distinguish nulls from empty strings."

But in our case we want to "distinguish nulls from empty strings"

Is ReplicaDB support ODBC ?

I have new project feeding data from Netsuite thru ODBC, so really know ReplicaDB support bulk load via ODBC currently, many thanks

MSSQL reserved word 'FOR' as column name

I'm not able to escape reserved words in the column name, such as [FOR]. I already tried square brackets, single and double quotes. Any idea how to overcome this?
Thanks

Oracle to PostgreSQL datasync in incremental mode issue

Hi, I am trying to migrate a table from oracle to Postgres in incremental mode but it keeps erroring. I am missing something. Can you please share a sample example on how this can be achieved.

I have attached the error I am getting here.

FDB5ABEE-A188-4DDA-B49C-890C8395F53B

Comparison with exisiting data in INCREMENTAL mode

In incremental mode is there a comparison being done to see which rows already exist and updating only whats missing from source to sink?
I am trying to understand the difference between complete and incremental modes. Incremental I thought is supposed to be for merging and updating whats missing. But what I see is we are actually truncating the data and then inserting everything again.

Reading in a .csv into Oracle - ERROR ReplicaDB:136 Got exception running ReplicaDB

I am using the latest revision.

source.connect=file://C:\ReplicaDB\files\HRL_DM_KEYS_ITEM_JMC_ALL_25651455.csv
source.columns=KEY, ITEMBASEPEOORGANIZATIONID, ITEMBASEPEOLASTUPDATEDATE, ITEMBASEPEOINVENTORYITEMID, KEY2
source.connect.format=DEFAULT
source.connect.parameter.columns.types=integer, integer, varchar, integer, integer
source.connect.parameter.format.firstRecordAsHeader=true
source.connect.parameter.format=DEFAULT
source.connect.parameter.format.delimiter=,
source.connect.parameter.format.recordSeparator=\n
source.connect.parameter.format.firstRecordAsHeader=true
source.connect.parameter.format.ignoreEmptyLines=true
source.connect.parameter.format.nullString=
source.connect.parameter.format.ignoreSurroundingSpaces=true
source.connect.parameter.format.trim=true
sink.connect=jdbc:oracle:thin:@DHFC.WORLD

I am trying to read in a .csv into Oracle. No matter what options I set on any CSV, I get the following error:

2022-11-06 14:08:42,208 ERROR ReplicaDB:136 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: java.lang.IllegalAccessError: superclass access check failed: class org.replicadb.rowset.CsvCachedRowSetImpl (in unnamed module @0x646be2c3) cannot access class com.sun.rowset.CachedRowSetImpl (in module java.sql.rowset) because module java.sql.rowset does not export com.sun.rowset to unnamed module @0x646be2c3
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:?]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:?]
at org.replicadb.ReplicaDB.processReplica(ReplicaDB.java:117) [ReplicaDB-0.13.0.jar:0.13.0]
at org.replicadb.ReplicaDB.main(ReplicaDB.java:47) [ReplicaDB-0.13.0.jar:0.13.0]
Caused by: java.lang.IllegalAccessError: superclass access check failed: class org.replicadb.rowset.CsvCachedRowSetImpl (in unnamed module @0x646be2c3) cannot access class com.sun.rowset.CachedRowSetImpl (in module java.sql.rowset) because module java.sql.rowset does not export com.sun.rowset to unnamed module @0x646be2c3
at java.lang.ClassLoader.defineClass1(Native Method) ~[?:?]
at java.lang.ClassLoader.defineClass(ClassLoader.java:1013) ~[?:?]
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:150) ~[?:?]
at jdk.internal.loader.BuiltinClassLoader.defineClass(BuiltinClassLoader.java:862) ~[?:?]
at jdk.internal.loader.BuiltinClassLoader.findClassOnClassPathOrNull(BuiltinClassLoader.java:760) ~[?:?]
at jdk.internal.loader.BuiltinClassLoader.loadClassOrNull(BuiltinClassLoader.java:681) ~[?:?]
at jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:639) ~[?:?]
at jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188) ~[?:?]
at java.lang.ClassLoader.loadClass(ClassLoader.java:521) ~[?:?]
at org.replicadb.manager.file.CsvFileManager.init(CsvFileManager.java:207) ~[ReplicaDB-0.13.0.jar:0.13.0]
at org.replicadb.manager.LocalFileManager.makeSourceConnection(LocalFileManager.java:53) ~[ReplicaDB-0.13.0.jar:0.13.0]
at org.replicadb.manager.SqlManager.getConnection(SqlManager.java:103) ~[ReplicaDB-0.13.0.jar:0.13.0]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:43) ~[ReplicaDB-0.13.0.jar:0.13.0]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:15) ~[ReplicaDB-0.13.0.jar:0.13.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
2022-11-06 14:08:42,242 INFO ReplicaDB:54 Total process time: 1091ms

This is the sample data I have in my CSV:
KEY,ITEMBASEPEOORGANIZATIONID,ITEMBASEPEOLASTUPDATEDATE,ITEMBASEPEOINVENTORYITEMID,KEY2
202112,300001621140062,2021-12-13T21:27:14.901+00:00,300000383134546,202112
202112,300001621140062,2021-12-13T21:26:55.100+00:00,300000383134564,202112
202112,300000117868180,2021-12-17T15:41:50.009+00:00,100000694427502,202112
202112,300000117868164,2021-12-17T15:41:59.034+00:00,100000694427503,202112

Any suggestions? Thank you

source query not working with postgres

Hi,

I'm trying to do a migration based on a source query from postgres to postgres with the following command : --source-query SELECT * FROM kp_room and I receive the following error :

2020-01-10 10:37:11,681 DEBUG PostgresqlManager:298 Calculating the chunks size with this sql: SELECT abs(count() / 4) chunk_size, count() total_rows FROM ( SELECT )
2020-01-10 10:37:11,690 ERROR ReplicaDB:83 Got exception running ReplicaDB:
org.postgresql.util.PSQLException: ERROR: subquery in FROM must have an alias
Indice : For example, FROM (SELECT ...) [AS] foo.
Position : 64
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2422) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2167) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:307) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.jdbc.PgStatement.executeCachedSql(PgStatement.java:293) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.jdbc.PgStatement.executeWithFlags(PgStatement.java:270) ~[postgresql-42.2.1.jar:42.2.1]
at org.postgresql.jdbc.PgStatement.executeQuery(PgStatement.java:224) ~[postgresql-42.2.1.jar:42.2.1]
at org.replicadb.manager.PostgresqlManager.preSourceTasks(PostgresqlManager.java:299) ~[ReplicaDB-0.6.0.jar:0.6.0]
at org.replicadb.ReplicaDB.main(ReplicaDB.java:59) [ReplicaDB-0.6.0.jar:0.6.0]

Denodo 7 to Oracle 11

Hi @osalvador, we have encountered a problem integrating data using ReplicaDB from a Denodo 7 source to an Oracle 11 destination. The null value in an INTeger data type in the source dumps it in the destination as a zero (0) instead of a null in a field defined as Number (10). Attached configuration file.
Regards.
ReplicadB_conf.docx

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.