psalmody / databridge
Provides bridging utilities for moving data across multiple sources and destinations.
License: MIT License
Not needed if files have appropriate date labels at beginning of filename.
Throw errors to keymetrics for schedule
Index.js wrapper could:
var DataBridge = require('databridge')({
  // setup options or config file
});
var dbBatch = DataBridge.batch;
var dbBridge = DataBridge.bridge;
How would this function with scheduler/service/pm2?
dest/mongo is not currently set up for creating indexes.
Make setup.js format creds files and create them.
Maybe also set up a default batch.json file?
TSV destination / source (with ftp?)
Even when columns are in correct order in database, dest/oracle is sometimes returning them shifted - right order but starting part way through the column list.
Anything over 1000 rows gives an error.
Ideas:
setup.js contains both the command-line interface and the actual methods for creating config. Separate these. Create a spec for the setup utility.
Would be nice to manage bind variables via script instead of manual JSON configuration.
Add an XML source/destination parser.
Format:
YYYY-MM-DD-HH-MI-SS.filename-or-batch.log
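The format above could be produced by a small helper. This is only a sketch: `logFileName` and `pad` are hypothetical names, not existing databridge functions, and local time is assumed.

```javascript
// Hypothetical helper: left-pads a number to two digits.
function pad(n) {
  return String(n).padStart(2, '0');
}

// Hypothetical helper: builds "YYYY-MM-DD-HH-MI-SS.<name>.log" in local time.
// `name` is the file or batch name; `date` defaults to now.
function logFileName(name, date) {
  var d = date || new Date();
  var stamp = [
    d.getFullYear(), pad(d.getMonth() + 1), pad(d.getDate()),
    pad(d.getHours()), pad(d.getMinutes()), pad(d.getSeconds())
  ].join('-');
  return stamp + '.' + name + '.log';
}
```

Because the timestamp leads the name and every field is zero-padded, a plain alphabetical sort of the log directory is also a chronological sort.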
Add a destination and source for ftp csv files
Can't get XLSX source to work. Should change to different package, maybe xlsx.
Would be useful to run scripts in the scheduled service to allow for moving csv files around for example.
Need to have log check for file existence before overwriting.
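One possible shape for that check, as a sketch only: `uniqueLogPath` is a hypothetical helper, and the `-1`, `-2` suffix scheme is an assumption rather than anything the project defines.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: returns a path inside `dir` that does not exist yet.
// If `fileName` is taken, tries base-1.ext, base-2.ext, and so on.
function uniqueLogPath(dir, fileName) {
  var ext = path.extname(fileName);
  var base = path.basename(fileName, ext);
  var candidate = path.join(dir, fileName);
  var n = 1;
  while (fs.existsSync(candidate)) {
    candidate = path.join(dir, base + '-' + n + ext);
    n += 1;
  }
  return candidate;
}
```

The check and the later write are separate steps, so two processes could still pick the same name. Opening the file with the `wx` flag makes Node fail instead of overwriting, which may be the safer choice.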
Should have option to e-mail logs from schedule
In dest/oracle@master, there is currently a limit to the number of rows that can be inserted.
Rework in branch feat/oracle passes 1000-row test but fails with larger rows (FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory)
Possibly due to large amounts of async callbacks
Could an event emitter model work better?
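An alternative to an event emitter would be to insert fixed-size batches strictly one after another, so only one batch of callbacks is pending at a time. The sketch below assumes a hypothetical `insertBatch(rows)` function that returns a Promise; it has not been tested against dest/oracle and may or may not resolve the heap error.

```javascript
// Splits `rows` into arrays of at most `size` items.
function chunk(rows, size) {
  var out = [];
  for (var i = 0; i < rows.length; i += size) {
    out.push(rows.slice(i, i + size));
  }
  return out;
}

// Runs `insertBatch` (hypothetical, returns a Promise) on one chunk at a
// time. The next chunk starts only after the previous one has resolved.
function insertInBatches(rows, size, insertBatch) {
  return chunk(rows, size).reduce(function (prev, batch) {
    return prev.then(function () {
      return insertBatch(batch);
    });
  }, Promise.resolve());
}
```

Usage would look like `insertInBatches(rows, 1000, insertBatch).then(done, fail)`, with the batch size chosen to stay under whatever limit the destination imposes.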
Source CSV can't handle apostrophe.
Right now the sqlldr log is being ignored - need to put it in a tmp file and then provide it to the user if there is a problem for debugging purposes.
pm2 schedule.log doesn't have any pagination (dumps to same file) and overwrites old logs.
Could log better to files, the console, or even e-mail if schedule.js were changed to use logs instead.
Rather than update, which appends rows, need a truncate option that deletes all rows but does not drop the table.
Test should NOT require user command-line input. /spec/bind-query.js
Should search for an installed source rather than defaulting to mssql
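A sketch of how such a search might work, using Node's `require.resolve` to probe for modules. `findInstalled` is a hypothetical helper, and the list of candidate module names passed to it is an assumption the caller would have to supply.

```javascript
// Hypothetical helper: returns the first name in `candidates` that Node can
// resolve from this location, or null when none of them is installed.
function findInstalled(candidates) {
  for (var i = 0; i < candidates.length; i++) {
    try {
      require.resolve(candidates[i]);
      return candidates[i];
    } catch (e) {
      // Not installed; try the next candidate.
    }
  }
  return null;
}
```

A `null` result gives the caller a chance to print an install hint instead of failing inside a `require` of a missing default.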
Update README - either npm install <dbmodule> in the local files directory or npm install -g <dbmodule>
Add a mongo source module.
Currently testing relies on data in the local installation (csv, etc.). Should change this to use some other data set.
LOAD DATA INFILE has issues with HTML in text
Maybe all numbers should be SQL type Number or Float rather than INT to deal with JavaScript's handling of the number type.
Handling of _DEC and _GPA should be better, possibly only exposed in opfile, and should test more than just the first row of data: for example, test 50 rows and use number if 100% are number or null.
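The sampling idea could be sketched as follows. `isNumeric` and `inferColumnType` are hypothetical names, rows are assumed to be plain objects keyed by column name, and falling back to 'string' when every sampled value is null is a choice made here, not something the project specifies.

```javascript
// Returns true when `value` is a finite number or a numeric string.
function isNumeric(value) {
  if (typeof value === 'number') return isFinite(value);
  if (typeof value !== 'string' || value.trim() === '') return false;
  return isFinite(Number(value));
}

// Samples up to `sampleSize` rows (default 50) and returns 'number' only if
// every sampled value in `column` is numeric or null/undefined.
// Returns 'string' otherwise, including when all sampled values are null.
function inferColumnType(rows, column, sampleSize) {
  var sample = rows.slice(0, sampleSize || 50);
  var sawNumber = false;
  for (var i = 0; i < sample.length; i++) {
    var v = sample[i][column];
    if (v === null || v === undefined) continue;
    if (!isNumeric(v)) return 'string';
    sawNumber = true;
  }
  return sawNumber ? 'number' : 'string';
}
```

Sampling only bounds the risk: a non-numeric value after the sampled rows would still break a numeric column, so the sample size is a trade-off between speed and safety.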
Make pm2 log file rollover with filename format:
YYYY-MM-DD-HH-MI-SS.schedule.log
Data with newline characters has all kinds of problems due to tab-delimited conversion.
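One way to keep each record on a single line is to escape tabs and line breaks before writing the tab-delimited file. This is a sketch with hypothetical helper names; whether the downstream loader interprets backslash escapes is an assumption that would need checking for each destination.

```javascript
// Hypothetical helper: escapes backslash, tab, newline and carriage return
// so the value contains no literal delimiter or line-break characters.
function escapeTsv(value) {
  return String(value)
    .replace(/\\/g, '\\\\')
    .replace(/\t/g, '\\t')
    .replace(/\n/g, '\\n')
    .replace(/\r/g, '\\r');
}

// Hypothetical helper: reverses escapeTsv.
function unescapeTsv(value) {
  return value.replace(/\\([\\tnr])/g, function (match, c) {
    if (c === 't') return '\t';
    if (c === 'n') return '\n';
    if (c === 'r') return '\r';
    return '\\';
  });
}
```

Backslashes are escaped first so that a literal `\n` sequence in the data is not confused with an escaped newline when reading the file back.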
AKA How to use require('databridge'); in node.
No default test/batch included in the repo, test fails on clean install.