migrator's Introduction

@reactioncommerce/migrator

Command line interface for migrating MongoDB databases.

Features

Although this CLI tool was created for Reaction Commerce, it is a general purpose MongoDB data migration tool that you can use for any project.

  • Migrations are scoped to namespaced "tracks", allowing you to track the version of different areas of data within the same database.
  • Store all desired versions in a config file, or different config files per environment. Commit these to version control to keep a record of what data versions are in each of your environments.
  • Tracks are locked while they are being migrated, preventing two people or two migration runner workers from trying to run the same migration.
  • Although each migration step for a single track necessarily runs in series, track migrations happen in parallel using Node worker threads, which means migrating data will take less time.
  • Migration progress is reported on screen, allowing you to estimate how long migrations will take to finish.
  • Migration history is stored and can be viewed or cleared with CLI commands.
  • One level of version branching is supported. Versions can be either an integer or two integers separated with a dash (e.g., "2-1"). This causes the track to split, so that you can migrate up to "2-1" from "2-0", but if you migrate up to "3-0" from "2-0", the "2-1" migration will not run. This is useful if you need to migrate some data for an older release that you still support without affecting your current release.
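
The branching rule in the last bullet can be illustrated with a small sketch. `parseVersion` and `stepsUp` are hypothetical helpers, not part of the migrator API, and treating "2" and "2-0" as the same version is an assumption drawn from the example in that bullet. Cases the bullet does not describe (such as starting from a branch version) are not meant to be authoritative here.

```javascript
// Hypothetical helpers for illustration only; not the migrator's own code.

// Parse "3" or "2-1" into its main and branch parts. A missing branch is
// treated as branch 0 (assumption: "2" and "2-0" mean the same thing).
function parseVersion(version) {
  const match = /^(\d+)(?:-(\d+))?$/.exec(String(version));
  if (!match) throw new Error(`Invalid version: ${version}`);
  const main = Number(match[1]);
  const branch = match[2] === undefined ? 0 : Number(match[2]);
  if (main < 1) throw new Error(`Invalid version: ${version}`);
  return { main, branch };
}

// List which of the available migration versions would run when moving up
// from `current` to `desired`.
function stepsUp(current, desired, availableVersions) {
  const from = parseVersion(current);
  const to = parseVersion(desired);
  return availableVersions
    .map((v) => ({ key: String(v), ...parseVersion(v) }))
    .filter((v) => {
      if (v.branch === 0) {
        // Main-line step: runs if it lies between current and desired.
        return v.main > from.main && v.main <= to.main;
      }
      // Branch step: runs only when the desired version is on that branch.
      return (
        v.main === to.main &&
        v.main >= from.main &&
        v.branch <= to.branch &&
        (v.main > from.main || v.branch > from.branch)
      );
    })
    .sort((a, b) => a.main - b.main || a.branch - b.branch)
    .map((v) => v.key);
}

const available = [2, "2-1", 3];
console.log(stepsUp("2-0", "2-1", available)); // the branch step only
console.log(stepsUp("2-0", "3-0", available)); // the main-line step only
```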

Why we built this

Put simply, we would have loved to use an out-of-the-box solution for MongoDB data versioning and migrations. We looked at and tried a few, including:

  • https://www.npmjs.com/package/migrate-mongo
  • https://www.npmjs.com/package/db-migrate
  • https://www.npmjs.com/package/mongodb-migrations
  • https://github.com/emmanuelbuah/mgdb-migrator
  • https://www.npmjs.com/package/mongrator

But none of these had everything we needed. Some were just not a great user experience. Others were not actively maintained.

Some of the specific problems were:

  • The need to run up or down rather than just specifying a desired version
  • Insecure configuration
  • Difficult to run on remote servers / with GitOps
  • No support for multiple versioning tracks
  • APIs specific to Mongoose
  • No provided version checking function
  • No support for migration code living in NPM packages
  • Slow
  • Doesn't track/display migration history

Usage

This CLI looks for a config file in the current directory. We recommend that you create a new directory in which this config file and a package.json file will live, and commit it to version control.

You can also have different config files per environment, which allows this one "migrations" repo to reflect the current "desired state" of all your data in all your environments.

To create the directory and install this package in it, run the following commands:

mkdir migrations
cd migrations
echo "12.14.1" > .nvmrc
nvm use
npm init -y
npm i @reactioncommerce/migrator
touch migrator.config.js

Then edit package.json and set "type": "module".

The main thing in the object exported by the config file is an array of tracks:

// migrator.config.js
export default {
  tracks: [
    // Migrations exported by an NPM package
    {
      namespace: "my-namespace",
      package: "npm-package-name",
      version: 2
    },
    // Ad-hoc migrations located in the current directory
    {
      namespace: "my-namespace",
      path: "./migrations/index.js",
      version: 5
    }
  ]
};
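
A per-environment config file follows the same shape. The sketch below assumes an environment called "staging" (a hypothetical name), which by the migrator.config-<env>.js naming used in the CLI commands would live in migrator.config-staging.js. It uses only the fields shown above.

```javascript
// migrator.config-staging.js
// "staging" is a hypothetical environment name chosen for this example.
export default {
  tracks: [
    {
      namespace: "my-namespace",
      package: "npm-package-name",
      // The desired version may differ from other environments
      version: 3
    }
  ]
};
```

With this file in place, the environment name is passed as the argument to the CLI, for example `migrator report staging`.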

CLI Commands

To see all commands, run any of the following:

migrator
migrator -h
migrator --help

To see additional docs and options for a specific command, run any of the following:

migrator <command> -h
migrator <command> --help

# Example
migrator migrate --help

migrator report

To view a report of current data versions versus desired data versions and which migrations are needed, edit migrator.config.js to set all the versions to your desired versions. Then run:

MONGO_URL=mongodb://localhost:27017/dbname migrator report

Important: Set MONGO_URL to the MongoDB connection URL with correct database name.

To view the report for a specific environment, edit migrator.config-<env>.js and then run:

MONGO_URL=mongodb://localhost:27017/dbname migrator report <env>

migrator migrate

To view a report of current data versions versus desired data versions and which migrations are needed and then choose whether to run migrations, edit migrator.config.js to set all the versions to your desired versions. Then run:

MONGO_URL=mongodb://localhost:27017/dbname migrator migrate

Important: Set MONGO_URL to the MongoDB connection URL with correct database name.

To view the report and run migrations for a specific environment, edit migrator.config-<env>.js and then run:

MONGO_URL=mongodb://localhost:27017/dbname migrator migrate <env>

If you don't want to be prompted to decide whether to run them (recommended only for CI), add -y:

MONGO_URL=mongodb://localhost:27017/dbname migrator migrate -y
# OR
MONGO_URL=mongodb://localhost:27017/dbname migrator migrate <env> -y

migrator unlock-track

To unlock a track if you get errors about it being locked but you're sure that nothing is running those migrations right now, run:

MONGO_URL=mongodb://localhost:27017/dbname migrator unlock-track <namespace>

migrator history

To view a list of all previous migration runs for a track, run:

MONGO_URL=mongodb://localhost:27017/dbname migrator history <namespace>

migrator clear-history

To clear the list of all previous migration runs for a track, run:

MONGO_URL=mongodb://localhost:27017/dbname migrator clear-history <namespace>

How to Publish a Package with Migrations

To be compatible with this tool, an NPM package with migrations must have an ES module export named "migrations". This must be an object with the following structure:

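// Named ES module export expected by the migrator
export const migrations = {
  tracks: [
    {
      namespace: "something",
      migrations: {
        // Each key is a database version number
        2: {
          up(context) {},
          down(context) {}
        }
      }
    }
  ]
};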

The namespace should be something similar to your package name that will not collide with other packages that provide migrations.

The keys of the migrations object are the database version numbers. Each key must be a single number (2) or two numbers separated by a dash (2-1, written as the quoted key "2-1" in JavaScript) if you need to branch off your main migration path to support previous major releases. Only one branch level is allowed.

Version 1 is reserved as the assumed version before any migrations run. Versions 0 and below are invalid.

Each migration version must provide an up function.

Each migration version must provide one of the following for down:

  • A down function
  • down: "unnecessary" if a down function isn't needed
  • down: "impossible" if migrating down isn't possible due to some information having been deleted

Both types of functions receive a migration context, which has a connection to the MongoDB database and a progress function for reporting progress.
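
The rules above can be put together in one sketch. The namespace "my-plugin", the function bodies, and the `validateStep` helper are hypothetical and only illustrate the documented shape. In a real package the object would be exported with `export const migrations = ...`.

```javascript
// String values allowed for `down` in place of a function.
const DOWN_MARKERS = ["unnecessary", "impossible"];

// Hypothetical validator that mirrors the rules above; not part of the migrator API.
function validateStep(version, step) {
  if (typeof step.up !== "function") {
    throw new Error(`Migration ${version} must provide an up function`);
  }
  if (typeof step.down !== "function" && !DOWN_MARKERS.includes(step.down)) {
    throw new Error(
      `Migration ${version} must provide a down function, "unnecessary", or "impossible"`
    );
  }
}

const migrations = {
  tracks: [
    {
      namespace: "my-plugin",
      migrations: {
        2: {
          async up(context) { context.progress(100); },
          async down(context) { context.progress(100); }
        },
        // Branch versions contain a dash, so the key must be quoted
        "2-1": {
          async up(context) { context.progress(100); },
          down: "unnecessary"
        },
        3: {
          async up(context) { context.progress(100); },
          down: "impossible"
        }
      }
    }
  ]
};

// Check every step in every track against the rules.
for (const track of migrations.tracks) {
  for (const [version, step] of Object.entries(track.migrations)) {
    validateStep(version, step);
  }
}
```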

How to Migrate Data

The up and down functions should do whatever they need to do to move data from your N-1 or N+1 schema to your N schema. They must always be written as if there are millions of documents to convert, meaning they should use MongoDB bulk reads and writes and do updates in small batches.

If errors are thrown, they will be caught. In fact, throwing an error is the only way to stop the migration process and mark the migration as failed.

If you return a string from your up or down function, it will be stored as result in the migration history. Do not return anything other than a string, undefined, or null. If you throw, the error message will be stored as result in the migration history instead.

While running, the migration function can and should report its progress by calling context.progress(percentDone). The migration function must return a Promise. When that Promise resolves, the migration is considered done and the version for this namespace in the database is updated. If the Promise is rejected, the migration is considered failed and the data may be in a partially migrated state.
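
As a minimal sketch of these guidelines, the `up` function below converts documents in small batches, reports progress, and returns a string. It assumes that the context exposes the database connection as `context.db` (a MongoDB Node.js driver `Db` instance) and that `percentDone` runs from 0 to 100. The "Orders" collection and its field names are hypothetical.

```javascript
const BATCH_SIZE = 500;

// Sketch of an `up` function. Assumes `context.db` is a MongoDB Node.js
// driver Db instance; collection and field names are made up for the example.
async function up(context) {
  const { db, progress } = context;
  const orders = db.collection("Orders");

  // Only documents that still have the old shape need converting.
  const needsMigration = { fullName: { $exists: false } };
  const total = await orders.countDocuments(needsMigration);
  let done = 0;
  progress(0);

  while (done < total) {
    // Read one small batch of unconverted documents
    const docs = await orders.find(needsMigration).limit(BATCH_SIZE).toArray();
    if (docs.length === 0) break;

    // Write the whole batch with a single bulk operation
    const operations = docs.map((doc) => ({
      updateOne: {
        filter: { _id: doc._id },
        update: { $set: { fullName: `${doc.firstName} ${doc.lastName}` } }
      }
    }));
    await orders.bulkWrite(operations, { ordered: false });

    done += docs.length;
    progress(Math.min(100, Math.round((done / total) * 100)));
  }

  progress(100);
  // The returned string is stored as the result in the migration history
  return `Converted ${done} orders`;
}
```

Because the filter matches only unconverted documents, running the function again after a partial failure picks up where it left off, and running it on fully migrated data does nothing.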

Additionally, you can and should make use of MongoDB transactions in your function if you are migrating multiple related collections in a way that will cause problems if some updates succeed and others fail.
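
A transaction across two collections might look like the sketch below. It assumes you have a connected `MongoClient` available as `client`; how to obtain it from the migration context is not covered here, and the collection and field names are hypothetical. It uses the driver's `startSession` and `withTransaction` calls. Note that MongoDB transactions require a replica set or sharded cluster.

```javascript
// Sketch only: `client` is assumed to be a connected MongoClient and `db`
// a Db instance from the same client. Collection names are made up.
async function migrateRelatedCollections(client, db) {
  const session = client.startSession();
  try {
    // Everything inside the callback commits or aborts together
    await session.withTransaction(async () => {
      await db.collection("Accounts").updateMany(
        { legacyBalance: { $exists: false } },
        { $rename: { balance: "legacyBalance" } },
        { session }
      );
      await db.collection("Ledgers").updateMany(
        { migrated: { $ne: true } },
        { $set: { migrated: true } },
        { session }
      );
    });
  } finally {
    await session.endSession();
  }
}
```

Each operation must be passed the `session` option, otherwise it runs outside the transaction.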

To avoid issues, we strongly suggest that you write idempotent migration code, that is, code that can be run multiple times and will do nothing, yet succeed, if the data is already migrated.

How to Check Data Version in App Code

After you've created and exported migrations for your package, the final step is to check the current migration version for each of your namespaces somewhere in your top-level or startup code, after you are connected to MongoDB but before you run any database commands. Do this by depending on the @reactioncommerce/db-version-check NPM package and calling the function it exports. Refer to the documentation for that package.

Commit Messages

To ensure that all contributors follow the correct message convention, each time you commit, your message will be validated with the commitlint package, enabled by the husky Git hooks manager.

Examples of commit messages: https://github.com/semantic-release/semantic-release

Publication to NPM

The @reactioncommerce/migrator package is automatically published by CI when commits are merged or pushed to the master branch. This is done using semantic-release, which also determines version bumps based on conventional Git commit messages.

Developer Certificate of Origin

We use the Developer Certificate of Origin (DCO) in lieu of a Contributor License Agreement for all contributions to Reaction Commerce open source projects. We request that contributors agree to the terms of the DCO and indicate that agreement by signing-off all commits made to Reaction Commerce projects by adding a line with your name and email address to every Git commit message contributed:

Signed-off-by: Jane Doe <[email protected]>

You can sign-off your commit automatically with Git by using git commit -s if you have your user.name and user.email set as part of your Git configuration.

We ask that you use your real full name (please no anonymous contributions or pseudonyms) and a real email address. By signing-off your commit you are certifying that you have the right to submit it under the Apache 2.0 License.

We use the Probot DCO GitHub app to check for DCO sign-offs of every commit.

If you forget to sign-off your commits, the DCO bot will remind you and give you detailed instructions for how to amend your commits to add a signature.

License

Copyright 2020 Reaction Commerce

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

See the License for the specific language governing permissions and limitations under the License.

migrator's People

Contributors

akarshit, aldeed, dependabot[bot], focusaurus, kieckhafer, mikemurray, spencern, zenweasel

migrator's Issues

Reserved characters in mongodb URL break parsing

Special shell characters like '$' and '@' break the URL parser when used like:

MONGO_URL="mongodb://user@domain.com:PasswordWith$@host1:27017,host2:27017/reaction?replicaSet=rs0&readPreference=primaryPreferred&authSource=admin" npx migrator migrate

Characters must be escaped:

MONGO_URL="mongodb://user\@domain.com:PasswordWith\$@host1:27017,host2:27017/reaction?replicaSet=rs0&readPreference=primaryPreferred&authSource=admin" npx migrator migrate

As is, the error message is confusing and vague. Can the documentation and error messages ask users to verify this? A simple error message asking to check for this will save users a lot of time and confusion.
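
One way to sidestep the problem is to percent-encode the credentials before building the URL, since the MongoDB connection string format requires characters such as '@', ':', '/', '$' and '%' in a username or password to be percent-encoded. The sketch below uses the standard `encodeURIComponent` function. `buildMongoUrl` is a hypothetical helper, not part of the migrator.

```javascript
// Hypothetical helper: builds a MongoDB connection URL with
// percent-encoded credentials.
function buildMongoUrl({ username, password, hosts, database, options = "" }) {
  const credentials = `${encodeURIComponent(username)}:${encodeURIComponent(password)}`;
  const query = options ? `?${options}` : "";
  return `mongodb://${credentials}@${hosts.join(",")}/${database}${query}`;
}

const url = buildMongoUrl({
  username: "user@domain.com",
  password: "PasswordWith$",
  hosts: ["host1:27017", "host2:27017"],
  database: "reaction",
  options: "replicaSet=rs0&readPreference=primaryPreferred&authSource=admin"
});
console.log(url);
```

The encoded URL contains no '$', so the shell has nothing to expand. If a raw '$' must appear on the command line, wrapping the value in single quotes rather than double quotes prevents the shell from expanding it.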

cli shebang fails on linux due to varying env features

On macOS, the shebang we are using to run node with arguments works OK, but on Linux it fails because /usr/bin/env doesn't split its argument on whitespace, so it can't run an executable with command line arguments.

./node_modules/.bin/migrator migrate
/usr/bin/env: 'node --no-warnings --experimental-modules --experimental-json-modules': No such file or directory

This is discussed at length on Stack Overflow. I think we'll need to write a wrapper script with a regular shebang that execs node with arguments.

Won't accept replica set URLs

When running this in a production environment, your URL will point at a replica set rather than a single server. The code currently seems to assume that it will be pointing at a single server.

Warn about "impossible" downs

Warn and confirm before migrating if any of the up migrations have an "impossible" down. Ask the user to confirm that they have a current database backup.
