anwalsh / migrationsourcevalidator
Validates the source MongoDB Instance for common pitfalls encountered during migrations.
If an index specifies a collation but is a v:1 index, we should fail that definition, because the collation option is invalid for that index version.
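A minimal sketch of the check described above, assuming the index definitions are available as dictionaries shaped like the output of getIndexes() (with "v", "name" and an optional "collation" field). The function name and the sample data are hypothetical and not part of the tool.

```python
def find_invalid_collation_indexes(index_specs):
    """Return the names of v:1 indexes that also declare a collation."""
    invalid = []
    for spec in index_specs:
        # Only an explicit v:1 together with a collation is flagged.
        if spec.get("v") == 1 and "collation" in spec:
            invalid.append(spec.get("name", "<unnamed>"))
    return invalid


# Hypothetical index definitions, for illustration only.
sample_indexes = [
    {"v": 1, "key": {"name": 1}, "name": "name_1", "collation": {"locale": "en"}},
    {"v": 2, "key": {"city": 1}, "name": "city_1", "collation": {"locale": "fr"}},
    {"v": 1, "key": {"_id": 1}, "name": "_id_"},
]

for index_name in find_invalid_collation_indexes(sample_indexes):
    print("invalid: collation on v:1 index", index_name)
```

Returning the offending names rather than raising on the first hit lets the caller report every bad definition in one pass.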
I can contribute this, but at present I just need to know what the intended options for the tool are. To get up and running, these are the steps to take:
$ python validator.py "<URI>"
We should also specify the role requirements for the MongoDB user running the tool.
This would obviously be a living document, with usage details updated as the project progresses. I imagine you're waiting for it to be more feature complete before adding this, but I figured I'd get the issue up here because it's something that I want.
Build a test suite for verifying the following:
Currently this process is largely manual in nature. This can be streamlined and made more repeatable.
Got the following error when attempting to use the tool on my Atlas cluster:
Traceback (most recent call last):
File "validator.py", line 14, in <module>
s_topology = sn.SourceNamespaces(args.uri)
File "/Users/juliant/Dropbox/Developer/MigrationSourceValidator/SourceNamespaces.py", line 17, in __init__
self.namespaces = self._get_namespaces()
File "/Users/juliant/Dropbox/Developer/MigrationSourceValidator/SourceNamespaces.py", line 29, in _get_namespaces
for coll in self.gen_get_collections(db):
File "/Users/juliant/Dropbox/Developer/MigrationSourceValidator/SourceNamespaces.py", line 53, in gen_get_collections
data = self.client[db].command('collstats', coll, scale=1024)
File "/usr/local/lib/python2.7/site-packages/pymongo/database.py", line 614, in command
codec_options, session=session, **kwargs)
File "/usr/local/lib/python2.7/site-packages/pymongo/database.py", line 514, in _command
client=self.__client)
File "/usr/local/lib/python2.7/site-packages/pymongo/pool.py", line 579, in command
unacknowledged=unacknowledged)
File "/usr/local/lib/python2.7/site-packages/pymongo/network.py", line 150, in command
parse_write_concern_error=parse_write_concern_error)
File "/usr/local/lib/python2.7/site-packages/pymongo/helpers.py", line 155, in _check_command_response
raise OperationFailure(msg % errmsg, code, response)
pymongo.errors.OperationFailure: Namespace survey.data-view is a view, not a collection
The validation tool should know to skip over MongoDB views within a database when validating for migration, and only test the underlying collections themselves.
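One possible way to skip views, sketched under the assumption that the tool can obtain the collection info documents returned by listCollections (each carries a "name" and, on servers that support views, a "type" of "collection" or "view"). The helper name and the sample documents are hypothetical.

```python
def collections_only(collection_infos):
    """Yield the names of real collections, skipping MongoDB views."""
    for info in collection_infos:
        # Older servers omit "type"; treat a missing value as a collection.
        if info.get("type", "collection") == "view":
            continue
        yield info["name"]


# Hypothetical listCollections output for the "survey" database.
sample_infos = [
    {"name": "data", "type": "collection"},
    {"name": "data-view", "type": "view"},
    {"name": "legacy"},
]

print(list(collections_only(sample_infos)))
```

With PyMongo, the input could come from something like client[db].list_collections(), so that collstats is only ever run against the names this generator yields.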
Index size validation as implemented is naive and does not validate correctly per the spec here:
The total size of an index entry, which can include structural overhead depending on the BSON type, must be less than 1024 bytes.
The key : value pair should be converted to its BSON representation, and the size of that representation compared against the 1024-byte limit instead. This would actually validate for indexes exceeding the soft cap on length.
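A rough sketch of that comparison, using only the standard library and assuming the indexed key : value pair is given as a small dictionary. It computes the encoded BSON size for a handful of common types (null, bool, int, float, str, nested document) by following the BSON layout rules. All names here are hypothetical, and the server's real index entry format may add overhead that this does not model.

```python
MAX_INDEX_ENTRY_BYTES = 1024  # limit quoted in the spec excerpt above


def bson_value_size(value):
    """Size in bytes of a BSON value, excluding its type byte and key name."""
    if value is None:
        return 0
    if isinstance(value, bool):  # must be checked before int
        return 1
    if isinstance(value, int):
        # int32 when it fits, otherwise int64 (larger values are not handled)
        return 4 if -2**31 <= value < 2**31 else 8
    if isinstance(value, float):
        return 8
    if isinstance(value, str):
        # int32 length prefix + UTF-8 bytes + trailing NUL
        return 4 + len(value.encode("utf-8")) + 1
    if isinstance(value, dict):
        return bson_document_size(value)
    raise TypeError("unsupported type in this sketch: %r" % type(value))


def bson_document_size(doc):
    """Size in bytes of a BSON document built from the supported types."""
    size = 4 + 1  # int32 total-length prefix + terminating NUL
    for key, value in doc.items():
        # type byte + key as a NUL-terminated string + the value itself
        size += 1 + len(key.encode("utf-8")) + 1 + bson_value_size(value)
    return size


def exceeds_index_limit(doc, limit=MAX_INDEX_ENTRY_BYTES):
    """True when the encoded entry is not strictly below the limit."""
    return bson_document_size(doc) >= limit


print(exceeds_index_limit({"title": "x" * 2000}))
print(exceeds_index_limit({"title": "short"}))
```

If PyMongo is installed, len(bson.encode(doc)) should give the same figure for these types and would cover the remaining BSON types as well.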
Additional resources:
Using a shell command for gathering the index details from a source:
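// Allow reads when connected to a secondary (newer shells deprecate this helper).
rs.slaveOk();
db.getMongo().getDBNames().forEach(function(d) {
    // Skip the internal databases.
    if (d != "config" && d != "local" && d != "admin") {
        var curr_db = db.getSiblingDB(d);
        curr_db.getCollectionNames().forEach(function(coll) {
            var c = curr_db.getCollection(coll);
            if (typeof c != "function") {
                // Dump the index definitions and the collection stats.
                printjson(c.getIndexes());
                printjson(c.stats());
            }
        });
    }
});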
Be able to validate the index shapes therein.