dataset-generator's Issues

all generated numbers are in string format

With numbers stored as strings, queries using operators such as "$gt" are futile. The package should support inserting the various BSON types, or emit MongoDB Extended JSON types.
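To make the problem concrete, here is a minimal sketch in plain JavaScript with no driver involved. The helper `coerceNumericStrings` is hypothetical and not part of the package; it only illustrates one way generated values could be converted to numbers before insertion.

```javascript
// Hypothetical helper, not part of the package: converts numeric-looking
// string values of a flat document into real numbers.
function coerceNumericStrings(doc) {
  const out = {};
  for (const [key, value] of Object.entries(doc)) {
    const isNumeric = typeof value === 'string' &&
      value.trim() !== '' && Number.isFinite(Number(value));
    out[key] = isNumeric ? Number(value) : value;
  }
  return out;
}

// String comparison is lexicographic, so '9' sorts after '10'.
console.log('9' > '10'); // true
console.log(9 > 10);     // false

const generated = { name: 'Ada', age: '36' };
console.log(coerceNumericStrings(generated)); // { name: 'Ada', age: 36 }
```

On the database side the effect is similar: comparison operators such as "$gt" compare values within the same BSON type, so a numeric bound would not be expected to match a field stored as the string "36".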

Array improvement to better support coordinates

  • 1. Add support for generating arrays of arrays. This can be very useful when handling coordinates, for example with a 2dsphere index.
  • 2. Add a new utility method for generating a two-item array representing coordinates. This representation seems to be very popular (at least in the MongoDB docs).
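The two requests above could be sketched roughly as follows. All function names here are hypothetical and chosen for illustration only; they are not the package's API.

```javascript
// Hypothetical helpers: names and signatures are illustrative only.

// Returns a random number in [min, max).
function randomBetween(min, max) {
  return min + Math.random() * (max - min);
}

// Returns one [longitude, latitude] pair.
function randomCoordinates() {
  return [randomBetween(-180, 180), randomBetween(-90, 90)];
}

// Returns an array of `size` coordinate pairs (an array of arrays).
function randomCoordinateList(size) {
  return Array.from({ length: size }, randomCoordinates);
}

console.log(randomCoordinates());
console.log(randomCoordinateList(3));
```

Longitude comes first in each pair because that is the order MongoDB expects for coordinate pairs.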

Decouple from mongodb entirely + integration prerequisites

Now that we have completed #26, we're ready to integrate -datasets into mongoscope! Because mongoscope-server has to manage all of the connections, we'll need to decouple from the mongo driver here.

  • Merge populator.js and operations/mongo.js
  • Move populator options/cli to a new bin in mongoscope-importer
  • Use -importer as a devDependency in -datasets so that tests and their helpers do not need to be rewritten

Support db refs

  • Model One-to-Many Relationships with Document References: Presents a data model that uses references to describe one-to-many relationships between documents.
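As a rough illustration of what generated data with document references might look like, here is a minimal sketch. The collection names, field names, and the function `generateReferenced` are assumptions made for this example, not part of the package.

```javascript
// Hypothetical sketch: collection and field names are illustrative only.
// Generates parent documents plus child documents that point back to
// their parent through a `publisher_id` reference.
function generateReferenced(publisherCount, booksPerPublisher) {
  const publishers = [];
  const books = [];
  for (let p = 0; p < publisherCount; p++) {
    const publisher = { _id: 'publisher-' + p, name: 'Publisher ' + p };
    publishers.push(publisher);
    for (let b = 0; b < booksPerPublisher; b++) {
      books.push({
        _id: 'book-' + p + '-' + b,
        title: 'Book ' + b,
        publisher_id: publisher._id // reference back to the parent
      });
    }
  }
  return { publishers, books };
}

console.log(generateReferenced(2, 2));
```

The point is that the generator would need to remember the `_id` values of one collection while producing another, which a single-document template cannot express on its own.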

TransformStream support

A TransformStream takes a ReadableStream plus configuration and emits the transformed result. Consider the following bash expressions:

cat user-schema.json | mongodb-datasets -n 10 > user-data.json
mongodb-datasets -n 10 < user-schema.json > user-data.json
mongodb-datasets -n 10 < user-schema.json | mongodb-importer mongodb://localhost:27017/demo.users --
export TOKEN=`curl -X POST http://localhost:29017/api/v1/token -d seed="localhost:27017" -H "Accept: text/plain"`;
mongodb-datasets -n 10 < user-schema.json | curl -X POST http://localhost:29017/api/v1/import/demo.users -H "Authorization: Bearer ${TOKEN}" --data-binary @-

What we're doing in every one of these cases is exactly the same: inserting mongodb-datasets as a transform! And because transform streams are just plumbing, adding support for this makes all of the above trivial.

The simplest way to test:

var fs = require('fs'),
  md = require('mongodb-datasets');

// Read the schema, generate 10 documents, and write them to a file.
fs.createReadStream('./examples/user-schema.json')
  .pipe(md.createGeneratorStream(10))
  .pipe(fs.createWriteStream('./examples/user-data.json'));
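For discussion, here is a minimal sketch of how such a generator stream could be built on Node's `stream.Transform`. It is not the package's implementation: `generateDocs` is a hypothetical stand-in for the real template engine, and the choice to buffer the whole schema before emitting newline-delimited JSON is an assumption.

```javascript
const { Transform } = require('stream');

// Hypothetical stand-in for the real template engine: copies the schema
// and adds an index so each generated document is distinguishable.
function generateDocs(schema, count) {
  return Array.from({ length: count }, (_, i) => Object.assign({ index: i }, schema));
}

// Buffers the schema JSON arriving from upstream, then emits `count`
// documents as newline-delimited JSON once the input ends.
function createGeneratorStream(count) {
  const chunks = [];
  return new Transform({
    transform(chunk, encoding, callback) {
      chunks.push(chunk); // keep collecting until the schema is complete
      callback();
    },
    flush(callback) {
      try {
        const schema = JSON.parse(Buffer.concat(chunks).toString('utf8'));
        for (const doc of generateDocs(schema, count)) {
          this.push(JSON.stringify(doc) + '\n');
        }
        callback();
      } catch (err) {
        callback(err); // surface invalid schema JSON as a stream error
      }
    }
  });
}
```

With something like this in place, a CLI entry point could be as small as `process.stdin.pipe(createGeneratorStream(10)).pipe(process.stdout)`, which is what makes the shell pipelines above fall out for free.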

Pre-packaged tarballs to deliver to docs team

After #10 #11 #12 are complete, generate a reasonable set size (~1000 documents), run mongoexport for each subsection of the docs the schemas map to (so 10 tarballs in total), and upload them to a release of mongodb-datasets. We'll then coordinate with the docs team for integration of these into the public docs.mongodb.org.

More powerful ArrayField

  • enable array config object (the second element of a configurable array) to have access to other fields. e.g. employees: [ "{{_$config}}", {size: "{{this.numOfEmployees}}"}, "{{name()}}" ].
  • support non-repetitive enumeration. e.g. [ "{{_$config}}", { size: 2, enum: ['a', 'b', 'c'] }, "{{enum}}" ] may produce ['b', 'a']
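Non-repetitive enumeration amounts to sampling without replacement. A minimal sketch, assuming a hypothetical helper name and a partial Fisher-Yates shuffle as the technique:

```javascript
// Hypothetical helper: picks `size` distinct items from `values` using a
// partial Fisher-Yates shuffle on a copy of the input.
function sampleWithoutReplacement(values, size) {
  if (size > values.length) {
    throw new RangeError('size exceeds the number of enum values');
  }
  const pool = values.slice();
  for (let i = 0; i < size; i++) {
    // Pick a random index from the not-yet-chosen tail and swap it forward.
    const j = i + Math.floor(Math.random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, size);
}

console.log(sampleWithoutReplacement(['a', 'b', 'c'], 2)); // e.g. [ 'b', 'a' ]
```

One open design question this surfaces is what should happen when `size` is larger than the enum; the sketch throws, but falling back to repetition would also be defensible.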

support generating Date object

For now, all generated content is in string format, whereas ISODate is a very useful data type for MongoDB users. We need a way to generate a Date object so that it is stored as an ISODate in the database.
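A minimal sketch of one possible approach follows. Both helper names are hypothetical. The idea is to generate a real `Date` for direct insertion through a driver, and to wrap it in the Extended JSON `$date` form when the output is a JSON file, so that import tools can restore the type.

```javascript
// Hypothetical helper: returns a random Date between two bounds (inclusive).
function randomDate(from, to) {
  const start = from.getTime();
  const end = to.getTime();
  return new Date(start + Math.floor(Math.random() * (end - start + 1)));
}

// Serializes a Date in MongoDB Extended JSON form.
function toExtendedJSONDate(date) {
  return { $date: date.toISOString() };
}

const created = randomDate(new Date('2014-01-01T00:00:00Z'), new Date('2014-12-31T23:59:59Z'));
console.log(JSON.stringify({ created: toExtendedJSONDate(created) }));
```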

Add example schemas for "Model Relationships Between Documents"

The docs define 3 examples for modeling relationships between documents, detailed here http://docs.mongodb.org/manual/applications/data-models-relationships/.

  • Model One-to-One Relationships with Embedded Documents: Presents a data model that uses embedded documents to describe one-to-one relationships between connected data.
  • Model One-to-Many Relationships with Embedded Documents: Presents a data model that uses embedded documents to describe one-to-many relationships between connected data.

Higher level packaging

As discussed, how could we support:

  • versioning of tpl language
  • multiple collections
  • indexes
  • schema -> joi validation object
