
dataset-generator's Introduction

mongodb-js

Hi, welcome to mongodb-js, a collective of JavaScript developers and code focused on interacting with MongoDB.

Setup
A step-by-step guide for getting your environment ready to work with JavaScript.
New To JS
Helpful links and resources for developing with modern JavaScript.
Modules
A glossary of modules we use and a bit about why each is special.
Services
A glossary of commonly used services you might see from time to time.
Editor Plugins
Our favorite plugins and add-ons that make us happier and more productive.
Transferring Projects
How and why to transfer projects from your personal GitHub account into the mongodb-js organization.
Quality
Documentation, versioning, style, and testing measures we deploy to keep the entire organization green.

Where can I help?
GitHub's open issues dashboard aggregates across all of mongodb-js and is a great place to find easy ways you can start contributing today.

dataset-generator's Issues

Support db refs

  • Model One-to-Many Relationships with Document References: Presents a data model that uses references to describe one-to-many relationships between documents.

Decouple from mongodb entirely + integration prerequisites

Now that we have completed #26, we're ready to integrate -datasets into mongoscope! Because mongoscope-server has to manage all of the connections, we'll need to decouple from the mongo driver here.

  • Merge populator.js and operations/mongo.js
  • Move populator options/cli to a new bin in mongoscope-importer
  • Use -importer as a devDependency in -datasets so that tests and their helpers do not need to be rewritten

Add example schemas for "Model Relationships Between Documents"

The docs define three examples for modeling relationships between documents, detailed at http://docs.mongodb.org/manual/applications/data-models-relationships/.

  • Model One-to-One Relationships with Embedded Documents: Presents a data model that uses embedded documents to describe one-to-one relationships between connected data.
  • Model One-to-Many Relationships with Embedded Documents: Presents a data model that uses embedded documents to describe one-to-many relationships between connected data.

Higher level packaging

As discussed, how could we support:

  • versioning of tpl language
  • multiple collections
  • indexes
  • schema -> joi validation object

all generated numbers are in string format

With numbers stored as strings, queries using operators such as "$gt" do not behave as expected, because the comparison is made on strings rather than on numeric values. The package should support inserting various BSON types, or accept MongoDB's extended JSON types.
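
To illustrate the problem, here is a minimal sketch of why string-typed numbers break range comparisons, along with one possible workaround. The `castNumbers` helper is hypothetical and not part of mongodb-datasets; it assumes that any value that looks like a plain decimal number should become a number, which would be wrong for fields such as zip codes with leading zeros.

```javascript
// String comparison is lexicographic, so range checks give surprising answers.
console.log('10' > '9'); // false: '1' sorts before '9'
console.log(10 > 9);     // true

// Hypothetical helper (not part of mongodb-datasets): convert top-level string
// values that look like plain decimal numbers into real numbers.
function castNumbers(doc) {
  const numeric = /^-?\d+(\.\d+)?$/;
  const out = {};
  for (const key of Object.keys(doc)) {
    const value = doc[key];
    out[key] = typeof value === 'string' && numeric.test(value)
      ? Number(value)
      : value;
  }
  return out;
}

console.log(castNumbers({ name: 'Ann', age: '42', score: '3.5' }));
```

A schema-level type hint would be more robust than guessing from the value, since the generator knows which fields are meant to be numeric.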

More powerful ArrayField

  • Enable the array config object (the second element of a configurable array) to access other fields, e.g. employees: [ "{{_$config}}", {size: "{{this.numOfEmployees}}"}, "{{name()}}" ].
  • Support non-repetitive enumeration, e.g. [ "{{_$config}}", { size: 2, enum: ['a', 'b', 'c'] }, "{{enum}}" ] may produce ['b', 'a'].
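
Non-repetitive enumeration amounts to sampling without replacement. A minimal sketch using a partial Fisher-Yates shuffle follows; the function name `sampleWithoutReplacement` and its `random` parameter are assumptions made for illustration, not existing mongodb-datasets API.

```javascript
// Hypothetical helper: pick `size` distinct items from `values`.
// `random` is injectable so that results can be made deterministic in tests.
function sampleWithoutReplacement(values, size, random = Math.random) {
  const pool = values.slice(); // copy so the caller's array is not mutated
  const n = Math.min(size, pool.length);
  for (let i = 0; i < n; i++) {
    // Swap a randomly chosen remaining item into position i.
    const j = i + Math.floor(random() * (pool.length - i));
    [pool[i], pool[j]] = [pool[j], pool[i]];
  }
  return pool.slice(0, n);
}

console.log(sampleWithoutReplacement(['a', 'b', 'c'], 2));
```

If `size` exceeds the number of enum values, this sketch returns every value once rather than throwing; whether that is the right behavior is a design decision for the issue.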

TransformStream support

A TransformStream takes a ReadableStream plus configuration and emits the transformed result. Consider the following bash expressions:

cat user-schema.json | mongodb-datasets -n 10 > user-data.json
mongodb-datasets -n 10 < user-schema.json > user-data.json
mongodb-datasets -n 10 < user-schema.json | mongodb-importer mongodb://localhost:27017/demo.users --
export TOKEN=`curl -X POST http://localhost:29017/api/v1/token -d seed="localhost:27017" -H "Accept: text/plain"`;
mongodb-datasets -n 10 < user-schema.json | curl -X POST http://localhost:29017/api/v1/import/demo.users -H "Authorization: Bearer ${TOKEN}" --data-binary @-

What we're doing in every one of these cases is exactly the same: inserting mongodb-datasets as a transform! And because transform streams are just plumbing, adding support for this makes all of the above trivial.

The simplest way to test:

var fs = require('fs'),
  md = require('mongodb-datasets');

// Read the schema, generate 10 documents, and write them to a file.
fs.createReadStream('./examples/user-schema.json')
  .pipe(md.createGeneratorStream(10))
  .pipe(fs.createWriteStream('./examples/user-data.json'));
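
For discussion, here is a rough sketch of how such a stream could be built on Node's `stream.Transform`. It is not the actual mongodb-datasets implementation: `generateDocument` is a hypothetical stand-in for the template engine, and the sketch assumes the whole schema arrives as a single JSON object before any output is produced.

```javascript
const { Transform } = require('stream');

// Hypothetical stand-in for the real template engine: copies the schema
// and tags each copy with its position.
function generateDocument(schema, index) {
  return Object.assign({ _index: index }, schema);
}

// Sketch of a generator stream: buffers the schema, then emits `count`
// generated documents as one JSON array when the input ends.
function createGeneratorStream(count) {
  const chunks = [];
  return new Transform({
    transform(chunk, encoding, callback) {
      chunks.push(chunk); // collect input; nothing is emitted yet
      callback();
    },
    flush(callback) {
      let schema;
      try {
        schema = JSON.parse(Buffer.concat(chunks).toString('utf8'));
      } catch (err) {
        return callback(err); // surface malformed schemas as stream errors
      }
      const docs = [];
      for (let i = 0; i < count; i++) {
        docs.push(generateDocument(schema, i));
      }
      this.push(JSON.stringify(docs));
      callback();
    }
  });
}
```

Because the result is an ordinary Transform, it can sit between any readable and writable stream, including `process.stdin` and `process.stdout` for the CLI cases listed earlier.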

support generating Date object

For now, all generated content is in string format, whereas ISODate is a very useful data type for MongoDB users. We need a way to generate Date objects so that they are stored as ISODate in the database.
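
As a stopgap illustration, the sketch below converts selected string fields into JavaScript `Date` objects, which the Node.js MongoDB driver serializes as BSON dates (shown as ISODate in the shell). The `castDates` helper and its `dateFields` argument are hypothetical and assume the caller knows which fields hold dates.

```javascript
// Hypothetical helper (not part of mongodb-datasets): replace the named
// string fields with Date objects, leaving unparseable values untouched.
function castDates(doc, dateFields) {
  const out = Object.assign({}, doc);
  for (const field of dateFields) {
    const parsed = new Date(out[field]);
    if (!Number.isNaN(parsed.getTime())) {
      out[field] = parsed;
    }
  }
  return out;
}

const doc = castDates(
  { name: 'Ann', created: '2014-06-01T12:00:00Z' },
  ['created']
);
console.log(doc.created instanceof Date); // true
```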

Array improvement to better support coordinates

  • Add support for generating arrays of arrays. This can be very useful when handling coordinates, e.g. for a 2dsphere index.
  • Add a new utility method for generating a two-item array representing coordinates. This representation seems to be very popular (at least in the MongoDB docs).
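
Both items could be covered by something like the sketch below. The names `randomCoordinates` and `randomPath` are assumptions for illustration only; the pair is ordered longitude first, then latitude, which is the order MongoDB expects for coordinate pairs.

```javascript
// Hypothetical utility: a random [longitude, latitude] pair.
// `random` is injectable so that output can be made deterministic.
function randomCoordinates(random = Math.random) {
  const lng = random() * 360 - 180; // range [-180, 180)
  const lat = random() * 180 - 90;  // range [-90, 90)
  return [lng, lat];
}

// Hypothetical utility: an array of arrays, e.g. the points of a path.
function randomPath(length, random = Math.random) {
  return Array.from({ length }, () => randomCoordinates(random));
}

console.log(randomCoordinates());
console.log(randomPath(3));
```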

Pre-packaged tarballs to deliver to docs team

After #10, #11, and #12 are complete, generate a reasonably sized set (~1000 documents), run mongoexport for each subsection of the docs the schemas map to (so 10 tarballs in total), and upload them to a release of mongodb-datasets. We'll then coordinate with the docs team to integrate these into the public docs.mongodb.org.
