Denormalize

Simple denormalization for Meteor

Introduction

meteor add herteby:denormalize

In this readme, parent always refers to the documents in which the cache is stored, while child refers to the documents that will be cached.

Example: You have two collections, Users and Roles. The Users store the _id of any Roles they have been assigned. If you want each User to cache information from the Roles assigned to it, the Users are the parents and the Roles are the children, and the relationship is either one or many, depending on whether a User can have multiple Roles. If you instead want each Role to store a list of all Users that have that role, the Roles are the parents and the Users are the children, and the relationship is inverse or many-inverse.

Collection.cache(options)

Posts.cache({
  type:'one',
  collection:Meteor.users,
  fields:['username', 'profile.firstName', 'profile.lastName'],
  referenceField:'author_id',
  cacheField:'author'
})

Property | Valid values | Description
type | 'one', 'many', 'inverse' or 'many-inverse' | one: The parent stores a single child _id. many: The parent stores an array of child _ids. inverse: Each child stores a single parent _id. many-inverse: Each child stores an array of parent _ids.
collection | Mongo.Collection | The "child collection", from which docs will be cached
fields | Array of Strings or Object | The fields to include in the cache. It can either look like ['username', 'profile.email'] or {username:1, profile:{email:1}}. For "many", "inverse" and "many-inverse", _id will always be included.
referenceField | String | For "one" and "many", the field on the parent containing the _id of children. For "inverse" and "many-inverse", the field on the children containing the _id of the parent.
cacheField | String | The field on the parent where children are cached. Can be a nested field, like 'caches.field', but it cannot be in the same top level field as the referenceField. For type:'one', cacheField will store a single child. For all others, it will store an array of children.
bypassSchema | Boolean (optional) | If set to true, it will bypass any collection2 schema that may exist. Otherwise you must add the cacheField to your schema.
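
The two accepted shapes for fields describe the same selection. As a rough illustration of how they correspond, here is a minimal sketch in plain JavaScript. The helper name fieldsToObject is hypothetical and is not part of this package; the package may convert between the forms differently.

```javascript
// Hypothetical helper (not part of herteby:denormalize): converts the
// array form of `fields` into the equivalent nested object form.
function fieldsToObject(fields) {
  const result = {};
  for (const path of fields) {
    const keys = path.split('.');
    let node = result;
    keys.forEach((key, index) => {
      if (index === keys.length - 1) {
        node[key] = 1; // leaf: mark the field as included
      } else {
        // intermediate key: make sure there is an object to descend into
        if (typeof node[key] !== 'object') node[key] = {};
        node = node[key];
      }
    });
  }
  return result;
}

const asObject = fieldsToObject(['username', 'profile.email']);
console.log(JSON.stringify(asObject)); // {"username":1,"profile":{"email":1}}
```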

Notes and clarification:

  • "one" and "inverse" are many-to-one relationships (with "one", a parent can only have one child, but many parents could have the same child). "many" and "many-inverse" are many-to-many relationships
  • When cacheField is an array (all types except "one"), the order of the children is not guaranteed.
  • When referenceField is an array, if it contains duplicate _ids, they will be ignored. The cacheField will always contain unique children.
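
To make the last note concrete, the sketch below shows one way duplicate _ids in a reference array collapse to a unique list. It is only an illustration in plain JavaScript; uniqueIds is a hypothetical helper, and the package's internal logic may differ.

```javascript
// Hypothetical helper: removes duplicate _ids from a reference array.
function uniqueIds(ids) {
  return [...new Set(ids)];
}

const friendIds = ['user1', 'user2', 'user1'];
console.log(uniqueIds(friendIds)); // [ 'user1', 'user2' ]
```

Even though this sketch happens to preserve insertion order, the order of children in the actual cacheField is not guaranteed.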

Collection.cacheCount(options)

TodoLists.cacheCount({
  collection:Todos,
  referenceField:'list_id',
  cacheField:'counts.important',
  selector:{done:null, priority:{$lt:3}}
})

cacheCount() can be used on "inverse" and "many-inverse" relationships

Property | Valid values | Description
collection | Mongo.Collection | The collection in which docs will be counted
referenceField | String | The field on counted docs which must match the parent _id
cacheField | String | The field where the count is stored. Can be a nested field like 'counts.all'
selector | Mongo selector (optional) | Can be used to filter the counted documents. [referenceField]:parent._id will always be included though.
bypassSchema | Boolean (optional) | If set to true, it will bypass any collection2 schema that may exist. Otherwise you must add the cacheField to your schema.
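
The selector row says that [referenceField]:parent._id is always included. A minimal sketch of what the combined selector could look like for one parent is shown below. buildCountSelector is a hypothetical name used for illustration only, and 'list1' is a made-up parent _id.

```javascript
// Hypothetical helper: merges the optional user selector with the
// mandatory reference condition. The reference condition is spread last
// so that it cannot be overridden by the user selector.
function buildCountSelector(options, parentId) {
  return { ...(options.selector || {}), [options.referenceField]: parentId };
}

const countOptions = {
  referenceField: 'list_id',
  cacheField: 'counts.important',
  selector: { done: null, priority: { $lt: 3 } }
};

const effectiveSelector = buildCountSelector(countOptions, 'list1');
console.log(JSON.stringify(effectiveSelector));
// {"done":null,"priority":{"$lt":3},"list_id":"list1"}
```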

Collection.cacheField(options)

Meteor.users.cacheField({
  fields:['profile.firstName', 'profile.lastName'],
  cacheField:'fullname',
  transform(doc){
    return doc.profile.firstName + ' ' + doc.profile.lastName
  }
})

Property | Valid values | Description
fields | Array of Strings or Object | The fields to watch for changes. It can either look like ['username', 'profile.email'] or {username:1, profile:{email:1}}
cacheField | String | Where the result is stored. Can be nested like 'computed.fullName'
transform | Function (optional) | The function used to compute the result. If not defined, the default is to return a string of all watched fields concatenated with ', '. The document provided to the function only contains the fields specified in fields.
bypassSchema | Boolean (optional) | If set to true, it will bypass any collection2 schema that may exist. Otherwise you must add the cacheField to your schema.

Note: The transform function could also fetch data from other collections or through HTTP if you wanted, as long as it's done synchronously.
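
For reference, the default described in the table (all watched fields concatenated with ', ') behaves roughly like the sketch below. The names getPath and defaultTransform are hypothetical, and how the real default treats missing fields is an assumption here, not something this readme specifies.

```javascript
// Hypothetical helper: reads a dotted path such as 'profile.firstName'.
function getPath(doc, path) {
  return path
    .split('.')
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Sketch of a default transform: join the watched field values with ', '.
// Assumption: missing fields become empty strings (Array.prototype.join
// renders undefined and null as '').
function defaultTransform(doc, fields) {
  return fields.map(path => getPath(doc, path)).join(', ');
}

const sampleUser = { address: 'Main St 1', postalCode: '12345', city: 'Springfield' };
console.log(defaultTransform(sampleUser, ['address', 'postalCode', 'city']));
// Main St 1, 12345, Springfield
```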

Migration

If you decide to add a new cache or change the cache options on a collection that already contains documents, those documents need to be updated. There are two options for this:

migrate(collectionName, cacheField, [selector])

import {migrate} from 'meteor/herteby:denormalize'
migrate('users', 'fullName')
migrate('users', 'fullAddress', {fullAddress:{$exists:false}})

This updates the specified cacheField for all documents in the collection, or all documents matching the selector. Selector can also be an _id.

autoMigrate()

import {autoMigrate} from 'meteor/herteby:denormalize'
autoMigrate() //should be called last in your server code, after all caches have been declared

When autoMigrate() is called, it checks all the caches you have declared against a collection (called _cacheMigrations in the DB) to see whether they need to be migrated. If any do, it will run a migration on them, and then save the options to _cacheMigrations, so that it won't run again unless you change any of the options. If you later decide, for example, to add another field to a cache, the migration will rerun automatically.

One thing it does not do is remove the old cacheField, if you were to change the name or remove the cache. That part you have to do yourself.

Note: it does not check the documents, it just checks each cache declaration, so it won't thrash your DB on server start going through millions of records (unless something needs to be updated).
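
The check described above can be pictured with the following sketch. It is not the package's code: a Map stands in for the _cacheMigrations collection, autoMigrateSketch and runMigration are hypothetical names, and comparing options via JSON.stringify is an assumption (it ignores functions such as transform and is sensitive to key order).

```javascript
// Sketch only: `store` stands in for the _cacheMigrations collection.
// For each declared cache, migrate only if its options differ from the
// options saved by the previous run, then save the new options.
function autoMigrateSketch(store, declaredCaches, runMigration) {
  const migrated = [];
  for (const { collectionName, options } of declaredCaches) {
    const key = `${collectionName}.${options.cacheField}`;
    const serialized = JSON.stringify(options);
    if (store.get(key) !== serialized) {
      runMigration(collectionName, options.cacheField);
      store.set(key, serialized);
      migrated.push(key);
    }
  }
  return migrated;
}

const migrationStore = new Map();
const logMigration = (name, field) => console.log(`migrating ${name}.${field}`);
const declared = [
  { collectionName: 'users', options: { cacheField: '_fullAddress', fields: ['address', 'city'] } }
];

autoMigrateSketch(migrationStore, declared, logMigration); // runs the migration
autoMigrateSketch(migrationStore, declared, logMigration); // options unchanged: nothing to do
```

Note that only the cache declarations are compared, never the documents themselves, which matches the note above about not scanning the collection on server start.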

Nested referenceFields

For "one" and "inverse", nested referenceFields are simply declared like referenceField:'nested.reference.field'

For "many" and "many-inverse", if the referenceField is an Array containing objects, a colon is used to show where the Array starts.

Example:

If the parent doc looks like this:

{
  //...
  references:{
    users:[{_id:'user1'}, {_id:'user2'}]
  }
}

The referenceField string should be 'references.users:_id'
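
To illustrate how the colon notation maps onto a document, here is a minimal sketch that resolves such a string to a list of _ids. resolveReferences is a hypothetical helper written for this readme; the package's own parsing may differ.

```javascript
// Hypothetical helper: resolves a referenceField string against a doc.
// The part before ':' is the path to the array, the part after it is the
// path to the _id inside each array element.
function resolveReferences(doc, referenceField) {
  const getPath = (obj, path) =>
    path.split('.').reduce((value, key) => (value == null ? undefined : value[key]), obj);

  const [arrayPath, idPath] = referenceField.split(':');
  const value = getPath(doc, arrayPath);

  if (idPath === undefined) {
    // No colon: the field holds the _id (or an array of _ids) directly.
    if (value == null) return [];
    return Array.isArray(value) ? value : [value];
  }
  return (value || []).map(item => getPath(item, idPath)).filter(id => id != null);
}

const parentDoc = {
  references: {
    users: [{ _id: 'user1' }, { _id: 'user2' }]
  }
};
console.log(resolveReferences(parentDoc, 'references.users:_id')); // [ 'user1', 'user2' ]
```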

Recursive caching

You can use the output (the cacheField) of one cache function as one of the fields to be cached by another cache function, or even as the referenceField. They will all be updated correctly. This way you can create "chains" connecting three or more collections.

In the examples below, all cache fields start with _, which may be a good convention to follow for all your caches.

Use cacheField() to cache the sum of all cached items from a purchase

Bills.cacheField({
  fields:['_items'],
  cacheField:'_sum',
  transform(doc){
    return _.sum(_.map(doc._items, 'price'))
  }
})
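
The transform above assumes that _ is lodash. If lodash is not available in your app, an equivalent in plain JavaScript could look like the sketch below. sumPrices is a hypothetical helper name, and treating a missing _items or price as 0 is an assumption made for this example.

```javascript
// Plain JavaScript alternative to _.sum(_.map(doc._items, 'price')).
// Assumption: a missing _items array or a missing price counts as 0.
function sumPrices(items) {
  return (items || []).reduce((total, item) => total + (item.price || 0), 0);
}

const sampleBill = {
  _items: [
    { name: 'Pen', price: 2 },
    { name: 'Notebook', price: 5 }
  ]
};
console.log(sumPrices(sampleBill._items)); // 7
```

It could then be used inside the transform as return sumPrices(doc._items).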

Caching the cacheFields of another cache

Bills.cache({
  cacheField:'_items',
  collection:Items,
  type:'many',
  referenceField:'item_ids',
  fields:['name', 'price']
})
Customers.cache({
  cacheField:'_bills',
  collection:Bills,
  type:'inverse',
  referenceField:'customer_id',
  fields:['_sum', '_items']
})

Using the cacheField of another cache as referenceField

Customers.cache({
  cacheField:'_bills2',
  collection:Bills,
  type:'inverse',
  referenceField:'customer_id',
  fields:['item_ids', '_sum']
})
Customers.cache({
  cacheField:'_items',
  collection:Items,
  type:'many',
  referenceField:'_bills2:item_ids',
  fields:['name', 'price']
})

Incestuous relationships

With this fun title I'm simply referring to caches where the parent and child collections are the same.

Meteor.users.cache({
  cacheField:'_friends',
  collection:Meteor.users,
  type:'many',
  referenceField:'friend_ids',
  fields:['name', 'profile.avatar']
})

This works fine, but there is one thing you cannot do: cache the cacheField of a document in the same collection. In this example, that would mean caching the friends of a user's friends. This would lead to an infinite loop and infinitely growing caches.

When are the caches updated?

The caches for cache() and cacheCount() are updated immediately and synchronously.

Posts.cache({
  cacheField:'_author',
  //...
})
Posts.insert({_id:'post1', author_id:'user1'})
Posts.findOne('post1')._author //will contain the cached user

cache() uses 5 hooks: parent.after.insert, parent.after.update, child.after.insert, child.after.update and child.after.remove. There are then checks done to make sure it doesn't do unnecessary updates.

Basically you should always be able to rely on the caches being updated. If they're not, that should be considered a bug.

However, to avoid a complicated issue with "recursive caching", the update of cacheField() is always deferred.

Meteor.users.cacheField({
  fields:['address', 'postalCode', 'city'],
  cacheField:'_fullAddress',
})
Meteor.users.insert({_id:'user1', ...})
Meteor.users.findOne('user1')._fullAddress //will not contain the cached address yet
Meteor.setTimeout(()=>{
  Meteor.users.findOne('user1')._fullAddress //now it should be there
}, 50)

Note: Since this package relies on collection-hooks, it won't detect any updates you do to the DB outside of Meteor. To solve that, you can call the migrate() function afterwards.

Testing the package

meteor test-packages packages/denormalize --driver-package=practicalmeteor:mocha

(Then open localhost:3000 in your browser)
The package currently has over 120 tests
Note: The "slowness warnings" in the results are just due to the asynchronous tests

denormalize's Issues

Nested referenceFields - One Meta Link

I'm almost finished with the integration, however One Meta type of links are not working.

I have the following scenario:
Author -> Profiles
Author Document: profileId: { _id: actualProfileId }

In this case, where nested reference fields are not an array, the cache does not work.
This is the sample config for both:

I20171126-16:08:30.473(2)? cache_authors ::::: { type: 'one',
I20171126-16:08:30.473(2)?   fields: { name: 1 },
I20171126-16:08:30.473(2)?   referenceField: 'profileId:_id',
I20171126-16:08:30.473(2)?   cacheField: 'profileCache',
I20171126-16:08:30.473(2)?   bypassSchema: false }
I20171126-16:08:30.473(2)? cache_author_profiles ::::: { type: 'inverse',
I20171126-16:08:30.473(2)?   fields: { name: 1 },
I20171126-16:08:30.474(2)?   referenceField: 'profileId:_id',
I20171126-16:08:30.474(2)?   cacheField: 'authorCache',
I20171126-16:08:30.474(2)?   bypassSchema: false }

authorCache is empty array
profileCache is missing completely

I tried using autoMigrate and migrate, to no avail.

However! The rest of the links one, one inversed, many, many inversed, many meta, many meta inversed have worked flawlessly. Amazing.

Bypassing schema results in collection without hooks

I'm using this package through grapher, and when I define a cache with bypassSchema: true, this package throws the following error:

TypeError: Cannot read property 'insert' of undefined
W20190315-09:56:31.177(1)? (STDERR)     at ns.Collection.Mongo.Collection.cache (packages/herteby:denormalize/cache.js:112:28)
W20190315-09:56:31.177(1)? (STDERR)     at Linker._initDenormalization (packages/cultofcoders:grapher/lib/links/linker.js:444:33)
W20190315-09:56:31.177(1)? (STDERR)     at new Linker (packages/cultofcoders:grapher/lib/links/linker.js:28:14)
W20190315-09:56:31.177(1)? (STDERR)     at _.each (packages/cultofcoders:grapher/lib/links/extension.js:23:28)
W20190315-09:56:31.178(1)? (STDERR)     at Function._.each._.forEach (packages/underscore.js:147:22)
W20190315-09:56:31.178(1)? (STDERR)     at ns.Collection.addLinks (packages/cultofcoders:grapher/lib/links/extension.js:14:11)
W20190315-09:56:31.178(1)? (STDERR)     at links.js (imports/core/api/loans/links.js:14:1)
W20190315-09:56:31.178(1)? (STDERR)     at fileEvaluate (packages/modules-runtime.js:336:7)
W20190315-09:56:31.178(1)? (STDERR)     at Module.require (packages/modules-runtime.js:238:14)
W20190315-09:56:31.178(1)? (STDERR)     at Module.moduleLink [as link] (/Users/Florian/.meteor/packages/modules/.0.13.0.1aurolz.v3oz++os+web.browser+web.browser.legacy+web.cordova/npm/node_modules/reify/lib/runtime/index.js:38:38)
W20190315-09:56:31.178(1)? (STDERR)     at links.js (imports/core/api/links.js:1:1)
W20190315-09:56:31.178(1)? (STDERR)     at fileEvaluate (packages/modules-runtime.js:336:7)
W20190315-09:56:31.179(1)? (STDERR)     at Module.require (packages/modules-runtime.js:238:14)
W20190315-09:56:31.179(1)? (STDERR)     at Module.moduleLink [as link] (/Users/Florian/.meteor/packages/modules/.0.13.0.1aurolz.v3oz++os+web.browser+web.browser.legacy+web.cordova/npm/node_modules/reify/lib/runtime/index.js:38:38)
W20190315-09:56:31.179(1)? (STDERR)     at api-server.js (imports/core/api/api-server.js:1:1)
W20190315-09:56:31.196(1)? (STDERR)     at fileEvaluate (packages/modules-runtime.js:336:7)

From what I can see, this.parentCollection.after is undefined, meaning that when you get this._collection, that collection doesn't have hooks on it.

It might just be a question of putting the packages in the right order in .meteor/packages, but I've already tried putting matb33:collection-hooks before grapher, and before this package, without success.

cc: @theodorDiaconu

Selectors for parent collection cache

It would be an interesting addition to be able to provide a selector for the parent collection. For example:

// Only add a chairCount on furniture of type "table"
Furniture.cacheCount({
  collection: Furniture,
  referenceField: 'furnitureLink',
  cacheField: 'chairCount',
  parentSelector: { type: 'table' },
  selector: { type: 'chair' },
})

// Only add a legCount on furniture of type "chair"
Furniture.cacheCount({
  collection: Legs,
  referenceField: 'furnitureLink',
  cacheField: 'legCount',
  parentSelector: { type: 'chair' },
})

This allows "polymorphic" collections to have different caches depending on their category, type, etc.

It is mostly done for keeping things clean, but one could also imagine using different cache configurations with the same name:

Here's a sarcastic example 🤣:

Users.cacheCount({
  collection: Users,
  referenceField: 'friendIds',
  cacheField: 'americanFriends',
  parentSelector: { citizenShip: 'american' },
  selector: { citizenShip: 'american' },
})

Users.cacheCount({
  collection: Users,
  referenceField: 'friendIds',
  cacheField: 'americanFriends',
  parentSelector: { citizenShip: 'european' },
  selector: { citizenShip: { $in: ['north-american', 'south-american', 'american'] } },
})

You would be responsible for maintaining non-overlapping parent selectors of course, otherwise it will not work.

upgrade matb33:collection-hooks dependency

This currently constrains the collection hooks package not to upgrade past 0.8.4 and the current version is 1.0.1.

I'd be happy to submit a pull request if there is a chance it will get merged and published.

Cache client side

Hello Herteby,

thank you for the great plugin.
The only thing I missed is client side denormalisation.
Is there an implementation?

Kind regards
Lukas

Migration on collections with hooks in place

Hi!

I have before.find and before.update hooks in place to realize authorization on my collections.

Normally I use .direct to bypass those hooks in the migrations. Is there a possibility to use .direct when running the migrate() function?

Thanks!

Many to many relationships

Hi Simon,

What is the best way to have many to many relationships? At which point does this cache the doc, on insert?

Thanks,
Chat.

Meteor 3.0 support

Is your feature request related to a problem? Please describe.
Support Meteor 3.0.

Describe the solution you'd like
Switch to *Async methods.

Describe alternatives you've considered
n/a

Additional context
I think we'd like to merge as much as possible from the open PRs and then start working on Meteor 3.0 support.

[>1.6.0.1] Error: Match error: Failed Match.Where validation in field collection

Hi,

In any Meteor version above, but not including 1.6.0.1, I'm getting this error:

https://pastebin.com/raw/CP9Dgep8

Here are the contents of my versions file after updating to 1.7.0.3:

[email protected]
[email protected]
aldeed:[email protected]
aldeed:[email protected]
aldeed:[email protected]
aldeed:[email protected]
aldeed:[email protected]
aldeed:[email protected]
aldeed:[email protected]
aldeed:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
babrahams:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
bredikhin:[email protected]
[email protected]
[email protected]
[email protected]
cfs:[email protected]
[email protected]
chuangbo:[email protected]
[email protected]
cultofcoders:[email protected]
cultofcoders:[email protected]
cultofcoders:[email protected]
daoli:[email protected]
dburles:[email protected]
dburles:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
drewy:[email protected]
drewy:[email protected]
[email protected]
east5th:[email protected]
east5th:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
fortawesome:[email protected]
francocatena:[email protected]
[email protected]
gwendall:[email protected]
herteby:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
jperl:[email protected]
[email protected]
juliancwirko:[email protected]
juliancwirko:[email protected]
kadira:[email protected]
lai:[email protected]_1
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
matb33:[email protected]
mdg:[email protected]
[email protected]
[email protected]
[email protected]
meteorhacks:[email protected]
meteorhacks:[email protected]
meteorhacks:[email protected]
meteorspark:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
meteortoys:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
momentjs:[email protected]
[email protected]
[email protected]
[email protected]
mrt:[email protected]
msavin:[email protected]
msavin:[email protected]
nadeemjq:[email protected]
nadeemjq:[email protected]
[email protected]
[email protected]
[email protected]
ogourment:[email protected]
ongoworks:[email protected]
[email protected]
ostrio:[email protected]
percolate:[email protected]
percolate:[email protected]
[email protected]
raix:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
reywood:[email protected]
[email protected]
rzymek:[email protected]
sacha:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
simple:[email protected]
[email protected]
softwarerero:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
staringatlights:[email protected]
staringatlights:[email protected]
steffo:[email protected]
[email protected]
[email protected]
tap:[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
[email protected]
underscorestring:[email protected]
[email protected]
useraccounts:[email protected]
useraccounts:[email protected]
vazco:[email protected]
[email protected]
[email protected]
yagni:[email protected]
zimme:[email protected]
zimme:[email protected]

I'm seeing this on all versions of Meteor after 1.6.0.1. Possibly related to Cult Of Coders packages, however upgrading them doesn't help. (@theodorDiaconu)

What might be happening?

Denormalization makes Collection.update() synchronous despite callback

Hi,

My case:

Documents from the Users collection have been denormalized into an authorCache field in various other collections. If a User creates a Question, then that Question document will have an authorCache field with the User's emails.

When I update a User's emails, all their Question docs get updated, as expected. However, if there are thousands of docs, then this takes time.

So, I was hoping that Users.update(selector, modifier, {upsert: true}, callback), when run on the server, would make the update async and non-blocking. This is the expected behaviour according to the Meteor docs.

But, this is not the case when using herteby:denormalize (via cult-of-coders:grapher).

Is there a way to make the Users.update() call return immediately, even when caches must be updated?

Many-to-One unique relationship

I'm trying to get a cache working where the relationship is "many-to-one unique", but I think this package doesn't handle it. Here's an example:

I have a collection of Authors, and a collection of Comments. The authors store an array of commentIds, and I want each comment to denormalize the author, without it being an array.

Authors // { commentIds: ['commentId1', 'commentId2'] }

// What I want
Comments: // { authorCache: { _id: 'authorId', name: 'John' } }

// What I get
Comments: // { authorCache: [{ _id: 'authorId', name: 'John' }] }

Usually many-to-one means there can be many authors linking to the same comment, but that's not the case here, so the cache is not correct.

Denormalizing nested fields trigger unwanted updates when other part of the nested field is modified

If one collection references nested fields like this:

Task.cache({
    type: 'one',
    collection: Meteor.users,
    fields: ['profile.firstName', 'profile.lastName'],
    referenceField: 'userId',
    cacheField: 'userCache'
})

Then an update of any other part of the profile will trigger unwanted cascading updates, which could result in poor application performance depending on collection sizes:

Meteor.users.update({_id: "some id"}, { '$set': { 'profile.connexionDate': new Date() } })

I will try to work on a PR to fix this. Any hints on a proper way to treat this issue will be appreciated
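
As a starting point for discussion, one possible check is sketched below: compare the paths changed by the modifier with the cached field paths, and only propagate when they overlap. This is an assumption about how a fix could work, not the package's current behaviour. modifierTouchesFields is a hypothetical helper, and it assumes the modifier consists only of operators such as $set or $unset (no replacement documents).

```javascript
// Hypothetical check: does an update modifier touch any watched field?
// Two paths overlap if they are equal, or if one is a parent of the other
// (for example 'profile' and 'profile.firstName').
function modifierTouchesFields(modifier, watchedFields) {
  const changedPaths = Object.values(modifier).flatMap(operation => Object.keys(operation));
  return changedPaths.some(changed =>
    watchedFields.some(watched =>
      changed === watched ||
      watched.startsWith(changed + '.') ||
      changed.startsWith(watched + '.')
    )
  );
}

const watched = ['profile.firstName', 'profile.lastName'];

console.log(modifierTouchesFields({ $set: { 'profile.connexionDate': new Date() } }, watched)); // false
console.log(modifierTouchesFields({ $set: { 'profile.firstName': 'Ann' } }, watched)); // true
console.log(modifierTouchesFields({ $set: { profile: {} } }, watched)); // true
```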

High memory usage and slow to update large numbers of parent documents

The underlying meteor-collection-hooks performs two fetch()s on update() for any collection with an after.update hook defined.

  • the first to get the ids of all docs matching the selector
  • the second to iterate over these docs post-update and fire the after hooks

This can be both slow for large numbers of docs, as well as expensive on memory as it is doing a fetch() of all docs rather than iterating over the cursor.

This denormalize package adds a parentCollection.after.update hook, but also calls parentCollection.update() in any relevant childCollection mutation hook to maintain the caches. The result is that on any child document mutation there is a chain of hooks causing the 2x fetch(), e.g.:

childCollection.update()
  -> childCollection.after.update()
    -> parentCollection.update()
      -> parentCollection.after.update() // 2x fetch()

Related: Meteor-Community-Packages/meteor-collection-hooks#259

Suggestion

Denormalize simply needs to do a parentCollection.updateMany() to update the caches without the extra pre-fetching of ids the hooks do to support arbitrary selectors. Perhaps this package should wrap the Mongo API mutators directly, similar to how the hooks tie in, so as to avoid the chain of hook logic. The one downside I see is exactly that: are there other hooks expected to be chained by the cache update, or even chained denormalization, which this would break? Perhaps this could be an opt-in alternative for maintaining the cache if the user does not require triggering hooks/chaining via the cache update.

Split the cache function

Your cache function has become very big, which makes it hard to maintain.

Create a lib folder and split the functionality for many, single and inversed into separate files and functions. As it stands, the type check runs on every after update hook.

If they reuse things among them, modularize those functions too.
