viacominc / data-point
JavaScript utility for collecting, processing and transforming data
License: Apache License 2.0
This issue was uncovered through #44.
The following lines may cause unexpected behavior:
Hash entity:
data-point/packages/data-point/lib/entity-types/entity-hash/reducer.js
Lines 136 to 139 in 9d4415e
Collection entity:
data-point/packages/data-point/lib/entity-types/entity-collection/reducer.js
Lines 105 to 108 in 9d4415e
These checks happen at the entity resolver level, which means the entity's before/after transforms will get executed regardless.
This would be a new feature. The idea is to use the same pattern as entity mapping, which adds [] at the end of an entity id:
hash:Foo[]
This feature would provide the same mapping behavior using a ReducerPath:
$some.path[]
By adding [] at the end of a ReducerPath, the developer should expect to get a mapping of the path over each element of the input array.
const input = [
  { a: { b: { c: 'foo' } } },
  { a: { b: { c: 'bar' } } }
]

dataPoint.transform('$a.b.c[]', input)
  .then(acc => {
    // ['foo', 'bar']
    console.log(acc.value)
  })
For conditional execution, we already have the control entity. But there are moments where I have needed to just conditionally execute an entity depending on the value being passed.
Also, one drawback of using the control entity is that code will execute the entire entity lifecycle to just resolve a condition, which in many cases makes sense, but in others, it might be too much/too expensive.
Feature proposal #15 is to use the acc.resolve method, but it might not go anywhere soon (see the comments on that issue).
spec:
'?<entityId>'
Usage:
// throws an error if the input is null/undefined
dataPoint.transform('hash:Foo', undefined)
  .catch(e => e)
  .then(acc => {
    expect(acc).toBeInstanceOf(Error)
  })

// with the ? operator, undefined is returned without throwing an error
dataPoint.transform('?hash:Foo', undefined).then(acc => {
  expect(acc.value).toEqual(undefined)
})
Problem description:
It's difficult to know who has contributed to the project or what they've contributed.
Suggested solution:
Consider adding https://github.com/kentcdodds/all-contributors to this repository to show a table of contributors on the main README.
Documentation in Basic Express Example has an error
https://github.com/ViacomInc/data-point#integrations
Lines to be changed:
https://github.com/ViacomInc/data-point/blame/master/README.md#L2615
https://github.com/ViacomInc/data-point/blame/master/README.md#L2616
Change
res.send(result.value))
}))
to
res.send(result.value)
})
When a hash receives a non-hash type (array, number, string) it throws an error, which currently reads:
Error Entity "hash:Foo" received a context.value = [object Object] that does not resolve to a plain object, this entity can only process Objects.
Fix the [object Object] by using a stringified representation of the value, and include its type to help users understand what they actually passed in.
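A minimal sketch of what the friendlier message could look like. The helper names here are illustrative, not the actual data-point internals: JSON.stringify gives a readable value, and a small typeOf helper reports arrays and null separately from plain objects.

```javascript
// Hypothetical helpers for building a clearer hash-entity error message.
function typeOf (value) {
  if (Array.isArray(value)) return 'array'
  if (value === null) return 'null'
  return typeof value
}

function hashTypeError (entityId, value) {
  return new Error(
    `Entity "${entityId}" received value = ${JSON.stringify(value)} ` +
    `(type: ${typeOf(value)}), but hash entities can only process plain objects.`
  )
}
```

With this shape, passing an array to hash:Foo would report both the stringified array and the word "array" instead of [object Object].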
Problem description:
The readme has a section on "Chained Reducers," but that's just another term for a Transform Expression, which is described in a different part of the readme (both terms refer to an array of reducers).
Suggested solution:
Remove the "chained reducers" section and let the transform expressions shine.
data-point version: 1.x
node version: N/A
npm version: N/A
Problem description:
To resolve a Path Reducer, some logic must be applied. This logic is currently computed during the resolution cycle, which is not the most efficient approach.
Suggested solution:
This moves the path reducer logic into the factory instead of the reducer.
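A sketch of the idea under discussion (the function shape is an assumption, not the actual data-point source): split the path string once in the factory, so resolution is just a loop over precomputed keys.

```javascript
// Illustrative factory that does the path parsing up front.
function createPathReducer (path) {
  // '$a.b.c' -> ['a', 'b', 'c'], computed once at creation time
  const keys = path.replace(/^\$\.?/, '').split('.').filter(Boolean)
  return {
    type: 'ReducerPath',
    keys,
    resolve (value) {
      // resolution only walks the precomputed keys
      return keys.reduce((acc, key) => (acc == null ? acc : acc[key]), value)
    }
  }
}
```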
Problem description:
It's not clear what methods/properties are available on the DataPoint object.
const DataPoint = require('data-point')
For example, DataPoint.createEntity is used in a readme example, but that method is never documented.
Suggested solution:
Add this info to the readme
Refactor lib/index.js to make it clear what's being exported. Currently it re-exports everything from ./helpers, which in turn does the same thing...
...but listing the exports by name would make these files more understandable:
// rough example...
module.exports = {
  create: require('./core').create,
  inspect: require('./helpers').inspect
  // and so on
}
Allow reducer functions to be written in sync, async, and node-style formats.
Currently, reducer functions must use the following format:
(acc, next) => {
  next(err, value)
}
The feature would enable developers to write functions such as:
sync resolution:
(acc) => {
  return value
}
async resolution:
(acc) => {
  return Promise.resolve(value)
}
async/await resolution:
async (acc) => {
  return await value
}
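One possible normalization strategy for supporting all three formats (an assumption, not the shipped implementation): inspect fn.length to detect node-style callbacks, and wrap everything else with Promise.resolve so sync, promise-returning, and async functions all behave the same.

```javascript
// Hypothetical resolver that accepts all three reducer-function styles.
function resolveReducerFunction (fn, acc) {
  if (fn.length === 2) {
    // node style: (acc, next) => next(err, value)
    return new Promise((resolve, reject) => {
      fn(acc, (err, value) => (err ? reject(err) : resolve(value)))
    })
  }
  // sync, promise, or async: a plain return value and a thenable both work
  return Promise.resolve(fn(acc))
}
```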
Advantages:
From the readme:
All entities share a common API (except for Transform)
A Transform entity is meant to be used as a 'snippet' entity that you can re-use in other entities. It does not expose the common before/after/error/params API that other entities have.
Transforms do expose the before, after, and error callbacks though. Either the readme or the code should be updated for accuracy.
data-point version: *
node version: N/A
npm version: N/A
Any relevant code:
All README.md files point to branch=ci; they should point to master.
[![Build Status](https://travis-ci.org/ViacomInc/data-point.svg?branch=ci)](https://travis-ci.org/ViacomInc/data-point) [![Coverage Status](https://coveralls.io/repos/github/ViacomInc/data-point/badge.svg?branch=ci)](https://coveralls.io/github/ViacomInc/data-point?branch=ci)
What happened:
At times coverage badge shows 'unknown' since that branch does not exist.
Reproduction:
Go to: https://github.com/ViacomInc/data-point or any README under the packages folder
Problem description:
Points to ci branch but should point to master
Suggested solution:
update markdown
Problem description:
We define ReducerEntities with the <entityType>:<entityName> pattern, but we have to reconstruct that ID when fetching entity definitions:
manager.entities.get(`${reducer.entityType}:${reducer.name}`)
Suggested solution:
Store the full <entityType>:<entityName> ID on reducer entities.
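A tiny sketch of the suggestion (the factory name is illustrative): compute the full id once when the reducer entity is created, so lookups never need template strings.

```javascript
// Hypothetical reducer-entity factory that stores the full id up front.
function createReducerEntity (entityType, name) {
  return {
    entityType,
    name,
    // lookups become: manager.entities.get(reducer.id)
    id: `${entityType}:${name}`
  }
}
```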
Problem description:
TransformExpressions contain reducers, and some reducers contain TransformExpressions (which contain more reducers and so on). The API would be easier to describe if we just have reducers, which may or may not contain other reducers. Then everything is either an entity or a reducer, with no TransformExpression middleman. This would also simplify the code a bit, and entities could use single reducers without needing to wrap them in TransformExpressions.
Suggested solution:
Replace TransformExpressions with ReducerLists.
It would be nice to have a template for both issues and pull requests that facilitates the process of making new issues/PRs with all of the requisite information for review and consideration. This is particularly helpful when bugs are found as it reminds contributors what information is useful for reproducing issues, or when reviewing a pull request to be sure that the steps necessary to merge a pull request have been done.
The readme documents addEntityTypes like this:
dataPoint.addEntityTypes(specs:Object)
...but it's actually used like this:
manager.addEntityTypes('transform', EntityTransform)
The readme also incorrectly states that the method returns a promise. Either the readme or the code should be updated for accuracy.
Problem description:
Entities are given before, value, after, and error transforms even when they're empty. This is unnecessary and adds clutter when trying to inspect the entity definitions.
// example input:
{
  'entry:example': {
    value: '$'
  }
}

// turns into this when passed to createEntity:
{
  id: 'entry:example',
  value: {
    reducers: [
      {
        type: 'ReducerPath',
        name: '',
        asCollection: false
      }
    ],
    typeOf: 'TransformExpression'
  },
  before: {
    reducers: [],
    typeOf: 'TransformExpression'
  },
  error: {
    reducers: [],
    typeOf: 'TransformExpression'
  },
  after: {
    reducers: [],
    typeOf: 'TransformExpression'
  },
  params: {}
}
Suggested solution:
Do not add empty transforms to entity definitions.
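A minimal sketch of the suggested behavior (the helper name is hypothetical): only copy the lifecycle keys the spec actually defines, so empty transforms are never attached.

```javascript
// Hypothetical helper: keep only the transforms the spec defines.
function pickDefinedTransforms (spec) {
  const out = {}
  for (const key of ['before', 'value', 'after', 'error']) {
    if (spec[key] !== undefined) out[key] = spec[key]
  }
  return out
}
```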
Problem description:
Each reducer type has a factory file and a resolve file, but they're in different directories. This makes it a little more difficult to find the code you need when updating a specific reducer.
Suggested solution:
The code for each reducer type should be kept in a single directory.
The value parameter for DataPoint.transform (which becomes acc.value) should be attached to the accumulator without modification. Primitive values are currently wrapped in objects, which might lead to unexpected behavior. For example, this currently throws an error:
dataPoint
  .transform((acc) => {
    console.log(acc.value) // [String: 'test']
    assert(acc.value === 'test') // ERROR
  }, 'test')
The readme example for the control entity currently does not work because of this behavior (it tries to compare a string against a string wrapped as an object).
The regular expression in this file could be improved:
It should match the pattern <EntityType>:<EntityId>, but these examples are false positives: aaa, :aaa, and a:. The regex also fails to match some valid patterns, like a:a.
Some entity names in this file also use hyphens, which suggests they should be valid, but the regex does not support them:
For example, the name transform:my-test-query is valid, but the current regex treats it as transform:my because it stops at the first hyphen.
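A candidate pattern that handles the cases above (an assumption, not the regex currently in the file): require at least one character on both sides of the colon and allow hyphens in both parts.

```javascript
// Hypothetical replacement pattern: <EntityType>:<EntityId>, hyphens allowed.
const ENTITY_ID_REGEX = /^([a-zA-Z][\w-]*):([\w-]+)$/

function isEntityId (id) {
  return ENTITY_ID_REGEX.test(id)
}
```

Under this pattern, a:a and transform:my-test-query match, while aaa, :aaa, and a: are rejected.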
Documentation in Request String Template has an error
https://github.com/ViacomInc/data-point#stringtemplate
Lines to be changed:
https://github.com/ViacomInc/data-point/blame/master/README.md#L1198
https://github.com/ViacomInc/data-point/blame/master/README.md#L1226
Change the occurrences of:
transform('entry:getNodeOrgInfo', {
to
dataPoint.transform('request:getNodeOrgInfo', {
TransformMap is currently used for hash entities, but it should be supported as a reducer type (meaning it could be part of a TransformExpression). This would make it easier to manipulate objects without creating full-blown hash entities. One main use case would be combining multiple requests into a single object:
{
  'entry:example': [
    {
      a: 'request:a',
      b: 'request:b'
    },
    // this transform could access the data from requests 'a' and 'b'
    'transform:data'
  ]
}
To help with debugging, attach the entityId to the error object.
Line where to add the code:
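A sketch of the idea (the helper name is hypothetical, not part of data-point): tag errors thrown while resolving an entity with that entity's id before rethrowing, keeping the innermost entity that failed.

```javascript
// Hypothetical helper to tag errors with the failing entity's id.
function tagEntityError (error, entityId) {
  if (error.entityId === undefined) {
    error.entityId = entityId // keep the innermost entity that failed
  }
  return error
}

// usage inside an entity resolver might look like:
// return resolveEntity(entity, acc).catch(err => {
//   throw tagEntityError(err, entity.id)
// })
```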
SPEC IS WORK IN PROGRESS
This feature would allow a reducer to resolve a transform at different levels. This means a transformation can be prematurely exited with a given value or it can be resolved to a new entity id.
API:
Resolve the current Transform with:
accumulator.resolveTransformWith(value:*)
accumulator.resolveTransformTo(entityId:string)
Implementation examples:
Using accumulator.resolveTransformWith to resolve a transform with a given value:
dataPoint.transform([
  () => 'foo',
  (acc) => acc.resolveTransformWith('bar'),
  (acc) => 'will not execute'
])
  .then(acc => console.log(acc.value))
// should output: 'bar'
Using accumulator.resolveTransformTo to resolve a transform to another transform:
dataPoint.addEntities({
  'transform:foo': () => 'foo',
  'transform:bar': () => 'bar'
})

dataPoint.transform([
  () => 'useFoo',
  (acc) => {
    return acc.value === 'useFoo'
      ? acc.resolveTransformTo('transform:foo')
      : acc.resolveTransformTo('transform:bar')
  }
])
  .then(acc => console.log(acc.value))
// should output: 'foo'
This feature introduces anonymous entities, which are just entities defined without names. Entities that do have names are referred to as declared entities.
DataPoint will expose a factory function for each registered entity type:
const dataPoint = require('data-point')
const factories = dataPoint.entityFactories
// {
// entry: [Function: create],
// request: [Function: create],
// hash: [Function: create],
// ...
// }
const requestEntity = factories.request({
  url: 'https://api.github.com/orgs/nodejs'
})
dataPoint.transform(requestEntity)
This example shows the same transform with declared entities and anonymous entities:
declared:
dataPoint.addEntities({
  'request:getOrgInfo': {
    url: 'https://api.github.com/orgs/nodejs'
  },
  'hash:OrgInfo': {
    mapKeys: {
      reposUrl: '$repos_url',
      eventsUrl: '$events_url'
    }
  }
})
dataPoint.transform('request:getOrgInfo | hash:OrgInfo')
anonymous:
const { request, hash } = dataPoint.entityFactories

dataPoint.transform([
  request({
    url: 'https://api.github.com/orgs/nodejs'
  }),
  hash({
    mapKeys: {
      reposUrl: '$repos_url',
      eventsUrl: '$events_url'
    }
  })
])
These are the main benefits I see from anonymous entities:
// data-hash.js
module.exports = hash({
  mapKeys: {
    reposUrl: '$repos_url',
    eventsUrl: '$events_url'
  }
})

// main.js
const dataHash = require('./data-hash')

dataPoint.transform([
  request({
    url: 'https://api.github.com/orgs/nodejs'
  }),
  dataHash
])
Any thoughts?
The readme says that collection entities should go filter -> map -> find, but they currently execute find before map, which isn't useful because map expects an array but find returns an element from an array.
Problem description:
Hash and collection entities don't always execute in the order they're defined. @YiningChen explains this problem in more depth here #24, but instead of warning users, I propose that we remove this behavior entirely. This would make it easier to read entity definitions without needing to consult the documentation to know the execution order for each of the keys.
Suggested solution:
If users want a hash or collection to use more than one modifier (e.g. addValues, mapKeys, etc.), they should be required to use compose.
For example, this would be valid:
dataPoint.addEntities({
  'hash:example': {
    addValues: {
      b: { text: 'See You Later' }
    }
  }
})
...and this would be valid:
dataPoint.addEntities({
  'hash:example': {
    compose: [
      {
        addValues: {
          b: { text: 'See You Later' }
        }
      },
      {
        mapKeys: {
          a: '$a.text',
          b: '$b.text'
        }
      }
    ]
  }
})
...but this would NOT be valid, because addValues and mapKeys are both defined on the root object instead of using compose:
dataPoint.addEntities({
  'hash:example': {
    addValues: {
      b: { text: 'See You Later' }
    },
    mapKeys: {
      a: '$a.text',
      b: '$b.text'
    }
  }
})
To make users more cognizant of reducers' execution order, encourage them to list the reducers in execution order. For example, given this code:
const input = {
  a: {
    text: 'Hello World'
  }
}

dataPoint.addEntities({
  'hash:example': {
    addValues: {
      b: { text: 'See You Later' }
    },
    mapKeys: {
      a: '$a.text',
      b: '$b.text'
    }
  }
})

dataPoint.transform('hash:example', input)
A user might expect { a: 'Hello World', b: 'See You Later' }, but instead they'll get { a: 'Hello World', b: { text: 'See You Later' } }.
Therefore, a warning should be given when reducers aren't listed in execution order.
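A minimal sketch of the warning check. The execution order list here is an assumption inferred from the example above (mapKeys runs before addValues); the real order would come from the entity docs.

```javascript
// Assumed execution order for hash modifiers (illustrative, not canonical).
const EXECUTION_ORDER = ['omitKeys', 'pickKeys', 'mapKeys', 'addValues', 'addKeys']

// Returns true when the spec lists modifiers in a different order
// than the order they actually execute in.
function modifiersOutOfOrder (spec) {
  const used = Object.keys(spec).filter(k => EXECUTION_ORDER.includes(k))
  const sorted = used.slice().sort(
    (a, b) => EXECUTION_ORDER.indexOf(a) - EXECUTION_ORDER.indexOf(b)
  )
  return used.join() !== sorted.join()
}
```

The factory could call this and emit a console warning whenever it returns true.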
Use |> instead of | to pipe reducers in TransformExpressions.
This would align with the https://github.com/tc39/proposal-pipeline-operator proposal, which would make the feature's behavior familiar to people new to the library, and it also looks beautiful if you have a code font that uses ligatures.
The feature can be backward compatible: version 1.x would support both | and |>, and the single pipe would eventually be removed in favor of the new operator.
(Screenshots: Menlo, which has no ligatures, vs. Fira Code with ligatures.)
TransformMaps, which are used by hash entities, are objects where each value is a TransformExpression.
TransformObjects, which are used by request entities, are objects where keys that start with $ refer to TransformExpressions, while other keys refer to constants.
This is inconsistent, so we should combine these types into a single ReducerObject type (this is also related to #53).
In this proposed type, keys that start with $ refer to constants, but other keys refer to TransformExpressions.
{
  'hash:example': {
    mapKeys: {
      a: 'hash:other', // resolves to the 'hash:other' entity
      $a: 'hash:other' // resolves to the string 'hash:other'
    }
  }
}
This type would be used by hash and request entities, and it would mean that addValues could be removed from the hash API, because addKeys would have the same functionality.
// in this example, addKeys and addValues are doing the same thing
{
  'hash:example': {
    addKeys: {
      $a: 'hardcoded string'
    },
    addValues: {
      a: 'hardcoded string'
    }
  }
}
The terms ComposeModifier and ComposeReducer are used inconsistently. For example, the documentation for hash uses both of them when referring to the compose property.
If a user misspells a reducer or tries to use a reducer that doesn't exist for that entity, give them a warning like:
error: maaapKeys is not a valid reducer for the hash entity, maybe try using: mapKey, addKeys....
or
error: mapKeys is not a valid reducer for collection entity, maybe try using: map reducer or hash entity
Problem description:
Schema entities do not validate the ajv schemas they're given.
Suggested solution:
When schema entities are created, they should call ajv's validateSchema method and throw with any errors that are generated.
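A sketch of the proposed check. It assumes the standard ajv API (validateSchema, errors, errorsText); the validator instance is injected here so the sketch stays self-contained.

```javascript
// Hypothetical factory-time check for schema entities.
function assertValidSchema (entityId, schema, ajv) {
  if (!ajv.validateSchema(schema)) {
    throw new Error(
      `Entity "${entityId}" has an invalid schema: ${ajv.errorsText(ajv.errors)}`
    )
  }
}

// at entity creation time, something like:
// const Ajv = require('ajv')
// assertValidSchema(spec.id, spec.schema, new Ajv())
```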
This proposes a second "compilation" step when defining entities and entity types. DataPoint currently parses top-level entities when they're defined; after this work, a second step would examine and validate the contents of each entity. Having this step would prevent some runtime errors, and it would also make #40 possible.
In the following example, dataPoint.addEntities does not cause an error, but dataPoint.transform fails at runtime because transform:two is not defined:
dataPoint.addEntities({
  'transform:one': ['transform:two']
})

dataPoint.transform('transform:one', {})
With a second compilation step, the call to addEntities would trigger a warning/error, and calling addEntityTypes would perform similar checks.
New feature that would allow the developer to inspect a request before it is sent.
dataPoint.addEntities({
  'request:<entityId>': {
    params: {
      inspect: Boolean|Function
    }
  }
})
If params.inspect is true, the entity's information will be output to the console.
If params.inspect is a function, you may provide custom debugging code to be executed before the actual request gets made. The function receives the current accumulator value as its only parameter.
How should asCollection reducers behave when the input is not an array?
Path reducers return null, but entities resolve the value as if asCollection had been false (and arguably, the entity behavior makes it possible to abuse the [] notation, because there's no drawback to putting it on every entity reference).
If a reducer expects an array but doesn't get one, a consistent approach would be to log an error but return an empty array. We could also introduce a ?[] notation that behaves the way asCollection entities behave now.
path reducer code:
https://github.com/ViacomInc/data-point/blob/master/packages/data-point/lib/resolve/reducer-path.js#L31
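The "log an error but return an empty array" approach above can be sketched like this (the function name is illustrative, not the actual reducer-path code):

```javascript
// Hypothetical consistent handling for asCollection on non-array input:
// warn and return an empty array instead of null.
function resolveAsCollection (value, mapFn) {
  if (!Array.isArray(value)) {
    console.error('asCollection reducer expected an array but got:', typeof value)
    return []
  }
  return value.map(mapFn)
}
```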
Problem description:
The functions for creating different entities are named inconsistently.
Examples
The schema function is EntitySchema:
https://github.com/ViacomInc/data-point/blob/master/packages/data-point/lib/entity-types/entity-schema/factory.js#L10
The collection function is just Collection:
https://github.com/ViacomInc/data-point/blob/master/packages/data-point/lib/entity-types/entity-collection/factory.js#L12
Suggested solution:
Rename these functions in lib/entity-types to use the Entity<EntityType> naming convention that schema uses.
Getting an error when running:
const entity = Helpers.createEntity(RenderTemplate, spec)
^
TypeError: Cannot read property 'createEntity' of undefined
Problem description:
Sometimes in reducer objects, we want to pass a value from the input object directly to the output object. Setting the value to true could be a shorthand for doing this:
const reducer = {
  a: {
    x: true, // shorthand for writing '$a.x'
    y: '$a.x',
    z: true // shorthand for writing '$a.z'
  }
}
const input = {
  a: {
    x: 'X',
    y: 'Y',
    z: 'Z'
  }
}

dataPoint.transform(reducer, input)

// result
{
  a: {
    x: 'X',
    y: 'X',
    z: 'Z'
  }
}
This will be easier to read, especially with deeply nested properties.
When cloning and setting up this project, running npm install with node v8.7.0 and npm v5.4.0+ will cause a number of issues:
If yarn install is run instead, neither of the above issues is present. Maybe the installation instructions in CONTRIBUTING.md can be copied to the main README to make it easier to notice that yarn is strongly preferred?
Problem description:
It's hard to trace the functions that create and resolve entities.
The main resolve function lives in lib/entity-types/resolve-entity/resolve-entity.js, but the createEntity function lives in lib/helpers/helpers.js, which is not the first place you might look for such a crucial function:
https://github.com/ViacomInc/data-point/blob/master/packages/data-point/lib/helpers/helpers.js#L53
Suggested solution:
Move this code to a single entity-types/base-entity directory.
Change Entity Source to Entity Request
This change is long overdue, and the refactor is important since it makes the entity more descriptive. The change should keep full backward compatibility with the source entity; we can remove the source entity in the next major version.
When assigning an invalid TransformExpression, the error does not give enough information to know which entity has the bad transform.
The current error message is:
Error: Transform [object Object] is not valid type (Array|String|Function)
The $. path reducer is too easily confused with $.., and it's redundant because $ does the same thing. It also follows a different pattern than $ and $..:
'$..a' accesses acc['a']
'$' accesses acc.value
'$.' accesses acc.value
'$a' accesses acc.value['a']
'$.a' accesses acc.value['.a'] (it includes the . in the path, which is not intuitive because $ and $. are equivalent by themselves)
Problem description:
lib/transform-expression/factory.js and lib/reducer/factory.js both validate the input for creating reducers, but that's redundant because input validated in the first file will later be validated in the second file. The logic in reducer/factory.js is also better because it uses the isType methods for each reducer type:
https://github.com/ViacomInc/data-point/blob/master/packages/data-point/lib/reducer/factory.js#L18
Suggested solution:
Remove the validate and isValid functions from lib/transform-expression/factory.js, but add the line "try using an Array, String, Object, or Function" to the error in lib/reducer/factory.js: