
dataset's People

Contributors

alexgraul, iros, jugglinmike, makoto, rich-harris, rwaldron, tbranyen


dataset's Issues

Pivoting / normalizing a dataset

Takes a dataset like:

State  2001  2002
AZ     2     4
AL     6     7

And normalizes it like so:

ds.rowify({ "year" : ["2001", "2002"] });

State  Year  Value
AZ     2001  2
AZ     2002  4
AL     2001  6
AL     2002  7
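
A minimal sketch of the normalization described above, written as a standalone function over plain row objects. The function signature, the `valueName` parameter, and the lowercase `year` / `value` output keys are assumptions for illustration; this is not the library's actual rowify implementation.

```javascript
// Hypothetical helper: folds the listed columns into (key, value) pairs.
// rows: array of plain objects; spec: { newColumnName: [columns to fold] }.
function rowify(rows, spec, valueName) {
  var keyName = Object.keys(spec)[0];   // e.g. "year"
  var folded = spec[keyName];           // e.g. ["2001", "2002"]
  valueName = valueName || "value";     // assumed default name
  var out = [];
  rows.forEach(function (row) {
    folded.forEach(function (col) {
      var normalized = {};
      // Copy every column that is not being folded (here: State).
      Object.keys(row).forEach(function (name) {
        if (folded.indexOf(name) === -1) {
          normalized[name] = row[name];
        }
      });
      normalized[keyName] = col;
      normalized[valueName] = row[col];
      out.push(normalized);
    });
  });
  return out;
}

var wide = [
  { State: "AZ", "2001": 2, "2002": 4 },
  { State: "AL", "2001": 6, "2002": 7 }
];
var long = rowify(wide, { year: ["2001", "2002"] });
// long has four rows, one per (State, year) pair.
```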

Enumification

When there are repetitive string values in a column, replace them with an enum that maps to a lookup table.
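
One way this could work, sketched on a plain array of column values. The `enumify` name and the `{ codes, lookup }` return shape are assumptions, not an existing API.

```javascript
// Hypothetical helper: replaces repeated strings with integer codes and
// returns the lookup table needed to map the codes back.
function enumify(values) {
  var lookup = [];                    // code -> original string
  var codeOf = Object.create(null);   // original string -> code
  var codes = values.map(function (v) {
    if (codeOf[v] === undefined) {
      codeOf[v] = lookup.length;
      lookup.push(v);
    }
    return codeOf[v];
  });
  return { codes: codes, lookup: lookup };
}

var encoded = enumify(["AZ", "AL", "AZ", "AZ", "AL"]);
// encoded.codes  -> [0, 1, 0, 0, 1]
// encoded.lookup -> ["AZ", "AL"]

// Decoding is a plain index into the lookup table.
var decoded = encoded.codes.map(function (c) { return encoded.lookup[c]; });
```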

Set should not be taking which rows to edit

It should be called as a .set on a dataset, and ALL rows in that dataset should be set. This is because we would probably first filter a dataset down and then set a value. For example:

dataset.where({ budget : function(value) {
  return value > 100000;
}}).set({ huge : true });

Sync :: Investigate pulling synchronization into a mixin

At the moment we are passing in { sync : true } to various methods to allow one to subscribe to changes on them. This is currently somewhat inconsistent (you can do this on products but not on direct value extractions) and somewhat bulky when you don't want to keep any updates on your data (which, unless you have a live data feed, you don't).

Look into pulling out the sync behavior into a separate mixin that will enable binding to various dataset extraction methods.

Add a way of creating groupings

Allow for a way to take our normalized & disaggregated data and turn it into subset groupings.

For example, in a dataset like:

Race  Gender  Wage
W F 100
W M 200
B F 100
B M 200

The following aggregates will be created:

Race  Gender  Wage
W F 100
W M 200

Race  Gender  Wage
B F 100
B M 200

Race  Gender  Wage
B F 100
W F 100

Race  Gender  Wage
B M 200
W M 200

Race  Gender  Wage
* F 200

Race  Gender  Wage
* M 400

Race  Gender  Wage
W * 300

Race  Gender  Wage
B * 300

Race  Gender  Wage
* * 600

Potential syntax:

ds.groupings(
  [
    {name: 'Race', aggregate : true},  
    {name: 'Gender', aggregate : true} 
  ]
);

OR

ds.groupings(['Race', 'Gender'],  {aggregate : true});
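
As a rough sketch of the aggregated rows in the example above (the ones containing `*`), the function below wildcards every non-empty combination of the listed columns and sums the value column. The `aggregateGroupings` name, its arguments, and the choice of summing are assumptions for illustration; the plain per-value subsets (all W rows, all F rows, and so on) are not produced here.

```javascript
// Hypothetical helper operating on plain row objects.
function aggregateGroupings(rows, columns, valueColumn) {
  var results = [];
  // Each bit of mask marks one column as aggregated ("*"). Mask 0 would
  // just reproduce the input rows, so start at 1.
  for (var mask = 1; mask < (1 << columns.length); mask++) {
    var totals = Object.create(null);   // group key -> aggregated row
    var order = [];                     // keys in first-seen order
    rows.forEach(function (row) {
      var out = {};
      columns.forEach(function (name, i) {
        out[name] = (mask & (1 << i)) ? "*" : row[name];
      });
      var key = columns.map(function (name) { return out[name]; }).join("|");
      if (!totals[key]) {
        out[valueColumn] = 0;
        totals[key] = out;
        order.push(key);
      }
      totals[key][valueColumn] += row[valueColumn];
    });
    order.forEach(function (key) { results.push(totals[key]); });
  }
  return results;
}

var wages = [
  { Race: "W", Gender: "F", Wage: 100 },
  { Race: "W", Gender: "M", Wage: 200 },
  { Race: "B", Gender: "F", Wage: 100 },
  { Race: "B", Gender: "M", Wage: 200 }
];
var aggregates = aggregateGroupings(wages, ["Race", "Gender"], "Wage");
// aggregates, in order:
//   * F 200, * M 400, W * 300, B * 300, * * 600
```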

Add subset functionality

On initialization of dataset, specify custom subsets that are cached by subset name + passed arguments:

var data = new Dataset({
  url : "http://bla.json",
  subsets : {
    a_state : function(state) {
      return this.filter(function(row) {
        return row.get("state") === state;
      });
    },
    a_state_by_hour : function(state) {
      return this.subset("a_state", state).groupBy("hour");
    },
    a_state_by_type : function(state) {
      return this.subset("a_state", state).groupBy("type");
    }
  }
});
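
The caching by subset name + passed arguments could look roughly like the sketch below. `SubsetCache`, its constructor arguments, and the use of `JSON.stringify` to build the cache key are all assumptions; the key scheme only works for arguments that serialize cleanly (strings, numbers, plain objects).

```javascript
// Hypothetical stand-in for a dataset that caches named subsets.
function SubsetCache(rows, subsets) {
  this.rows = rows;
  this.subsets = subsets;   // subset name -> function computing it
  this.cache = {};          // "name:[args]" -> computed result
}

SubsetCache.prototype.subset = function (name) {
  var args = Array.prototype.slice.call(arguments, 1);
  var key = name + ":" + JSON.stringify(args);
  if (!Object.prototype.hasOwnProperty.call(this.cache, key)) {
    this.cache[key] = this.subsets[name].apply(this, args);
  }
  return this.cache[key];
};

var calls = 0;   // counts how often the subset function really runs
var cached = new SubsetCache(
  [
    { state: "AZ", hour: 1 },
    { state: "AL", hour: 2 },
    { state: "AZ", hour: 3 }
  ],
  {
    a_state: function (state) {
      calls += 1;
      return this.rows.filter(function (row) { return row.state === state; });
    }
  }
);

var first = cached.subset("a_state", "AZ");
var second = cached.subset("a_state", "AZ");   // served from the cache
```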

Add type support (initially for Time)

Add support for explicitly setting a column type on import and coercing rows to that type. Add support for a time type whose values are coerced into moment.js objects.

Add where selection

// to get all rows where a column value == value;
dataset.where({
  propName : "value" 
});

// OR
// to get all rows where a column value passes some condition
dataset.where({
  propName : function(value) { return value > 5; }
});

// You can do this with multiple properties; the conditions are combined with AND
dataset.where({
  propName : function(value) { return value > 5; },
  otherPropName : function(value) { return value < 5; }
});

// you can override the default AND by passing a different operand in options
dataset.where({
  propName : function(value) { return value > 5; },
  otherPropName : function(value) { return value < 5; }
}, { operand: function(res1, res2) {
  return res1 || res2;
}});
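
To make the proposed semantics concrete, here is a small sketch of the selection logic as a standalone function over plain row objects. The `where(rows, conditions, options)` signature is an assumption, and the sketch compares literal values with `===` rather than `==`.

```javascript
// Hypothetical helper: keeps the rows whose columns satisfy the conditions.
function where(rows, conditions, options) {
  // Default operand is AND; options.operand can replace it.
  var operand = (options && options.operand) ||
    function (a, b) { return a && b; };
  var names = Object.keys(conditions);
  if (names.length === 0) {
    return rows.slice();   // no conditions: every row matches
  }
  return rows.filter(function (row) {
    var results = names.map(function (name) {
      var test = conditions[name];
      return typeof test === "function" ? test(row[name]) : row[name] === test;
    });
    // Fold the per-column results together with the operand.
    return results.reduce(function (a, b) { return operand(a, b); });
  });
}

var sample = [
  { a: 6, b: 1 },
  { a: 6, b: 9 },
  { a: 2, b: 9 },
  { a: 2, b: 1 }
];
var conditions = {
  a: function (value) { return value > 5; },
  b: function (value) { return value < 5; }
};

var andRows = where(sample, conditions);
// andRows -> [{ a: 6, b: 1 }]

var orRows = where(sample, conditions, {
  operand: function (res1, res2) { return res1 || res2; }
});
// orRows -> every row except { a: 2, b: 9 }
```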

Add groupBy support

Allows one to group the values in a column based on duplicate values in another. For example:

State Val1  Val2  
AZ  5 6
AZ  10  10

Calling the following:

dataset.groupBy("State", ["Val1"])

would return:

State Val1
AZ 15
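
A minimal sketch of that behavior on plain row objects, assuming the grouped values are combined by summing (which matches 5 + 10 = 15 in the example). The standalone `groupBy(rows, byColumn, sumColumns)` signature is hypothetical.

```javascript
// Hypothetical helper: one output row per distinct value of byColumn,
// with each column in sumColumns added up across the group.
function groupBy(rows, byColumn, sumColumns) {
  var groups = Object.create(null);   // group value -> output row
  var order = [];                     // group values in first-seen order
  rows.forEach(function (row) {
    var key = row[byColumn];
    if (!groups[key]) {
      var start = {};
      start[byColumn] = key;
      sumColumns.forEach(function (name) { start[name] = 0; });
      groups[key] = start;
      order.push(key);
    }
    sumColumns.forEach(function (name) {
      groups[key][name] += row[name];
    });
  });
  return order.map(function (key) { return groups[key]; });
}

var grouped = groupBy(
  [
    { State: "AZ", Val1: 5, Val2: 6 },
    { State: "AZ", Val1: 10, Val2: 10 }
  ],
  "State",
  ["Val1"]
);
// grouped -> [{ State: "AZ", Val1: 15 }]
```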

add an addColumn method

// adds empty column
dataset.addColumn("name");

// adds a column with data
dataset.addColumn("name", [1,2,3,4,5]);

// another way to add data to a column
dataset.addColumn("name").each(function(row) {
  row.name = 12;
});

addColumn should return a reference to the dataset so that calls can be chained further.

Objectify numeric value returns

Previously calling

dataset.max('budget') 

returned a number.

Now any numeric derived values should return an object that has two methods:

.val()
.bind()

.bind() will still bind to all rows of the original dataset that are relevant to the computation.
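
The shape of such a return value might look like the sketch below. `ObservableRows`, its `set(index, column, value)` method, and the callback-style `bind` are assumptions made for illustration. As a simplification, the sketch notifies on any change to the dataset rather than only on changes to rows relevant to the computation.

```javascript
// Hypothetical minimal dataset that notifies listeners when a cell changes.
function ObservableRows(rows) {
  this.rows = rows;
  this.listeners = [];
}

ObservableRows.prototype.set = function (index, column, value) {
  this.rows[index][column] = value;
  this.listeners.forEach(function (fn) { fn(); });
};

// Returns an object with val() and bind() instead of a bare number.
ObservableRows.prototype.max = function (column) {
  var self = this;
  function compute() {
    return Math.max.apply(null, self.rows.map(function (row) {
      return row[column];
    }));
  }
  return {
    val: compute,
    bind: function (callback) {
      // Recompute and report the new value after every change.
      self.listeners.push(function () { callback(compute()); });
    }
  };
};

var budgets = new ObservableRows([{ budget: 10 }, { budget: 50 }]);
var maxBudget = budgets.max("budget");
var before = maxBudget.val();   // 50

var seen = [];
maxBudget.bind(function (value) { seen.push(value); });
budgets.set(0, "budget", 90);   // seen is now [90]
```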

add cached view support

We noticed that we often need a particular subset of the data for a component (like all rows pertinent to one state from a dataset that has data for all the states). Because a component SHOULD be agnostic to its context, it SHOULDN'T be responsible for FIRST filtering the data and THEN extracting the values it needs.

Add the following functionality:

  1. Define views on a dataset
var ds = new Dataset({ 
  views : {
    city : function(city) {
      return this.where({ city : city });
    }
  }
});

Also, add a way to tell the parent dataset to update its view. It should take in the params required by the view function.

ds.updateView('city', cityname);
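
A rough sketch of how view definitions and updateView could fit together. `ViewedDataset`, the `rows` option, the equality-only `where`, and the `views` property holding the latest result are all assumptions for illustration.

```javascript
// Hypothetical stand-in for a dataset with named, cached views.
function ViewedDataset(options) {
  this.rows = options.rows;        // assumed: data passed in directly
  this.viewDefs = options.views;   // view name -> function computing it
  this.views = {};                 // view name -> most recent result
}

// Simplified where(): exact-match conditions only.
ViewedDataset.prototype.where = function (match) {
  var names = Object.keys(match);
  return this.rows.filter(function (row) {
    return names.every(function (name) { return row[name] === match[name]; });
  });
};

// Recomputes the named view with the params its function requires.
ViewedDataset.prototype.updateView = function (name) {
  var args = Array.prototype.slice.call(arguments, 1);
  this.views[name] = this.viewDefs[name].apply(this, args);
  return this.views[name];
};

var cityData = new ViewedDataset({
  rows: [
    { city: "Boston", value: 1 },
    { city: "Austin", value: 2 },
    { city: "Boston", value: 3 }
  ],
  views: {
    city: function (city) {
      return this.where({ city: city });
    }
  }
});

cityData.updateView("city", "Boston");
var bostonCount = cityData.views.city.length;   // 2
cityData.updateView("city", "Austin");
var austinCount = cityData.views.city.length;   // 1
```

A component can then read `cityData.views.city` without knowing which city was selected.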
