yocto-db's Introduction

yocto-db

A data-stream management system.

Build & Run

rebar clean compile
erl -pz ebin deps/*/ebin

Test

rebar eunit

Documentation

rebar doc

Supervisor Hierarchy

+ydb_sup
|
+---+ydb_input_stream_sup
    |
    +---+ydb_input_stream
    |   |
    |   +---+ydb_{file,socket}_input
    |       |
    |       +ydb_branch_node
    |
    +ydb_query_sup
    |
    +---+ydb_query

The ydb_input_stream_sup module supports registering input streams either from a file or a socket. The input stream process also spawns a branch node process and adds it as a subscriber.

IO Definitions

File Input

{ydb_file_input,
    child_node
  , filename :: string()
  , batch_size :: integer()
  , poke_freq :: integer()
}

File Output

{ydb_file_output, child_node, filename :: string()}

Socket Input

{ydb_socket_input,
    child_node
  , port_no :: integer()
  , socket :: port()
  , acceptor :: pid()
}

Socket Output

{ydb_socket_output,
    child_node
  , port_no :: integer()
  , socket :: port()
  , address :: term()
}

Query Definitions

Register a timer that pokes the intermediary plan nodes at each tick, and store the timer reference in the supervisor state. This keeps the plan nodes up to date on the current time so they know how to store partial results.
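As a rough illustration of the periodic poke, the sketch below uses the standard timer module to send a message to a list of plan node processes at a fixed interval. The module name ydb_tick_sketch, the {tick, Now} message shape, and the idea of passing the node pids as a list are all assumptions made for this example; they are not taken from the yocto-db source.

```erlang
-module(ydb_tick_sketch).
-export([start/2, stop/1, poke_all/1]).

%% Starts a repeating timer and returns its reference so the caller
%% (for example a supervisor-like process) can keep it in its state.
start(IntervalMs, NodePids) ->
    {ok, TRef} = timer:apply_interval(IntervalMs, ?MODULE, poke_all, [NodePids]),
    TRef.

%% Cancels a timer previously returned by start/2.
stop(TRef) ->
    {ok, cancel} = timer:cancel(TRef),
    ok.

%% Sends the current wall-clock time (in milliseconds) to every node.
%% The {tick, Now} message format is hypothetical.
poke_all(NodePids) ->
    Now = erlang:system_time(millisecond),
    lists:foreach(fun(Pid) -> Pid ! {tick, Now} end, NodePids),
    ok.
```

Keeping the returned reference is what makes it possible to cancel the timer later, for instance when the query is torn down.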

Select

{ydb_select, child_node, predicate :: ydb_clause()}

Project

The column specification is either just the column name atom() or a tuple containing the current column name and the new name for the column {atom(), atom()}.

{ydb_project, child_node, columns :: [atom() | {atom(), atom()}]}
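To make the two column specification forms concrete, here is a minimal sketch of how a projection could be applied to a single row. It assumes a row is a list of {ColumnName, Value} pairs, which is a simplification chosen for this example and not the actual tuple layout used by yocto-db; the module name is likewise hypothetical.

```erlang
-module(ydb_project_sketch).
-export([project/2]).

%% Applies a list of column specifications to one row.
%% Row is assumed to be a list of {ColumnName, Value} pairs.
project(Columns, Row) ->
    [project_col(Col, Row) || Col <- Columns].

%% {Old, New}: keep the value of column Old under the name New.
project_col({Old, New}, Row) ->
    {Old, Value} = lists:keyfind(Old, 1, Row),
    {New, Value};
%% A bare atom: keep the column under its current name.
project_col(Col, Row) when is_atom(Col) ->
    {Col, _Value} = lists:keyfind(Col, 1, Row).
```

With this sketch, projecting [name, {age, years}] over [{name, "a"}, {age, 3}, {x, 1}] keeps name as is, renames age to years, and drops x. A column that is missing from the row fails with a badmatch error.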

Joins

{ydb_join_node, left_node, right_node, predicate :: ydb_clause()}

Set Union

{ydb_union_node, left_node, right_node}

Set Difference

{ydb_diff_node, left_node, right_node}

Min

{ydb_min, child_node, column :: atom() | {atom(), atom()}}

Max

{ydb_max, child_node, column :: atom() | {atom(), atom()}}

Sum

{ydb_sum, child_node, column :: atom() | {atom(), atom()}}

Count

{ydb_count, child_node, column :: atom() | {atom(), atom()}}

Average

{ydb_avg, child_node, column :: atom() | {atom(), atom()}}

Variance

{ydb_var, child_node, column :: atom() | {atom(), atom()}}

Standard Deviation

{ydb_stddev, child_node, column :: atom() | {atom(), atom()}}

Predicate Format

Boolean Operators

{ydb_and, clauses :: [ydb_clause()]}
{ydb_or, clauses :: [ydb_clause()]}
{ydb_not, clause :: ydb_clause()}

ydb_clause() :: #ydb_cv{} | #ydb_cc{} | #ydb_and{} | #ydb_or{} | #ydb_not{}

Comparison Operators

Comparing a column to a value.

{ydb_cv,
    column :: atom()
  , operator :: compare()
  , value :: term()
}

Comparing two columns together.

{ydb_cc,
    left_col :: atom()
  , operator :: compare()
  , right_col :: atom()
}

Comparison operators.

compare() :: 'gt'  | 'lt'  | 'eq'
           | 'lte' | 'gte' | 'ne'
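The predicate format above can be read as a small expression tree. The sketch below evaluates such a tree against one row. The record fields follow the definitions above, but the module name, the eval/2 function, and the representation of a row as a map from column name to value are assumptions made for illustration only.

```erlang
-module(ydb_clause_sketch).
-export([eval/2, example/0]).

%% Record shapes follow the definitions in this section.
-record(ydb_cv, {column :: atom(), operator :: atom(), value :: term()}).
-record(ydb_cc, {left_col :: atom(), operator :: atom(), right_col :: atom()}).
-record(ydb_and, {clauses = [] :: list()}).
-record(ydb_or, {clauses = [] :: list()}).
-record(ydb_not, {clause :: term()}).

%% Row is assumed to be a map from column name to value.
eval(#ydb_cv{column = C, operator = Op, value = V}, Row) ->
    compare(Op, maps:get(C, Row), V);
eval(#ydb_cc{left_col = L, operator = Op, right_col = R}, Row) ->
    compare(Op, maps:get(L, Row), maps:get(R, Row));
eval(#ydb_and{clauses = Cs}, Row) ->
    lists:all(fun(C) -> eval(C, Row) end, Cs);
eval(#ydb_or{clauses = Cs}, Row) ->
    lists:any(fun(C) -> eval(C, Row) end, Cs);
eval(#ydb_not{clause = C}, Row) ->
    not eval(C, Row).

%% One clause per compare() atom.
compare(gt, A, B) -> A > B;
compare(lt, A, B) -> A < B;
compare(eq, A, B) -> A == B;
compare(lte, A, B) -> A =< B;
compare(gte, A, B) -> A >= B;
compare(ne, A, B) -> A /= B.

%% age >= 18 AND NOT (low > high), evaluated on a sample row.
example() ->
    Clause = #ydb_and{clauses = [
        #ydb_cv{column = age, operator = gte, value = 18},
        #ydb_not{clause = #ydb_cc{left_col = low, operator = gt, right_col = high}}
    ]},
    eval(Clause, #{age => 21, low => 1, high => 5}).
```

In this sketch an empty ydb_and is true and an empty ydb_or is false, which follows from lists:all/2 and lists:any/2; whether yocto-db treats empty clause lists the same way is not stated in the source.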

Table Formats

Rows in ETS tables are represented as tuples. All the rows in a given ETS table should be in the same format, which will be one of the following:

Relation Table

ydb_rel_tuple() :: {
    {'row_num', RowNum :: non_neg_integer()}
  , Tuple :: ydb_tuple()
}.

RowNum serves as a unique id for each row.

Synopsis Table

ydb_syn_tuple() :: {
    {Op :: atom(), Timestamp :: non_neg_integer()}
  , Tuple :: ydb_tuple()
}.

Op is the name of the aggregate (e.g. 'sum' or 'count').

Diff Table

ydb_diff_tuple() :: {
    {Diff :: diff(), Op :: atom(), Timestamp :: non_neg_integer()}
  , Tuple :: ydb_tuple()
}.

Op is the name of the aggregate, as above (e.g. 'sum' or 'count'). Diff indicates whether the tuple is to be inserted or deleted.

diff() :: '+' | '-'.
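To show how diff tuples might be consumed, the sketch below folds a list of them over a relation. For simplicity the relation is modelled as a plain list of tuples, not an ETS table, and the module and function names are hypothetical.

```erlang
-module(ydb_diff_sketch).
-export([apply_diffs/2]).

%% Applies diff tuples, in order, to a relation modelled as a list.
%% Each diff has the shape {{Diff, Op, Timestamp}, Tuple}.
apply_diffs(Diffs, Relation) ->
    lists:foldl(fun apply_diff/2, Relation, Diffs).

%% '+' inserts the tuple; '-' removes one matching occurrence.
apply_diff({{'+', _Op, _Ts}, Tuple}, Rel) -> [Tuple | Rel];
apply_diff({{'-', _Op, _Ts}, Tuple}, Rel) -> lists:delete(Tuple, Rel).
```

Deleting a tuple that is not present leaves the list unchanged here; a real implementation may prefer to treat that as an error.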

yocto-db's People

Contributors

anjoola, ksuraesh

yocto-db's Issues

Avoid nonsense with forward/backward joins

Keep track of the two relations and perform the following operations.

  1. Perform a join between the received diff and the stored relation (yielding a diff)
  2. Emit this diff to all subscribers
  3. Apply the received diff to the (other) stored relation
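The three steps above can be sketched as a single function that takes the side the diff arrived on, the diff itself, a join predicate, and the pair of stored relations. This is one possible reading of the issue, under several assumptions: diffs are simplified to {'+' | '-', Tuple}, relations are plain lists, the predicate is a two-argument fun, and emitting to subscribers is left to the caller. None of these names come from the yocto-db source.

```erlang
-module(ydb_join_sketch).
-export([handle_diff/4]).

%% Returns {OutDiffs, NewState}, where State is {LeftRel, RightRel}.
%% Step 1: join the received diff with the opposite stored relation.
%% Step 2: the caller emits OutDiffs to subscribers.
%% Step 3: apply the received diff to the relation on its own side.
handle_diff(left, Diffs, Pred, {Left, Right}) ->
    Out = [{Sign, {L, R}} || {Sign, L} <- Diffs, R <- Right, Pred(L, R)],
    {Out, {apply_diffs(Diffs, Left), Right}};
handle_diff(right, Diffs, Pred, {Left, Right}) ->
    Out = [{Sign, {L, R}} || {Sign, R} <- Diffs, L <- Left, Pred(L, R)],
    {Out, {Left, apply_diffs(Diffs, Right)}}.

%% '+' inserts a tuple; '-' removes one matching occurrence.
apply_diffs(Diffs, Rel) ->
    lists:foldl(fun({'+', T}, Acc) -> [T | Acc];
                   ({'-', T}, Acc) -> lists:delete(T, Acc)
                end, Rel, Diffs).
```

Because the output diff is computed before the stored relation is updated, each matching pair is produced once, when its second member arrives.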

Add support for state (synopsis) recovery

Consider having another worker for each query that is responsible for inheriting the ETS table upon the death of another process, and then returns it once that process has been restarted.
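Erlang's ETS already provides the building blocks for this idea: the heir table option and ets:give_away/3. The sketch below shows a keeper process that inherits a table when its owner dies and hands it back on request. The module name, the {restarted, NewOwner} message, and the recovered gift data are hypothetical; only the ets calls and the 'ETS-TRANSFER' message are standard.

```erlang
-module(ydb_heir_sketch).
-export([new_table/2, keeper_loop/0]).

%% Creates a table whose ownership passes to KeeperPid if the owner dies.
new_table(Name, KeeperPid) ->
    ets:new(Name, [set, public, {heir, KeeperPid, Name}]).

%% Waits to inherit a table. ETS sends 'ETS-TRANSFER' to the heir
%% when the owning process terminates.
keeper_loop() ->
    receive
        {'ETS-TRANSFER', Tab, _FromPid, _HeirData} ->
            hold(Tab)
    end.

%% Holds the table until told which process should own it again.
%% The {restarted, NewOwner} message is an assumption of this sketch.
hold(Tab) ->
    receive
        {restarted, NewOwner} ->
            true = ets:give_away(Tab, NewOwner, recovered),
            keeper_loop()
    end.
```

The new owner receives {'ETS-TRANSFER', Tab, KeeperPid, recovered} and can continue with the table contents intact. This keeper tracks one table at a time; handling several queries would need a keeper per query or extra bookkeeping.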
