terrier's Introduction

terrier

Frequently Asked Questions

  1. Is this Carnegie Mellon's new database system project that is replacing Peloton?

    Yes.

  2. Is the new name of the DBMS "terrier"?

    No. We have not announced the new name yet but it will not be "terrier". That is the name of Andy's dog.

  3. When will you announce the new system?

    We hope to have our first release in 2020.

  4. Will the new system still be "self-driving"?

    Yes. Our goal is for the new system to support autonomous operation and optimization as a first-class design principle.

  5. Will the new system still be PostgreSQL compatible?

    Yes. The DBMS supports the PostgreSQL network protocol (both simple and extended) and emulates PostgreSQL's catalog layout. We are working on support for PL/pgSQL UDFs.

  6. How can I get involved?

    See the New Student Guide. If you are a current student at CMU, then you should consider enrolling in one of the database courses. Non-CMU students are also welcome to contribute.

terrier's People

Contributors

mbutrovich, tli2, mush-zhang, rickyyx, shengxu1, lmwnshn, apavlo, pervazea, gustavoangulo, jrolli, amlatyrngom, crd477, tanujnay112, ksaito7, 17zhangw, wuwenw, tpan496, harsh141994, wenxuanqiu, ghatage, darkforte, thepinetree, gonzalezjo, yeshengm, db-ol, yuzeliao, swimj, venkatdatta, portablesounds, linmagit

terrier's Issues

Alter SQL Support

TODOs:

  • At TrafficCop::BindQuery:
    • Implement the binder logic at: BindNodeVisitor::Visit(parser::AlterTableStatement)
  • At TrafficCop::OptimizeBoundQuery:
    • Implement the operator transformer at: QueryToOperatorTransformer::Visit(AlterTableStatement)
    • Implement the logical operator for AlterTable at: logical_operator.h
    • Implement the logical-to-physical rules at: rule.cpp and implementation_rules.cpp
    • To generate a physical plan, implement the physical operator for AlterTable at: physical_operator.h
    • Implement the plan node for AlterTable at: alter_table_plan_node.h?

    Should we break up AlterTable by command, so that each AlterCmd gets its own (sub) plan node? Keep in mind that there are a great many possible commands.

    • Implement PlanGenerator::Visit for AlterTable physical plan operator at plan_generator.cpp
  • At TrafficCop::ExecutePortal:
    • TrafficCop needs to execute the physical plan for an AlterTable query, so implement something like ExecuteAlterStatement at postgres_network_commands.cpp:ExecutePortal
    • Implement the executors at DDLExecutors, AlterTableExecutor

Implement Add Column through the entire pipeline first, then refactor from there.
Changing a column's type might require scanning the whole table, so it should be deferred.
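
One possible answer to the question above about splitting AlterTable by command is to keep a single plan node per statement and store the individual commands as a list of small, typed structs. The sketch below is only an illustration of that shape: AddColumnCmd, DropColumnCmd, AlterTablePlanSketch and ExecuteAlterSketch are hypothetical names, not the project's classes, and the "table" is reduced to a list of column names.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <variant>
#include <vector>

// Hypothetical command types; a real AlterCmd set would be much larger.
struct AddColumnCmd {
  std::string name;
  std::string type;
};
struct DropColumnCmd {
  std::string name;
};
using AlterCmdSketch = std::variant<AddColumnCmd, DropColumnCmd>;

// Hypothetical plan node: one node per statement, one entry per command.
struct AlterTablePlanSketch {
  std::string table_name;
  std::vector<AlterCmdSketch> cmds;
};

// Stand-in executor: applies each command, in order, to a list of column names.
std::vector<std::string> ExecuteAlterSketch(std::vector<std::string> columns,
                                            const AlterTablePlanSketch &plan) {
  for (const auto &cmd : plan.cmds) {
    if (const auto *add = std::get_if<AddColumnCmd>(&cmd)) {
      // Adding a column that already exists is treated as a programming error here.
      assert(std::find(columns.begin(), columns.end(), add->name) == columns.end());
      columns.push_back(add->name);
    } else if (const auto *drop = std::get_if<DropColumnCmd>(&cmd)) {
      columns.erase(std::remove(columns.begin(), columns.end(), drop->name), columns.end());
    }
  }
  return columns;
}
```

With this layout, starting with Add Column only means starting with a single variant alternative; further commands become new alternatives rather than new plan node classes.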

Design dump

Data structures and high level design

Catalog

  • The catalog will have a layout_version column for each table, which each transaction queries before calling the SqlTable API

SqlTable

  • SqlTable will now have multiple DataTables, with a new DataTable created on each schema update.
  • Those DataTables will be ordered by schema version.
  • The SqlTable API will now take an additional layout_version field that is used to find the corresponding DataTable.
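
A minimal sketch of this arrangement, assuming an ordered map from layout version to DataTable. SqlTableSketch, DataTableSketch and their methods are hypothetical stand-ins and do not reflect the real storage-layer interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using layout_version_sketch_t = uint32_t;

// Hypothetical stand-in for a DataTable: just remembers its layout.
struct DataTableSketch {
  layout_version_sketch_t version;
  std::vector<std::string> columns;
};

class SqlTableSketch {
 public:
  explicit SqlTableSketch(std::vector<std::string> initial_columns) {
    AddVersion(0, std::move(initial_columns));
  }

  // A schema update appends a new DataTable; existing ones are left untouched.
  void AddVersion(layout_version_sketch_t version, std::vector<std::string> columns) {
    assert(tables_.empty() || version > tables_.rbegin()->first);
    tables_.emplace(version, std::make_unique<DataTableSketch>(
                                 DataTableSketch{version, std::move(columns)}));
  }

  // Callers pass the layout_version they read from the catalog.
  const DataTableSketch *GetDataTable(layout_version_sketch_t version) const {
    auto it = tables_.find(version);
    return it == tables_.end() ? nullptr : it->second.get();
  }

  std::size_t NumVersions() const { return tables_.size(); }

 private:
  // std::map keeps the DataTables ordered by schema version.
  std::map<layout_version_sketch_t, std::unique_ptr<DataTableSketch>> tables_;
};
```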

DataTable

  • DataTable will have mappings and reverse mappings between col_id and col_oid for schema version alignment
  • DataTable will hold a reference to its SqlTable so that SlotIterator knows which DataTable to iterate through next at a table boundary (since tuples may now live in another DataTable)
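
The two-way mapping could look roughly like the sketch below. It assumes col_oid is the catalog-level column identifier that stays stable across schema versions, while col_id is the column's position within one DataTable's layout; ColumnMapSketch and the integer widths are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

using col_oid_sketch_t = uint32_t;  // assumed: stable, catalog-level identifier
using col_id_sketch_t = uint16_t;   // assumed: position within one layout

class ColumnMapSketch {
 public:
  void Add(col_oid_sketch_t oid, col_id_sketch_t id) {
    // Each oid and each id may appear only once per layout.
    assert(oid_to_id_.count(oid) == 0 && id_to_oid_.count(id) == 0);
    oid_to_id_[oid] = id;
    id_to_oid_[id] = oid;
  }
  bool HasOid(col_oid_sketch_t oid) const { return oid_to_id_.count(oid) != 0; }
  col_id_sketch_t IdOf(col_oid_sketch_t oid) const { return oid_to_id_.at(oid); }
  col_oid_sketch_t OidOf(col_id_sketch_t id) const { return id_to_oid_.at(id); }

 private:
  std::unordered_map<col_oid_sketch_t, col_id_sketch_t> oid_to_id_;
  std::unordered_map<col_id_sketch_t, col_oid_sketch_t> id_to_oid_;
};
```

Aligning a column between two versions is then a lookup through the oid: translate the old col_id to its col_oid, and ask the new version's map for the col_id of that oid, if it still exists.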

Update the schema

  • Since we only support safe schema updates, e.g., Add / Drop Column, a schema-change txn will not conflict with a non-schema-change txn.
  • Conflicts between concurrent schema-change txns (write-write) will be detected and synchronized at the catalog table.
  • Updating the schema only involves adding a new DataTable to the SqlTable, along with the corresponding metadata (e.g., col_id maps)
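
To illustrate the write-write conflict on the catalog, the sketch below swaps in a plain atomic compare-and-swap on a version counter. This is a stand-in technique chosen for brevity, not the catalog's transactional mechanism, and CatalogEntrySketch and TryBumpLayoutVersion are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the catalog row that stores a table's layout_version.
struct CatalogEntrySketch {
  std::atomic<uint32_t> layout_version{0};
};

// A schema-change txn tries to move the version from the value it read to the
// next one. Returns false if another schema change got there first, in which
// case the caller would abort.
bool TryBumpLayoutVersion(CatalogEntrySketch *entry, uint32_t version_seen) {
  assert(entry != nullptr);
  uint32_t expected = version_seen;
  return entry->layout_version.compare_exchange_strong(expected, version_seen + 1);
}
```

Of two schema changes that both read version 0, only the first succeeds; the second observes that the version has moved and fails.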

Reading a tuple from SqlTable

  • If the tuple is in the correct layout_version, simply select the tuple and return
  • If the tuple is not in the correct layout_version, one can find the correct DataTable from the tupleslot, read the stored tuple using the intersection of the old and new schemas, and fill in default values for the columns that exist only in the new schema.
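
The cross-version read can be sketched as a projection keyed by col_oid. The types below are hypothetical simplifications: every value is an int64_t, a stored tuple is a map from col_oid to value, and each column of the requested schema carries its own default.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical column descriptor for the schema version the reader expects.
struct ColumnSketch {
  uint32_t col_oid;
  int64_t default_value;
};

// Hypothetical stored tuple: col_oid -> value, as laid out in an older version.
using StoredTupleSketch = std::unordered_map<uint32_t, int64_t>;

// Builds the tuple as seen under the desired schema: columns present in both
// schemas are copied, columns only in the desired schema get their default,
// and columns only in the stored tuple (dropped since) are ignored.
std::vector<int64_t> ReadAsVersion(const StoredTupleSketch &stored,
                                   const std::vector<ColumnSketch> &desired) {
  std::vector<int64_t> out;
  out.reserve(desired.size());
  for (const auto &col : desired) {
    auto it = stored.find(col.col_oid);
    out.push_back(it != stored.end() ? it->second : col.default_value);
  }
  assert(out.size() == desired.size());
  return out;
}
```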

Updating a tuple

  • If the tuple is in the correct layout_version, this is the normal case.
  • If the tuple is in an older layout_version, we first need to logically delete the tuple from the old DataTable and then insert it into the current DataTable. The tuple in the old DataTable can still be accessed by a concurrent read transaction.
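
A rough sketch of the two update paths, with a boolean flag standing in for the logical delete. All names are hypothetical, and the in-place write in the normal case is a simplification that ignores version chains and undo records.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct RowSketch {
  int64_t value;
  bool deleted = false;  // logical delete marker; the old value stays readable
};

// tables[v] holds the rows stored under layout version v.
struct VersionedTablesSketch {
  std::vector<std::vector<RowSketch>> tables;
};

struct SlotSketch {
  uint32_t version;
  std::size_t offset;
};

// Returns the slot where the tuple lives after the update.
SlotSketch UpdateSketch(VersionedTablesSketch *t, SlotSketch slot,
                        uint32_t current_version, int64_t new_value) {
  assert(slot.version <= current_version && current_version < t->tables.size());
  if (slot.version == current_version) {
    // Normal case: the tuple is already in the current layout.
    t->tables[slot.version][slot.offset].value = new_value;
    return slot;
  }
  // Older layout: logically delete the old copy, then insert into the current table.
  t->tables[slot.version][slot.offset].deleted = true;
  t->tables[current_version].push_back(RowSketch{new_value});
  return SlotSketch{current_version, t->tables[current_version].size() - 1};
}
```

The old row keeps its value after being marked deleted, which is what lets a concurrent reader in the old version still see it.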

Background migration

  • When to migrate:
    • Same as GC: when all running txns have a layout_version larger than the DataTable's?
  • How to migrate:
    • A background thread deletes/inserts one tuple at a time?
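
If the "same as GC" condition is adopted, the check itself is small. The sketch below assumes the layout_version of every running txn is available as a plain list; CanMigrateSketch is a hypothetical helper, not an existing function.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// True when no running txn can still be reading under the DataTable's layout,
// i.e. every running txn has a strictly larger layout_version.
// With no running txns at all, migration is trivially allowed.
bool CanMigrateSketch(uint32_t data_table_version,
                      const std::vector<uint32_t> &running_txn_versions) {
  return std::all_of(running_txn_versions.begin(), running_txn_versions.end(),
                     [data_table_version](uint32_t v) { return v > data_table_version; });
}
```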

Edge cases sanity check

  1. What if there is a concurrent Update and Read to a tuple (a1) in the old schema?
T1:          BEGIN                     READ in old -> chase version pointer -> a1
T2:  BEGIN          DELETE a1 in old                                               INSERT a2 in new -> ok
