Git Product home page Git Product logo

wooby's Introduction

Wooby

My take on a database, without prior knowledge or guidance on the subject. I'll keep this very short so basically I'm doing it in C# .NET for practicing my skills with the ecosystem, and this is not meant to be taken seriously, to be performing, feature complete or even accurate to any standards at the moment, just a funny side project to see how hard it is to get to a somewhat barebones database.

I won't try to mimic an existing RDBMS, nor implement everything different from everyone else. But it probably leans towards Oracle the most, followed by SQLite, which are the ones I have the most experience with.

It's currently more of a SQL interpreter than a database, though I intend to focus more on the storage and querying performance side of things as soon as I have implemented enough SQL features to test those with.

Expanding a bit on the first paragraph, I also intent to test the feasibility or even possibility of implementing a few things that cross or have crossed my mind these past years. Moments I thought (specially during a time I heavily dealt with and maintained a somewhat large codebase of Oracle PL/SQL) "what if this could be done entirely in the database instead?". Probably the opposite of what most people, and myself in other occasions (when not entirely out of amusement for the idea), think.

Removed the roadmap. Instead I'll list below what's already "working" and what I intend to implement in the future

unlikely = low priority, maybe never

  • SELECT
    • Basic column name, wildcard or expression
    • Alias for each expression
    • DISTINCT
    • COUNT
      • Basic COUNT support, without any arguments, counts all rows in the resulting query
      • COUNT(expr) Number of rows where expr is not NULL (can be achieved with WHERE expr IS NOT NULL)
      • COUNT(DISTINCT expr) same as above but count only distinct values of expr (can be achieved via existing group by clause)
      • COUNT(*) syntactic sugar, does the same as Basic COUNT above
      • (unlikely) OVER () / PARTITION
    • UNION clause
      • UNION ALL
      • UNION (filtering so that union'ed rows are not duplicate in the output)
    • Better semantic analysis for Boolean expressions (ensure expression is valid)
    • Single value sub select as expression
    • Expandable type system
      • Custom conversion between types
    • More basic types
    • Type promotion/casting (e.g. CURRENT_DATE() - 30)
  • INSERT
    • Basic INSERT support with columns and values list
    • Batch insert (VALUES (...), (...), ...)
    • INSERT from select (INSERT INTO x SELECT ... FROM y)
  • UPDATE
    • Basic update support with SET list of columns
  • DELETE
    • Basic DELETE support
  • Data definition and relationships
    • Basic CREATE
    • DROP TABLE
    • ALTER TABLE
    • Primary keys
    • Foreign keys
    • Constraints
      • UNIQUE
      • NOT NULL
      • CHECK
      • ON DELETE clauses using foreign keys
    • Indices
  • Functions
    • Basic functions
    • Aggregate functions
    • "generic" functions (currently they all accept a certain type as parameter and return another. Need to overload for each possible combination)
    • (somewhat) complete set of standard functions (string and list manipulation, type casting, etc)
  • FROM clause
    • FROM table name
    • FROM sub select
  • WITH clause and temp tables
    • (unlikely) recursive CTE
  • WHERE clause
    • IS / IS NOT for NULL comparison (currently works with = and !=/<>)
  • GROUP BY expression
  • ORDER BY expression
  • JOIN
    • LEFT JOIN
    • INNER JOIN
    • RIGHT JOIN
    • OUTER JOIN
    • (unlikely) CROSS and SELF JOIN
  • HAVING clause
  • Meta tables with info about the schema and database
  • Explain/calculate an execution plan
  • Data persistence to the disk and caching
    • Save query data to the disk as it's being read so that we can execute queries too big to fit in memory
    • Binary model to save table data to the disk, with fast indexing and lookup
  • Import CSV files into tables from the command line
  • Command line that doesn't suck
  • System.Data interfaces implementation
  • GUI query runner and schema browser
  • Unhelpful and vague error messages

And certainly more things I don't remember at the moment, or haven't thought of yet.

wooby's People

Contributors

lazpeng avatar

Stargazers

 avatar

Watchers

 avatar  avatar

wooby's Issues

WHERE clause

Fix WHERE cause parsing and compilation, which is deeply flawed and may not work properly in most cases

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.