Git Product home page Git Product logo

moz-sql-parser's Introduction

Moz SQL Parser

Let's make a SQL parser so we can provide a familiar interface to non-sql datastores!

Branch Status
master Build Status
dev Build Status

Problem Statement

SQL is a familiar language used to access databases. Although, each database vendor has its quirky implementation, there is enough standardization that the average developer does not need to know of those quirks. This familiar core SQL (lowest common denominator, if you will) is useful enough to explore data in primitive ways. It is hoped that, once programmers have reviewed a datastore with basic SQL queries, and they see the value of that data, and they will be motivated to use the datastore's native query format.

Objectives

The primary objective of this library is to convert SQL queries to JSON-izable parse trees. This originally targeted MySQL, but has grown to include other database vendors. Please paste some SQL into a new issue if it does not work for you

Non-Objectives

  • No plans to provide update statements, like update or insert
  • No plans to provide data access tools

It is my sincere hope you can convert the JSON into queries for your particular backend datastore

Project Status

Jan 2021 -There are almost 500 tests. This parser is good enough for basic usage, including inner queries, with clauses, and window functions. There is still a lot missing to support BigQuery and Redshift queries.

Install

pip install moz-sql-parser

Parsing SQL

>>> from moz_sql_parser import parse
>>> import json
>>> json.dumps(parse("select count(1) from jobs"))
'{"select": {"value": {"count": 1}}, "from": "jobs"}'

Each SQL query is parsed to an object: Each clause is assigned to an object property of the same name.

>>> json.dumps(parse("select a as hello, b as world from jobs"))
'{"select": [{"value": "a", "name": "hello"}, {"value": "b", "name": "world"}], "from": "jobs"}'

The SELECT clause is an array of objects containing name and value properties.

Recursion Limit

Python's default recursion limit (1000) is not hit when parsing the test suite, but this may not be the case for large SQL. You can increase the recursion limit before you parse:

>>> from moz_sql_parser import parse
>>> sys.setrecursionlimit(3000)
>>> parse(complicated_sql)

Generating SQL

You may also generate SQL from the a given JSON document. This is done by the formatter, which is still incomplete (Jan2020).

>>> from moz_sql_parser import format
>>> format({"from":"test", "select":["a.b", "c"]})
'SELECT a.b, c FROM test'

Contributing

In the event that the parser is not working for you, you can help make this better but simply pasting your sql (or JSON) into a new issue. Extra points if you describe the problem. Even more points if you submit a PR with a test. If you also submit a fix, then you also have my gratitude.

Run Tests

See the tests directory for instructions running tests, or writing new ones.

More about implementation

SQL queries are translated to JSON objects: Each clause is assigned to an object property of the same name.

# SELECT * FROM dual WHERE a>b ORDER BY a+b
{
    "select": "*", 
    "from": "dual", 
    "where": {"gt": ["a", "b"]}, 
    "orderby": {"value": {"add": ["a", "b"]}}
}

Expressions are also objects, but with only one property: The name of the operation, and the value holding (an array of) parameters for that operation.

{op: parameters}

and you can see this pattern in the previous example:

{"gt": ["a","b"]}

Array Programming

The moz-sql-parser.scrub() method is used liberally throughout the code, and it "simplifies" the JSON. You may find this form a bit tedious to work with because the JSON property values can be values, lists of values, or missing. Please consider converting everything to arrays:

def listwrap(value):
    if value is None:
        return []
    elif isinstance(value, list)
        return value
    else:
        return [value]

then you may avoid all the is-it-a-list checks :

for select in listwrap(parsed_result.get('select')):
    do_something(select)

you may find it easier if all JSON expressions had a list of operands:

def normalize(expression)
    # ensure parameters are in a list
    return {
        op: params
        for op, param = expression.items()
        for params in [[normalize(p) for p in listwrap(param)]]
    }

moz-sql-parser's People

Contributors

klahnakoski avatar todpole3 avatar nishikeshkardak avatar pydolan avatar johnatannvmd avatar betodealmeida avatar alberto15romero avatar mknorps avatar thrbowl avatar amolk avatar sonalisinghal avatar tiberiuichim avatar dominikwin avatar ealter avatar mozilla-github-standards avatar diggzhang avatar miketzian avatar sam-smo avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.