Git Product home page Git Product logo

malloy-py's Introduction

Malloy Logo

What is it?

Malloy is an experimental language for describing data relationships and transformations. It is both a semantic modeling language and a querying language that runs queries against a relational database. Malloy currently connects to BigQuery, and natively supports DuckDB. We've built a Visual Studio Code extension to facilitate building Malloy data models, querying and transforming data, and creating simple visualizations and dashboards.

Note: These APIs are still in development and are subject to change.

How do I get it?

Binary installers for the latest released version are available at the Python Package Index (PyPI).

python3 -m pip install malloy

Resources

Join The Community

  • Join our Malloy Slack Community! Use this community to ask questions, meet other Malloy users, and share ideas with one another.
  • Use GitHub issues to provide feedback, suggest improvements, report bugs, and start new discussions.

Syntax Examples

Run named query from malloy file:

import asyncio

import malloy
from malloy.data.duckdb import DuckDbConnection

async def main():
  home_dir = "/path/to/samples/duckdb/imdb"
  with malloy.Runtime() as runtime:
    runtime.add_connection(DuckDbConnection(home_dir=home_dir))

    data = await runtime.load_file(home_dir + "/5_movie_complex.malloy").run(
        named_query="horror_combo")

    dataframe = data.df()
    print(dataframe)


if __name__ == "__main__":
  asyncio.run(main())

Get SQL from inline query using malloy file as source:

import asyncio

import malloy
from malloy.data.duckdb import DuckDbConnection


async def main():
  home_dir = "/path/to/samples/duckdb/faa"
  with malloy.Runtime() as runtime:
    runtime.add_connection(DuckDbConnection(home_dir=home_dir))

    [sql, connection
    ] = await runtime.load_file(home_dir + "/flights.malloy").get_sql(query="""
                  query: flights -> {
                    where: carrier ? 'WN' | 'DL', dep_time ? @2002-03-03
                    group_by:
                      flight_date is dep_time.day
                      carrier
                    aggregate:
                      daily_flight_count is flight_count
                      aircraft.aircraft_count
                    nest: per_plane_data is {
                      top: 20
                      group_by: tail_num
                      aggregate: plane_flight_count is flight_count
                      nest: flight_legs is {
                        order_by: 2
                        group_by:
                          tail_num
                          dep_minute is dep_time.minute
                          origin_code
                          dest_code is destination_code
                          dep_delay
                          arr_delay
                      }
                    }
                }
            """)

    print(sql)


if __name__ == "__main__":
  asyncio.run(main())

Write inline malloy model source and run query:

import asyncio

import malloy
from malloy.data.duckdb import DuckDbConnection


async def main():
  home_dir = "/path/to/samples/duckdb/auto_recalls"
  with malloy.Runtime() as runtime:
    runtime.add_connection(DuckDbConnection(home_dir=home_dir))

    data = await runtime.load_source("""
        source: auto_recalls is table('duckdb:auto_recalls.csv') {
          declare:
            recall_count is count()
            percent_of_recalls is recall_count/all(recall_count)*100
        }
        """).run(query="""
        query: auto_recalls -> {
          group_by: Manufacturer
          aggregate:
            recall_count
            percent_of_recalls
        }
        """)

    dataframe = data.df()
    print(dataframe)


if __name__ == "__main__":
  asyncio.run(main())

Development

Initial setup

git submodule init
git submodule update
python3 -m pip install -r requirements.dev.txt
scripts/gen-services.sh

Regenerate Protobuf files

scripts/gen-protos.sh

Tests

python3 -m pytest

malloy-py's People

Contributors

abhillman avatar dependabot[bot] avatar gmaden avatar joeldodge79 avatar malloydataci avatar oscmage avatar srschandan avatar thejens avatar whscullin avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.