moonbox's Introduction

Moonbox - DVtaaS (Data Virtualization as a Service) Solution

Document

Introduction

Moonbox is built on the concept of data virtualization and aims to offer batch and interactive computing services. It hides the details and complexity of accessing the underlying data sources, so users can run hybrid computations across disparate data systems and write out the results using SQL. Moonbox also provides basic services such as data service, data management, data tools, and data development, which make data application architecture and logical data warehouse practice more agile and flexible.

Features

  • Multi-tenant Support
    Moonbox establishes a complete user system and introduces the concept of an Organization to partition user space. The system administrator can use the ROOT account to create multiple Organizations and assign one or more SAs (super admins) to each of them. SAs create and manage Users. Moonbox defines six capabilities for a User: whether the User can execute Account statements, DDL statements, and DCL statements, and whether the User can authorize other users to execute Account statements, DDL statements, and DCL statements. Combining these capabilities freely yields user models that meet different needs and implements multi-tenancy.

  • Hybrid Calculation across Multiple Data Sources
    Using Apache Spark as its calculation engine, Moonbox supports hybrid calculation across multiple data sources such as MySQL, Oracle, Hive, Kudu, HDFS, and MongoDB. It also supports custom extensions for additional data sources.

  • Unified SQL Support
    Spark SQL is the standard query language of Moonbox. Moonbox extends Spark SQL with its own DDL and DCL statements, which cover creating, deleting, and authorizing users; granting access to data tables and columns; mounting and unmounting physical data sources and tables; and creating or deleting logical databases, time-scheduling events, and UDFs/UDAFs.

  • Optimization Strategy Support
    Moonbox supports hybrid calculation based on Apache Spark, and Spark SQL supports multiple data sources. However, when pulling data, Spark SQL does not exploit the computing capabilities of the data sources; it only pushes down project and filter operators. Moonbox takes the LogicalPlan already optimized by the Spark Optimizer, splits off the subtrees that can be pushed to a data source, translates each subtree into that data source's query language, and pulls the results back into Spark for further calculation. If the whole LogicalPlan can be pushed to a data source, Moonbox runs the mapped query directly on the data source, which avoids the cost of distributed computation and saves computing resources.

  • Column Permissions Control
    Moonbox defines DCL statements to implement column-level permission control. The system administrator uses DCL to grant users access to data tables or columns, and Moonbox saves the permission relationships between users and tables/columns in the catalog. When a user executes a SQL query, Moonbox intercepts the SQL and checks whether it references unauthorized tables or columns. If it does, Moonbox reports an error to the user.

  • Diversified UDF/UDAF
    Moonbox supports creating UDFs/UDAFs not only from JAR files but also from Java and Scala source code, which makes developing and verifying UDFs more convenient.

  • Time-Scheduling Event Support
    Moonbox provides a time-scheduling event function. Users define a time-scheduling event with DDL and specify its scheduling strategy with a crontab expression. Moonbox embeds Quartz in the backend to run the scheduled events.
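
The subtree splitting described for the optimization strategy feature can be illustrated with a toy plan tree. The following is a minimal Python sketch, not Moonbox's implementation: the Node class, the PUSHABLE operator set, and the functions single_source and split are hypothetical names introduced here, while the real optimizer works on Spark's LogicalPlan. The sketch assumes a subtree can be pushed down when every operator in it is supported by the data source and all of its scans read from the same source.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Assumption: operators a hypothetical data source can evaluate by itself.
PUSHABLE = {"scan", "filter", "project", "aggregate", "join"}


@dataclass
class Node:
    op: str                      # operator name, e.g. "scan" or "filter"
    source: str = ""             # data source name (only meaningful for scans)
    children: List["Node"] = field(default_factory=list)


def single_source(node: Node) -> Optional[str]:
    """Return the one source the whole subtree reads from, or None if the
    subtree mixes sources or contains an unsupported operator."""
    if node.op not in PUSHABLE:
        return None
    if node.op == "scan":
        return node.source
    sources = {single_source(child) for child in node.children}
    if len(sources) == 1 and None not in sources:
        return sources.pop()
    return None


def split(node: Node, pushed: List[Tuple[str, Node]]) -> Node:
    """Replace each maximal pushable subtree with a 'pushed_query' leaf and
    record the (source, subtree) pairs that would be sent to the sources."""
    src = single_source(node)
    if src is not None:
        pushed.append((src, node))
        return Node("pushed_query", source=src)
    return Node(node.op, node.source, [split(c, pushed) for c in node.children])


# A join across two sources: each side is pushed down, the join stays local.
plan = Node("join", children=[
    Node("aggregate", children=[Node("filter", children=[Node("scan", "mysql")])]),
    Node("scan", "mongo"),
])
pushed: List[Tuple[str, Node]] = []
remaining = split(plan, pushed)
```

In this sketch the cross-source join remains in the local plan while each side collapses into a single pushed query. A plan that reads from only one source collapses entirely into one pushed_query node, which corresponds to the case where the whole query runs on the data source.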

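The interception step described under Column Permissions Control can be sketched in the same spirit. This is a hedged illustration only: the Grants structure, the "*" wildcard for whole-table grants, and the functions unauthorized and check are assumptions made for this example and do not reflect Moonbox's catalog schema or API. It also assumes the referenced (table, column) pairs have already been extracted from the analyzed query.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical catalog: user -> table -> granted columns ("*" = whole table).
Grants = Dict[str, Dict[str, Set[str]]]


def unauthorized(grants: Grants, user: str,
                 referenced: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return the (table, column) pairs the user has not been granted."""
    tables = grants.get(user, {})
    denied = []
    for table, column in referenced:
        allowed = tables.get(table, set())
        if "*" not in allowed and column not in allowed:
            denied.append((table, column))
    return denied


def check(grants: Grants, user: str,
          referenced: Iterable[Tuple[str, str]]) -> None:
    """Raise PermissionError if the query touches anything not granted."""
    denied = unauthorized(grants, user, referenced)
    if denied:
        names = ", ".join(f"{t}.{c}" for t, c in denied)
        raise PermissionError(f"user {user!r} has no access to: {names}")


grants: Grants = {"alice": {"orders": {"id", "amount"}, "users": {"*"}}}
# Passes silently: every referenced column is covered by a grant.
check(grants, "alice", [("orders", "id"), ("users", "email")])
```

Rejecting the query before execution, rather than filtering columns out of the result, matches the behaviour described above, where the user receives an error for unauthorized tables or columns.
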
Latest Release

Please download the latest RELEASE

Get Help

Join our WeChat group "edpstack" for online discussion.

License

Please refer to the LICENSE file.

moonbox's People

Contributors

borderlayout, exialin, lingliyuki, swallowjy, whtblktone, yandesty

moonbox's Issues

Backend visual management

  • Metadata operations
  • Data quality
  • Runtime status management
  • Ad hoc query interface
  • ...
