Deprecation note:
Pilosa was rebranded to FeatureBase, which comes with its own SQL support.
Tools like Looker can solve the problem of joining tables from different sources.
This project is no longer actively maintained.

Calcite Pilosa Adapter

A plugin for Apache Calcite that queries the Pilosa distributed index using SQL.

General information

Pilosa is a high-performance distributed bitmap index that has been used successfully in many data-intensive projects. One notable application is the fact table of an analytical database, where Pilosa handles the computationally heavy part efficiently. The Calcite Pilosa Adapter links the Pilosa table with supplementary dimension tables from other databases, such as Postgres, which makes Pilosa a powerful exploration tool for business intelligence.

How it works

The Calcite Pilosa adapter works as a proxy. Clients connect with a JDBC-compatible driver and query the data with SQL. The adapter translates each query into Pilosa Query Language (PQL), then converts the results into a JDBC ResultSet and sends them back to the client.
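
To give a feel for the SQL-to-PQL step, here is a minimal sketch of how a count over AND-ed equality filters could be rendered as PQL. The class and method names are hypothetical and this is not the adapter's actual translator; it only assumes the standard PQL calls Row, Intersect and Count.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative sketch only; names are hypothetical, not part of the adapter. */
final class PqlSketch {

    /** One equality predicate from a WHERE clause, e.g. country = 7. */
    static final class Eq {
        final String field;
        final long rowId;

        Eq(String field, long rowId) {
            this.field = field;
            this.rowId = rowId;
        }
    }

    /** Renders a single predicate as a PQL Row call. */
    static String row(Eq predicate) {
        return "Row(" + predicate.field + "=" + predicate.rowId + ")";
    }

    /** Renders a count over AND-ed equality predicates as a PQL query string. */
    static String countWhere(List<Eq> predicates) {
        if (predicates.isEmpty()) {
            throw new IllegalArgumentException("at least one predicate is required");
        }
        String rows = predicates.stream()
                .map(PqlSketch::row)
                .collect(Collectors.joining(", "));
        // A single Row needs no Intersect wrapper.
        String bitmap = predicates.size() == 1 ? rows : "Intersect(" + rows + ")";
        return "Count(" + bitmap + ")";
    }

    public static void main(String[] args) {
        // Roughly: select count(*) from t where country = 7 and device = 2
        System.out.println(countWhere(List.of(new Eq("country", 7), new Eq("device", 2))));
    }
}
```

The real adapter works on Calcite's relational plan rather than on a flat predicate list, so treat this as a picture of the target syntax, not of the implementation.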

Installation

<dependency>
  <groupId>com.alexrnv.calcite.adapter.pilosa</groupId>
  <artifactId>calcite-pilosa</artifactId>
  <version>0.0.1</version>
</dependency>

Note: the artifact is hosted in GitHub Packages.

Usage

1.) Start from the configuration:

{
  "version": "1.0",
  "defaultSchema": "pilosa",
  "schemas": [
    {
      "name": "pilosa-cluster",
      "type": "custom",
      "factory": "com.alexrnv.calcite.adapter.pilosa.model.PilosaSchemaFactory",
      "operand": {
        "url": "http://localhost:10101"
      }
    }
  ]
}

Set the url operand to your Pilosa server endpoint.

2.) The following code snippet starts a JDBC server inside your service.

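// Build the service from the model file (the JSON configuration from step 1).
LocalService service = new PilosaServiceFactory(modelFileUri).createLocalService();
// Expose the service over HTTP on the chosen port.
HttpServer server = new PilosaHttpServerFactory(service, serverPort).createHttpServer();
server.start();
try {
    // Block the current thread until the server shuts down.
    server.join();
} catch (InterruptedException e) {
    // Restore the interrupt flag so callers can observe it.
    Thread.currentThread().interrupt();
} finally {
    server.stop();
}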

Your service is now listening for JDBC connections.

3.) Connect from your favourite JDBC client.
Clients should use the Avatica JDBC driver (available via Maven), version 1.14.0 or later.

jdbc:avatica:remote:url=http://<host>:<port>/sql/v1
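
As a rough sketch of a client, the snippet below builds that URL and opens a connection through the standard java.sql API. The class name, host and port are placeholders chosen for illustration, and it assumes the Avatica JDBC driver is on the classpath so that DriverManager can find it.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Illustrative client sketch; class name, host and port are placeholders. */
final class AvaticaClientSketch {

    /** Builds the remote JDBC URL in the format shown above. */
    static String remoteUrl(String host, int port) {
        return "jdbc:avatica:remote:url=http://" + host + ":" + port + "/sql/v1";
    }

    public static void main(String[] args) throws SQLException {
        // Replace with the host and port your adapter server listens on.
        String url = remoteUrl("localhost", 8765);
        try (Connection connection = DriverManager.getConnection(url)) {
            System.out.println("Connected: " + !connection.isClosed());
        }
    }
}
```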

4.) Run your analysis with SQL:

select
  count(distinct facts._id)
from
  pilosa.facts_table facts
join
  postgres.dimension_table dimension
  on dimension.X = facts.X
where
  dimension.Y = [value]
See the wiki for examples and available options.
