
quicksql's Introduction

English | 中文

Quicksql is a SQL query product that can run queries against a single datastore or correlate data across multiple datastores. It supports relational databases, non-relational databases, and even datastores that do not support SQL at all (such as Elasticsearch and Druid). In addition, a single SQL query in Quicksql can join or union data from several datastores; for example, you can run one unified SQL query over data that is partly stored in Elasticsearch and partly in Hive. Most importantly, QSQL does not depend on any particular intermediate compute engine: users only need to focus on their data and the unified SQL grammar to do statistics and analysis.

Architecture

A quick overview of its architecture makes Quicksql easier to approach.

The QSQL architecture consists of three layers:

  • Parsing Layer: parses, validates, and optimizes SQL statements, splits mixed SQL into single-engine fragments, and finally generates the query plan (a sketch follows this list);

  • Computing Layer: routes the query plan to a specific execution plan, then translates it into executable code for the target storage or engine (such as an Elasticsearch JSON query or Hive HQL);

  • Storage Layer: extracts and stores the prepared data;
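
As a concrete sketch (our illustration, not actual Quicksql output; the table names are borrowed from the examples below and the join columns are made up), a mixed query like the following is split by the Parsing Layer into per-engine fragments, which the Computing Layer then translates:

SELECT emp.name, pro.note
FROM hive_db.employee AS emp        -- fragment routed to Hive (HQL)
INNER JOIN es_raw.profile AS pro    -- fragment routed to Elasticsearch (JSON query)
ON emp.prefer = pro.prefer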

Basic Features

In the vast majority of cases, we want a single language for data analysis and don't want to think about anything unrelated to the analysis itself. Quicksql was born for this.

The goal of Quicksql is to provide three functions:

1. Unify all structured data queries under one SQL grammar

  • Only Use SQL

In Quicksql, you can query Elasticsearch like this:

SELECT state, pop FROM geo_mapping WHERE state = 'CA' ORDER BY state

Even an aggregation query:

SELECT approx_count_distinct(city), state FROM geo_mapping GROUP BY state LIMIT 10

You'll never again be annoyed by mismatched brackets in a JSON query ;)

  • Eliminate Dialects

In the past, a statement with the same semantics had to be rewritten in a different dialect for each engine, for example:

SELECT * FROM geo_mapping                       -- MySQL Dialect
LIMIT 10 OFFSET 10                              
SELECT * FROM geo_mapping                       -- Oracle Dialect
OFFSET 10 ROWS FETCH NEXT 10 ROWS ONLY          

In Quicksql, relational databases no longer have the concept of dialects. You can use Quicksql's grammar to query any engine, like this:

SELECT * FROM geo_mapping LIMIT 10 OFFSET 10    -- Run Anywhere
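
Pagination is only one example; other dialect differences disappear in the same way. As an illustrative sketch (Quicksql's parser is built on Apache Calcite, so we assume Calcite's standard spelling here), string concatenation no longer needs per-engine rewrites:

SELECT CONCAT(state, city) FROM geo_mapping     -- MySQL Dialect
SELECT state || city FROM geo_mapping           -- Oracle Dialect

SELECT state || city FROM geo_mapping           -- Run Anywhere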

2. Shield users from the isolation between different data sources

Consider a situation where you want to join tables that live in different engines or in different clusters; ordinarily, you would be in trouble.

However, in Quicksql, you can query like this:

SELECT * FROM 
    (SELECT * FROM es_raw.profile AS profile    -- index.type on Elasticsearch
        WHERE note IS NOT NULL) AS es_profile
INNER JOIN 
    (SELECT * FROM hive_db.employee AS emp      -- database.table on Hive
    INNER JOIN hive_db.action AS act            -- database.table on Hive
    ON emp.name = act.name) AS tmp 
ON es_profile.prefer = tmp.prefer
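
The qualified names follow each source's own convention (index.type on Elasticsearch, database.table on Hive), and the same pattern extends to any pair of sources. A minimal sketch, with made-up table names, joining MySQL and Hive:

SELECT o.order_id, s.total
FROM mysql_db.orders AS o               -- database.table on MySQL
INNER JOIN hive_db.order_stats AS s     -- database.table on Hive
ON o.order_id = s.order_id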

3. Choose the most appropriate way to execute the query

A query involving multiple engines can be executed in many different ways. Quicksql aims to combine the strengths of each engine and pick the most appropriate execution path.
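
As a sketch of the intuition (based on the behavior visible in the bundled examples, which run mixed queries through a Spark application, rather than on a documented routing rule):

SELECT approx_count_distinct(city) FROM geo_mapping GROUP BY state
-- single source: pushed down and executed directly on Elasticsearch

SELECT * FROM hive_db.employee AS emp
INNER JOIN es_raw.profile AS pro ON emp.prefer = pro.prefer
-- mixed sources: each fragment is extracted from its own engine, then
-- joined on a compute engine such as Spark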

Getting Started

For instructions on building Quicksql from source, see Getting Started.

Reporting Issues

If you find any bugs or have any better suggestions, please file a GitHub issue.

If the issue is accepted, a committer will add a [QSQL-ID] label before the issue description so that the issue can be matched to its commit. For example:

[QSQL-1002]: Views generated after splitting logical plan are redundant.

Contributing

We welcome contributions.

If you are interested in Quicksql, you can download the source code from GitHub and run the following Maven command from the project root directory:

mvn -DskipTests clean package

If you are planning to make a large contribution, talk to us first! It helps to agree on the general approach. Open an issue on GitHub for your proposed feature.

Fork the GitHub repository, and create a branch for your feature.

Develop your feature and test cases, and make sure that mvn install succeeds. (Run extra tests if your change warrants it.)

Commit your change to your branch.

If your change has multiple commits, use git rebase -i master to squash them into a single commit and to bring your code up to date with the latest on the main line.

Then push your commit(s) to GitHub, and create a pull request from your branch to the QSQL master branch. Update the corresponding GitHub issue to reference your pull request, and a committer will review your changes.

The pull request may need to be updated (after its submission) for two main reasons:

  1. you identified a problem after submitting the pull request;
  2. the reviewer requested further changes.

In order to update the pull request, you need to commit the changes in your branch and then push the commit(s) to GitHub. You are encouraged to use regular (non-rebased) commits on top of previously existing ones.

Join us

Slack | GitHub | QQ

quicksql's People

Contributors

bebetter, dependabot[bot], fangyuefy, foster9527, francis-du, functor10, ganfengtan, james601232, julian-cn, sarasara100, shaofengshi, zhanghuang03

quicksql's Issues

Unable to start the test environment by following the cluster deployment guide

I have configured an environment with
Java >= 1.8
Spark >= 2.2
and downloaded the official release package, and I have also set the environment variables in base-env.sh, but every startup fails with the dependency conflict "Incompatible Jackson version: 2.10.0-pr1" shown below. Even after building it myself and excluding the transitive dependencies on other versions of jackson-databind, it still fails:

sh bin/run-example com.qihoo.qsql.CsvJoinWithEsExample
ERROR StatusLogger No Log4j 2 configuration file found. Using default configuration (logging only errors to the console), or user programmatically provided configurations. Set system property 'log4j2.debug' to show Log4j 2 internal initialization logging. See https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions on how to configure Log4j 2
Elasticsearch Embedded Server is starting up, waiting....
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/local/qsql-0.6/lib/slf4j-log4j12-1.7.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/local/spark-2.3.3-bin-hadoop2.7/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Elasticsearch Embedded Server has started!! Your query is running...
Input: SELECT * FROM depts INNER JOIN (SELECT * FROM student WHERE city in ('FRAMINGHAM', 'BROCKTON', 'CONCORD')) FILTERED ON depts.name = FILTERED.type
2019-08-29 10:00:43,446 [main] INFO  - The SQL that is ready to execute is:
SELECT * FROM depts INNER JOIN (SELECT * FROM student WHERE city in ('FRAMINGHAM', 'BROCKTON', 'CONCORD')) FILTERED ON depts.name = FILTERED.type
2019-08-29 10:00:43,896 [main] INFO  - Read schema from manual schema, schema or path is: inline:
{
  "version": "1.0",
  "defaultSchema": "QSql",
  "schemas": [{
      "type": "custom",
      "name": "custom_name",
      "factory": "org.apache.calcite.adapter.csv.CsvSchemaFactory",
      "operand": {
        "directory": ""
      },
      "tables": [{
        "name": "depts",
        "type": "custom",
        "factory": "org.apache.calcite.adapter.csv.CsvTableFactory",
        "operand": {
          "file": "/usr/local/qsql-0.6/data/sales/DEPTS.csv",
          "flavor": "scannable"
        },
        "columns": [{
            "name": "deptno:int"
          },
          {
            "name": "name:string"
          }
        ]
      }]
    },
    {
      "type": "custom",
      "name": "student_profile",
      "factory": "org.apache.calcite.adapter.elasticsearch.ElasticsearchCustomSchemaFactory",
      "operand": {
        "coordinates": "{'localhost': 9025}",
        "userConfig": "{'bulk.flush.max.actions': 10, 'bulk.flush.max.size.mb': 1,'esUser':'username','esPass':'password'}",
        "index": "student"
      },
      "tables": [{
        "name": "student",
        "factory": "org.apache.calcite.adapter.elasticsearch.ElasticsearchTableFactory",
        "operand": {
          "dbName": "student_profile",
          "tableName": "student",
          "esNodes": "localhost",
          "esPort": "9025",
          "esUser": "username",
          "esPass": "password",
          "esScrollNum": "246",
          "esIndex": "student"
        },
        "columns": [{
            "name": "city:string"
          },
          {
            "name": "province:string"
          },
          {
            "name": "digest:int"
          },
          {
            "name": "type:string"
          },
          {
            "name": "stu_id:string"
          }
        ]
      }]
    }
  ]
}
2019-08-29 10:00:45,749 [main] INFO  - Running Spark version 2.3.3
2019-08-29 10:00:48,329 [main] WARN  - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2019-08-29 10:00:48,420 [main] INFO  - Submitted application: spark-mixed-app
2019-08-29 10:00:48,537 [main] INFO  - Changing view acls to: hadu
2019-08-29 10:00:48,538 [main] INFO  - Changing modify acls to: hadu
2019-08-29 10:00:48,539 [main] INFO  - Changing view acls groups to:
2019-08-29 10:00:48,540 [main] INFO  - Changing modify acls groups to:
2019-08-29 10:00:48,541 [main] INFO  - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hadu); groups with view permissions: Set(); users  with modify permissions: Set(hadu); groups with modify permissions: Set()
2019-08-29 10:00:48,648 [main] INFO  - Successfully started service 'sparkDriver' on port 55630.
2019-08-29 10:00:48,687 [main] INFO  - Registering MapOutputTracker
2019-08-29 10:00:48,724 [main] INFO  - Registering BlockManagerMaster
2019-08-29 10:00:48,729 [main] INFO  - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2019-08-29 10:00:48,730 [main] INFO  - BlockManagerMasterEndpoint up
2019-08-29 10:00:48,745 [main] INFO  - Created local directory at /private/var/folders/92/nxg69z853sn_tpz_pnlgfnpr0000gn/T/blockmgr-0489fdf5-83a5-49ce-90f7-856d6d1b5872
2019-08-29 10:00:48,782 [main] INFO  - MemoryStore started with capacity 2004.6 MB
2019-08-29 10:00:48,804 [main] INFO  - Registering OutputCommitCoordinator
2019-08-29 10:00:48,923 [main] INFO  - Logging initialized @13567ms
2019-08-29 10:00:49,015 [main] INFO  - jetty-9.3.z-SNAPSHOT, build timestamp: unknown, git hash: unknown
2019-08-29 10:00:49,038 [main] INFO  - Started @13683ms
2019-08-29 10:00:49,072 [main] INFO  - Started ServerConnector@563392e5{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-08-29 10:00:49,072 [main] INFO  - Successfully started service 'SparkUI' on port 4040.
2019-08-29 10:00:49,137 [main] INFO  - Started o.s.j.s.ServletContextHandler@22ee1ad7{/jobs,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,138 [main] INFO  - Started o.s.j.s.ServletContextHandler@4d793390{/jobs/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,138 [main] INFO  - Started o.s.j.s.ServletContextHandler@3a359f7c{/jobs/job,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,139 [main] INFO  - Started o.s.j.s.ServletContextHandler@279e8bc0{/jobs/job/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,140 [main] INFO  - Started o.s.j.s.ServletContextHandler@23ffc910{/stages,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,140 [main] INFO  - Started o.s.j.s.ServletContextHandler@35277c6c{/stages/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,141 [main] INFO  - Started o.s.j.s.ServletContextHandler@7a364e1c{/stages/stage,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,142 [main] INFO  - Started o.s.j.s.ServletContextHandler@7a053795{/stages/stage/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,142 [main] INFO  - Started o.s.j.s.ServletContextHandler@328bc067{/stages/pool,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,143 [main] INFO  - Started o.s.j.s.ServletContextHandler@337fb1a5{/stages/pool/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,145 [main] INFO  - Started o.s.j.s.ServletContextHandler@38b0e2a7{/storage,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,145 [main] INFO  - Started o.s.j.s.ServletContextHandler@6bdad3bb{/storage/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,146 [main] INFO  - Started o.s.j.s.ServletContextHandler@73eae5f{/storage/rdd,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,146 [main] INFO  - Started o.s.j.s.ServletContextHandler@4902c584{/storage/rdd/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,147 [main] INFO  - Started o.s.j.s.ServletContextHandler@7698a3d9{/environment,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,147 [main] INFO  - Started o.s.j.s.ServletContextHandler@4b62f1ba{/environment/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,148 [main] INFO  - Started o.s.j.s.ServletContextHandler@39dce2df{/executors,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,148 [main] INFO  - Started o.s.j.s.ServletContextHandler@662d3e85{/executors/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,149 [main] INFO  - Started o.s.j.s.ServletContextHandler@5598dff2{/executors/threadDump,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,149 [main] INFO  - Started o.s.j.s.ServletContextHandler@92b1bda{/executors/threadDump/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,158 [main] INFO  - Started o.s.j.s.ServletContextHandler@57bfca3a{/static,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,159 [main] INFO  - Started o.s.j.s.ServletContextHandler@56d6e2e1{/,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,160 [main] INFO  - Started o.s.j.s.ServletContextHandler@4e9695cf{/api,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,161 [main] INFO  - Started o.s.j.s.ServletContextHandler@178ebac3{/jobs/job/kill,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,161 [main] INFO  - Started o.s.j.s.ServletContextHandler@17063c32{/stages/stage/kill,null,AVAILABLE,@Spark}
2019-08-29 10:00:49,164 [main] INFO  - Bound SparkUI to 0.0.0.0, and started at http://30.50.88.62:4040
2019-08-29 10:00:49,308 [main] INFO  - Starting executor ID driver on host localhost
2019-08-29 10:00:49,335 [main] INFO  - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 55631.
2019-08-29 10:00:49,335 [main] INFO  - Server created on 30.50.88.62:55631
2019-08-29 10:00:49,337 [main] INFO  - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2019-08-29 10:00:49,375 [main] INFO  - Registering BlockManager BlockManagerId(driver, 30.50.88.62, 55631, None)
2019-08-29 10:00:49,380 [dispatcher-event-loop-2] INFO  - Registering block manager 30.50.88.62:55631 with 2004.6 MB RAM, BlockManagerId(driver, 30.50.88.62, 55631, None)
2019-08-29 10:00:49,384 [main] INFO  - Registered BlockManager BlockManagerId(driver, 30.50.88.62, 55631, None)
2019-08-29 10:00:49,385 [main] INFO  - Initialized BlockManager: BlockManagerId(driver, 30.50.88.62, 55631, None)
2019-08-29 10:00:49,402 [main] INFO  - Started o.s.j.s.ServletContextHandler@2202c92f{/metrics/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:50,492 [main] INFO  - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/usr/local/qsql-0.6/spark-warehouse').
2019-08-29 10:00:50,492 [main] INFO  - Warehouse path is 'file:/usr/local/qsql-0.6/spark-warehouse'.
2019-08-29 10:00:50,503 [main] INFO  - Started o.s.j.s.ServletContextHandler@949f0d{/SQL,null,AVAILABLE,@Spark}
2019-08-29 10:00:50,503 [main] INFO  - Started o.s.j.s.ServletContextHandler@4b3eaf39{/SQL/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:50,504 [main] INFO  - Started o.s.j.s.ServletContextHandler@f73a7cf{/SQL/execution,null,AVAILABLE,@Spark}
2019-08-29 10:00:50,504 [main] INFO  - Started o.s.j.s.ServletContextHandler@4198921f{/SQL/execution/json,null,AVAILABLE,@Spark}
2019-08-29 10:00:50,506 [main] INFO  - Started o.s.j.s.ServletContextHandler@328e687e{/static/sql,null,AVAILABLE,@Spark}
2019-08-29 10:00:51,302 [main] INFO  - Registered StateStoreCoordinator endpoint
2019-08-29 10:00:51,337 [main] INFO  - Elasticsearch Hadoop v6.2.4 [0dadc1ea14]
2019-08-29 10:00:54,332 [main] INFO  - Pruning directories with:
2019-08-29 10:00:54,337 [main] INFO  - Post-Scan Filters: (length(trim(value#10, None)) > 0)
2019-08-29 10:00:54,342 [main] INFO  - Output Data Schema: struct<value: string>
2019-08-29 10:00:54,353 [main] INFO  - Pushed Filters:
2019-08-29 10:00:54,491 [main] INFO  - Code generated in 30.945461 ms
2019-08-29 10:00:54,560 [main] INFO  - Stopped Spark@563392e5{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2019-08-29 10:00:54,561 [main] INFO  - Stopped Spark web UI at http://30.50.88.62:4040
2019-08-29 10:00:54,573 [dispatcher-event-loop-7] INFO  - MapOutputTrackerMasterEndpoint stopped!
2019-08-29 10:00:54,585 [main] INFO  - MemoryStore cleared
2019-08-29 10:00:54,586 [main] INFO  - BlockManager stopped
2019-08-29 10:00:54,595 [main] INFO  - BlockManagerMaster stopped
2019-08-29 10:00:54,599 [dispatcher-event-loop-4] INFO  - OutputCommitCoordinator stopped!
2019-08-29 10:00:54,605 [main] INFO  - Successfully stopped SparkContext
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
	at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
	at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:337)
	at org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
	at org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3278)
	at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
	at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
	at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
	at org.apache.spark.sql.Dataset.head(Dataset.scala:2489)
	at org.apache.spark.sql.Dataset.take(Dataset.scala:2703)
	at org.apache.spark.sql.execution.datasources.csv.TextInputCSVDataSource$.infer(CSVDataSource.scala:148)
	at org.apache.spark.sql.execution.datasources.csv.CSVDataSource.inferSchema(CSVDataSource.scala:63)
	at org.apache.spark.sql.execution.datasources.csv.CSVFileFormat.inferSchema(CSVFileFormat.scala:57)
	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:203)
	at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:203)
	at scala.Option.orElse(Option.scala:289)
	at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:202)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:393)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:596)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:473)
	at Requirement52045.execute(Requirement52045.java:33)
	at com.qihoo.qsql.exec.result.JobPipelineResult.run(JobPipelineResult.java:39)
	at com.qihoo.qsql.CsvJoinWithEsExample.main(CsvJoinWithEsExample.java:24)
Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.10.0-pr1
	at com.fasterxml.jackson.module.scala.JacksonModule$class.setupModule(JacksonModule.scala:64)
	at com.fasterxml.jackson.module.scala.DefaultScalaModule.setupModule(DefaultScalaModule.scala:19)
	at com.fasterxml.jackson.databind.ObjectMapper.registerModule(ObjectMapper.java:779)
	at org.apache.spark.rdd.RDDOperationScope$.<init>(RDDOperationScope.scala:82)
	at org.apache.spark.rdd.RDDOperationScope$.<clinit>(RDDOperationScope.scala)
	... 28 more

Running a spark-submit job causes the Spark worker to exit with a missing QSQL jar exception

When I run the command "./spark-submit --master spark://master_url:7077 --deploy-mode cluster --class com.qihoo.qsql.CsvJoinWithEsExample qsql-example-0.5.jar", the following exception is shown:

Exception in thread "main" java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.worker.DriverWrapper$.main(DriverWrapper.scala:65)
at org.apache.spark.deploy.worker.DriverWrapper.main(DriverWrapper.scala)
Caused by: java.lang.NoClassDefFoundError: com/qihoo/qsql/utils/PropertiesReader
at com.qihoo.qsql.env.RuntimeEnv.(RuntimeEnv.java:17)
at com.qihoo.qsql.CsvJoinWithEsExample.main(CsvJoinWithEsExample.java:11)
... 6 more
Caused by: java.lang.ClassNotFoundException: com.qihoo.qsql.utils.PropertiesReader
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 8 more

Modifying the external database raises an exception

Issue : Modifying the external database raises an exception

QSQL Version : 0.5

OS : Centos 7

Action:

  1. Modify the metadata.properties configuration file:
    vim metadata.properties
  2. Initialize the metabase:
    metadata --dbType mysql --action init
  3. Query a table from the MySQL DB:
    qsql -e 'select * from test_table_sqoop1'

Exception :

2019-01-17 14:59:10,718 [main] INFO - Parsing table names has finished, you will query tables: [test.test_table_sqoop1]
Exception in thread "main" java.lang.RuntimeException: java.sql.SQLException: path to '///usr/local/qsql-0.5/bin/../sqlite/schema.db': '/usr/local/qsql-0.5/bin/../sqlite' does not exist
	at com.qihoo.qsql.metadata.MetadataClient.createConnection(MetadataClient.java:246)
	at com.qihoo.qsql.metadata.MetadataClient.<init>(MetadataClient.java:38)
	at com.qihoo.qsql.metadata.MetadataPostman$MetadataFetcher.transformSchemaFormat(MetadataPostman.java:86)
	at com.qihoo.qsql.metadata.MetadataPostman$MetadataFetcher.access$100(MetadataPostman.java:74)
	at com.qihoo.qsql.metadata.MetadataPostman.lambda$getAssembledSchema$0(MetadataPostman.java:50)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
	at com.qihoo.qsql.metadata.MetadataPostman.getAssembledSchema(MetadataPostman.java:52)
	at com.qihoo.qsql.launcher.ExecutionDispatcher.tryToExecuteQueryDirectly(ExecutionDispatcher.java:111)
	at com.qihoo.qsql.launcher.ExecutionDispatcher.main(ExecutionDispatcher.java:71)
Caused by: java.sql.SQLException: path to '///usr/local/qsql-0.5/bin/../sqlite/schema.db': '/usr/local/qsql-0.5/bin/../sqlite' does not exist
	at org.sqlite.core.CoreConnection.open(CoreConnection.java:192)
	at org.sqlite.core.CoreConnection.<init>(CoreConnection.java:76)
	at org.sqlite.jdbc3.JDBC3Connection.<init>(JDBC3Connection.java:26)
	at org.sqlite.jdbc4.JDBC4Connection.<init>(JDBC4Connection.java:24)
	at org.sqlite.SQLiteConnection.<init>(SQLiteConnection.java:45)
	at org.sqlite.JDBC.createConnection(JDBC.java:114)
	at org.sqlite.JDBC.connect(JDBC.java:88)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:270)
	at com.qihoo.qsql.metadata.MetadataClient.createConnection(MetadataClient.java:242)
	... 14 more

Error running the CsvJoinWithEsExample example; maybe the antlr4 jar is incompatible

The error message is:
ANTLR Tool version 4.7 used for code generation does not match the current runtime version 4.5.3
ANTLR Runtime version 4.7 used for parser compilation does not match the current runtime version 4.5.3
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parse(ParseDriver.scala:84)
at org.apache.spark.sql.execution.SparkSqlParser.parse(SparkSqlParser.scala:48)
at org.apache.spark.sql.catalyst.parser.AbstractSqlParser.parseTableIdentifier(ParseDriver.scala:49)
at org.apache.spark.sql.Dataset.createTempViewCommand(Dataset.scala:3142)
at org.apache.spark.sql.Dataset.createOrReplaceTempView(Dataset.scala:3097)
at Requirement35141.execute(Requirement35141.java:75)
at com.qihoo.qsql.exec.result.JobPipelineResult.run(JobPipelineResult.java:39)
at com.qihoo.qsql.CsvJoinWithEsExample.main(CsvJoinWithEsExample.java:23)
Caused by: java.lang.UnsupportedOperationException: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e or a legacy UUID).
at org.antlr.v4.runtime.atn.ATNDeserializer.deserialize(ATNDeserializer.java:153)
at org.apache.spark.sql.catalyst.parser.SqlBaseLexer.(SqlBaseLexer.java:1175)
... 8 more
Caused by: java.io.InvalidClassException: org.antlr.v4.runtime.atn.ATN; Could not deserialize ATN with UUID 59627784-3be5-417a-b9eb-8131a7286089 (expected aadb8d7e-aeef-4415-ad2b-8204d6cf042e or a legacy UUID).
... 10 more

[QSQL-xxxx]: Wish for more detailed configuration documentation of the QuickSQL backend metastore DB.

Please give us more detailed documentation of the configuration of the QuickSQL backend metastore DB.

What is the relationship between these four tables, and what does each table column mean?

If I want to run SQL queries against Elasticsearch using QuickSQL, how do I configure the backend metastore DB? Please give an example showing an Elasticsearch index.type and the corresponding QuickSQL backend metastore configuration.

Thanks!

Runtime configuration question

base-env.sh sets the environment variables for the runtime environment. Besides JAVA_HOME and SPARK_HOME, there are also QSQL_CLUSTER_URL and QSQL_HDFS_TMP. How should these two paths be configured?

Unable to run the SQL query examples in Spark runner mode

Unable to run the SQL query examples in --runner spark mode.

  • OS: CentOS 7 Linux
  • Java: OpenJDK 1.8.0_191-b12
  • Hadoop: Hadoop 2.9.2
  • Spark: Spark 2.4.0 ( Using Scala version 2.11.12, OpenJDK 64-Bit Server VM, 1.8.0_191 )

0.6: ./run-example com.qihoo.qsql.CsvJoinWithEsExample reports an error

2019-08-27 10:11:06,781 [main] ERROR - org.apache.calcite.sql.validate.SqlValidatorException: Column 'name' not found in table 'depts'
2019-08-27 10:11:06,782 [main] ERROR - org.apache.calcite.runtime.CalciteContextException: From line 1, column 126 to line 1, column 129: Column 'name' not found in table 'depts'
Exception in thread "main" com.qihoo.qsql.exception.ParseException: Error When Validating: org.apache.calcite.runtime.CalciteContextException: From line 1, column 126 to line 1, column 129: Column 'name' not found in table 'depts'
at com.qihoo.qsql.plan.QueryProcedureProducer.buildLogicalPlan(QueryProcedureProducer.java:174)
at com.qihoo.qsql.plan.QueryProcedureProducer.createQueryProcedure(QueryProcedureProducer.java:90)
at com.qihoo.qsql.api.DynamicSqlRunner.createQueryPlan(DynamicSqlRunner.java:65)
at com.qihoo.qsql.api.DynamicSqlRunner.sql(DynamicSqlRunner.java:79)
at com.qihoo.qsql.CsvJoinWithEsExample.main(CsvJoinWithEsExample.java:24)
Caused by: org.apache.calcite.tools.ValidationException: org.apache.calcite.runtime.CalciteContextException: From line 1, column 126 to line 1, column 129: Column 'name' not found in table 'depts'
at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:190)
at com.qihoo.qsql.plan.QueryProcedureProducer.buildLogicalPlan(QueryProcedureProducer.java:169)
... 4 more
Caused by: org.apache.calcite.runtime.CalciteContextException: From line 1, column 126 to line 1, column 129: Column 'name' not found in table 'depts'
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:787)
at org.apache.calcite.sql.SqlUtil.newContextException(SqlUtil.java:772)
at org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError(SqlValidatorImpl.java:4776)
at org.apache.calcite.sql.validate.DelegatingScope.fullyQualify(DelegatingScope.java:439)
at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visit(SqlValidatorImpl.java:5637)
at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visit(SqlValidatorImpl.java:5619)
at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:334)
at org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:134)
at org.apache.calcite.sql.util.SqlShuttle$CallCopyingArgHandler.visitChild(SqlShuttle.java:101)
at org.apache.calcite.sql.SqlOperator.acceptCall(SqlOperator.java:859)
at org.apache.calcite.sql.validate.SqlValidatorImpl$Expander.visitScoped(SqlValidatorImpl.java:5655)
at org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:50)
at org.apache.calcite.sql.validate.SqlScopedShuttle.visit(SqlScopedShuttle.java:33)
at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:138)
at org.apache.calcite.sql.validate.SqlValidatorImpl.expand(SqlValidatorImpl.java:5226)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateJoin(SqlValidatorImpl.java:3092)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3029)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3290)
at org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
at org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:973)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:949)
at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:216)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:924)
at org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:628)
at org.apache.calcite.prepare.PlannerImpl.validate(PlannerImpl.java:188)
... 5 more
Caused by: org.apache.calcite.sql.validate.SqlValidatorException: Column 'name' not found in table 'depts'
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:463)
at org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:572)
... 31 more

Build error: is a particular Maven version required?

When compiling the source code I get the following error; is a particular Maven version required?
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle (default) on project qsql: Unable to parse configuration of mojo org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle for parameter sourceDirectories: Cannot assign configuration entry 'sourceDirectories' with value 'core/src/main/java' of type java.lang.String to property of type java.util.List -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle (default) on project qsql: Unable to parse configuration of mojo org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle for parameter sourceDirectories: Cannot assign configuration entry 'sourceDirectories' with value 'core/src/main/java' of type java.lang.String to property of type java.util.List
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:221)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:84)
at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:59)
at org.apache.maven.lifecycle.internal.LifecycleStarter.singleThreadedBuild(LifecycleStarter.java:183)
at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:161)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:320)
at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:156)
at org.apache.maven.cli.MavenCli.execute(MavenCli.java:537)
at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:196)
at org.apache.maven.cli.MavenCli.main(MavenCli.java:141)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:290)
at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:230)
at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:414)
at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:357)
Caused by: org.apache.maven.plugin.PluginConfigurationException: Unable to parse configuration of mojo org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:checkstyle for parameter sourceDirectories: Cannot assign configuration entry 'sourceDirectories' with value 'core/src/main/java' of type java.lang.String to property of type java.util.List
at org.apache.maven.plugin.internal.DefaultMavenPluginManager.populatePluginFields(DefaultMavenPluginManager.java:597)
at org.apache.maven.plugin.internal.DefaultMavenPluginManager.getConfiguredMojo(DefaultMavenPluginManager.java:529)
at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:92)
at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:209)
... 19 more
Caused by: org.codehaus.plexus.component.configurator.ComponentConfigurationException: Cannot assign configuration entry 'sourceDirectories' with value 'core/src/main/java' of type java.lang.String to property of type java.util.List
at org.codehaus.plexus.component.configurator.converters.AbstractConfigurationConverter.failIfNotTypeCompatible(AbstractConfigurationConverter.java:172)
at org.codehaus.plexus.component.configurator.converters.composite.CollectionConverter.fromConfiguration(CollectionConverter.java:107)
at org.codehaus.plexus.component.configurator.converters.ComponentValueSetter.configure(ComponentValueSetter.java:342)
at org.codehaus.plexus.component.configurator.converters.composite.ObjectWithFieldsConverter.processConfiguration(ObjectWithFieldsConverter.java:161)
at org.codehaus.plexus.component.configurator.BasicComponentConfigurator.configureComponent(BasicComponentConfigurator.java:56)
at org.apache.maven.plugin.internal.DefaultMavenPluginManager.populatePluginFields(DefaultMavenPluginManager.java:567)
... 22 more
[ERROR]
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/PluginConfigurationException
