
virtual-schemas's Introduction

Virtual Schemas

virtual-schemas logo

Overview

Exasol Virtual Schemas are an abstraction layer that makes external data sources accessible in our data analytics platform through regular SQL commands. The contents of the external data sources are mapped to virtual tables which look like and can be queried as any regular Exasol table.

Regardless of whether that source is a relational database like ours or something completely different, such as the structure of GitHub repositories, the interface that users see is always the same.

This means a user familiar with SQL will immediately feel at home when accessing remote data through Virtual Schemas.

Virtual Schemas can also be described in terms known from other products as External Tables or Foreign Data Wrappers (FDW).

Version Requirements

The version of a Virtual Schema consists of two parts. The first part is the version of the Virtual Schema JDBC, the common basis that all JDBC-based Virtual Schemas share; the second part is the version of the Virtual Schema itself.

Thus the JAR filename of a Virtual Schema release complies with the following format:

virtual-schema-dist-<Virtual Schema JDBC Version>-<Virtual Schema Name>-<Virtual Schema Version>.jar

For example, the JAR filename of the Oracle Virtual Schema 2.3.0 release, which is based on the 10.0.1 version of Virtual Schema JDBC, is:

virtual-schema-dist-10.0.1-oracle-2.3.0.jar

The version of the Virtual Schema JDBC on which a Virtual Schema is based also tells you whether it is supported or discontinued, as shown in the following table:

Virtual Schema JDBC Version   Required Java Version   Lifecycle
11.x.x                        11                      supported, active development
10.x.x                        11                      supported
9.x.x                         11                      discontinued
8.x.x                         11                      discontinued
7.x.x                         11                      discontinued
6.x.x                         11                      discontinued
5.x.x                         11                      discontinued
3.x.x                         11                      discontinued
2.x.x                         9                       discontinued
1.x.x                         8                       discontinued

Please update your Virtual Schema to a supported version before writing tickets or contacting Exasol Support.

Exasol Version   Java Version Installed by Default in Language Container
8                11
7.1              11
7.0              11
6.2              11
6.1              9
6.0              8

We recommend updating the Exasol installation to at least version 7.1.21 for best results.

Please do not try to install language containers with lower Java versions on newer Exasol installations. This is untested.

Please contact the Exasol Support Team if you need help upgrading the language container.

The runtime dependencies of each dialect are specified in that dialect's own repository.

Deprecation Warning

Please use the entry point

com.exasol.adapter.RequestDispatcher

in all your CREATE JAVA ADAPTER SCRIPT statements. The old entry point was removed in version 2.0.0 of Exasol's Virtual Schema.
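For reference, a minimal adapter script using this entry point could look like the following sketch. The BucketFS path and the JAR file name (taken from the naming example above) are placeholders and must match your installation:

CREATE OR REPLACE JAVA ADAPTER SCRIPT adapter.jdbc_adapter AS
  // Generic entry point that dispatches requests to the dialect implementation
  %scriptclass com.exasol.adapter.RequestDispatcher;
  // Replace the bucket path and JAR name with the ones from your installation
  %jar /buckets/bfsdefault/default/virtual-schema-dist-10.0.1-oracle-2.3.0.jar;
/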

Features

  • Read-only access to data on remote data sources (see the Supported Data Sources below)
  • Data in those sources appears as tables inside Exasol and can be queried using regular SQL statements.
  • Pushes down queries to the remote source (for some sources)
  • Supports sources with no, one, or multiple catalogs or schemas
  • Allows limiting metadata mapping to selected catalogs and / or schemas (see the example after this list)
  • Allows redirecting log output to a remote machine
  • Allows remote debugging with the Java Debugger
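As a rough sketch of the metadata-limiting and remote-logging features, the following statement sets the corresponding adapter properties when creating a Virtual Schema. This assumes the property names used by the JDBC-based dialects (CATALOG_NAME, SCHEMA_NAME, DEBUG_ADDRESS, LOG_LEVEL); the dialect, connection, catalog, schema, and address values are placeholders:

CREATE VIRTUAL SCHEMA my_virtual_schema USING adapter.jdbc_adapter WITH
  SQL_DIALECT     = 'ORACLE'
  CONNECTION_NAME = 'MY_CONNECTION'
  CATALOG_NAME    = 'MY_CATALOG'          -- limit metadata mapping to one catalog
  SCHEMA_NAME     = 'MY_SCHEMA'           -- limit metadata mapping to one schema
  DEBUG_ADDRESS   = '192.168.0.10:3000'   -- redirect log output to a remote machine
  LOG_LEVEL       = 'ALL';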

Supported Data Sources

See List of Supported Dialects.

Limitations

  • The Virtual Schema adapter takes about one second to start, so queries that involve Virtual Schema tables take at least one second.

Customer Support

This is an open source project officially supported by Exasol. Please contact our support team if you have any questions.

NOTE: Please report dialect specific issues in the corresponding dialect repository (see the Supported Data Sources above)

virtual-schemas's Issues

Code cleanup and improvements

In #75 some suggestions from the review were left for an extra refactoring branch.
Open review issues are:

  • Fix naming: convert Jdbc to JDBC
  • Extract constants for JSON properties
  • Use Optional instead of returning null
  • Make string operations locale sensitive
  • Use parameter objects
  • Remove assert
  • Remove commented code
  • Use Hamcrest matchers in tests

EXCEPTION_HANDLING should work for all dialects

Problem:
If a view is invalid in Exasol, the JDBC driver seems to return zero columns for this view.
The adapter then throws a com.exasol.adapter.metadata.MetadataException: A table without columns was encountered: ADDRESSBOOK_EXTENDED_T. This is not supported. Please check if this table has columns. If the table does have columns, the dialect probably does not properly handle the data types of the columns.
There might be other cases where a JDBC driver returns zero columns.

Workaround
One can use the TABLE_FILTER property to select the tables or views which should be loaded.
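For example, the filter can be set when creating the Virtual Schema or changed later with ALTER VIRTUAL SCHEMA; the schema name and table list below are placeholders:

ALTER VIRTUAL SCHEMA my_virtual_schema SET TABLE_FILTER = 'ORDERS, CUSTOMERS';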

Proposed Solution:
It should be investigated how EXCEPTION_HANDLING (currently only used for Teradata dialect) can be extended to be used for all dialects, so that the user has a better workaround than specifying every table to load.

Using RequestDispatcher and RemoteLogManager

Situation

In preparation for making the Virtual Schemas more extensible, we introduced the RequestDispatcher and RemoteLogManager in virtual-schema-common-java. In this ticket we are going to use them and remove the old code that statically dispatched to the JdbcAdapter.

Acceptance Criteria

  • JdbcAdapter gets requests from RequestDispatcher
  • Logging configuration is completely moved from the JdbcAdapter to the common module

Internal error on "CREATE VIRTUAL SCHEMA"

Steps:

  1. Execute:

     CREATE SCHEMA adapter;

     CREATE JAVA ADAPTER SCRIPT adapter.jdbc_adapter AS
       %scriptclass com.exasol.adapter.jdbc.JdbcAdapter;
       %jar /buckets/bucketfs/bfsdefault/default/virtualschema-jdbc-adapter-dist-0.0.1-SNAPSHOT.jar;
     /

  2. Execute:

     CREATE CONNECTION exasol_conn TO 'jdbc:exa:192.168.99.100:8899' USER 'sys' IDENTIFIED BY 'exasol';

     CREATE VIRTUAL SCHEMA virtual_exasol USING adapter.jdbc_adapter WITH
       SQL_DIALECT     = 'EXASOL'
       CONNECTION_NAME = 'EXASOL_CONN'
       SCHEMA_NAME     = 'default';

Result:
Error fired:
[22002] VM error: Internal error: VM crashed (Session: 1576534527788307527)

Review unfinished dialects

A few dialects are not finished yet.
They don't support capabilities and also contain a few TODOs.
For example: MysqlSqlDialect

We should finish the work.

Move IMPORT generation to dialects

Problem

Generating the IMPORT statement that encapsulates the pushdown SQL is currently done in the JDBCAdapter class. This does not make sense since it is dialect-specific and bloats the JDBCAdapter class.

Solution

Create an interface in the adapter module that creates the SQL statement. Create an abstract class that generates the default. Have the code of the individual adapters override the default implementation where necessary (as suggested in the review of #95).

Cannot query identifiers with uppercase characters with PostgreSQL dialect

Handling PostgreSQL identifiers is difficult because PostgreSQL is not SQL standard compliant (see #74). Currently the PostgreSQL adapter is not able to query identifiers that contain uppercase characters. You can create a virtual schema, but you won't be able to query these identifiers.

A workaround is to not use quoting at all (see https://github.com/exasol/virtual-schemas/blob/master/jdbc-adapter/doc/sql_dialects/postgresql.md#postgres-identifiers). However, if this is not possible, the adapter should support identifiers with uppercase characters.
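To illustrate, assume a table that was created with a quoted, mixed-case name on the PostgreSQL side; after creating the virtual schema you cannot query it. The names below are made up and the exact error depends on the adapter version:

-- In PostgreSQL:
CREATE TABLE "MixedCaseTable" ("Id" INTEGER);

-- In Exasol, after creating the virtual schema (this query fails):
SELECT * FROM my_postgres_vs."MixedCaseTable";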

Support for additional properties for dialects

It would be great to be able to add additional properties for an individual dialect.

For example, when using DB2 or other databases that have more precise timestamps, one could tell the dialect to cast the timestamps to VARCHAR (when precision matters) or to leave them as they are (accepting truncation).

In this case it would also be great if the NLS_TIMESTAMP_FORMAT and NLS_DATE_FORMAT Exasol parameters could be accessed from within the dialect (for casting to the correct format).

Regards

Charly

JDBC Adapter Doesn't Quote Keywords

Problem
The Exasol JDBC Adapter for Virtual Schemas doesn't quote column names which are reserved keywords in the remote database. This causes queries to fail on the remote database, if the user selects a column named after a keyword but doesn't select all columns from the table (i.e. no SELECT * in the pushdown query).

-- In the remote database:
CREATE TABLE "MONTH"(
"YEAR" VARCHAR(50) UTF8,
"MONTH" VARCHAR(50) UTF8,
YEAR_MONTH DATE);
-- In the local database:
ALTER VIRTUAL SCHEMA VS REFRESH;
SELECT "MONTH" FROM VS.MONTH;

Reverse JDB connection

Situation

In the current documentation the JDB connects to the VM listening for a debugger on the Virtual Schema's side. This is a so-called "forward connection". While that concept is easier for users to understand, it has multiple drawbacks -- all revolving around the fact that for this to be useful you need to be able to see the listening port from the outside.
And that means the port needs to be forwarded from the UDF container and through any virtual networks that might be involved.
This is definitely too much trouble for a simple task like debugging.
A better solution is to use a reverse connection, because UDFs are designed so that they can connect to the outside world.

Acceptance Criteria

  1. Remote debugging documentation is rewritten to use reverse connections
  2. A short introduction explains the concept of the reverse connection and why it is used here

Incorrect interpretation of filter on date

VS = virtual schema; the source system is Oracle DB 12.2, JDBC driver ojdbc8.jar.

The following query

select count(*) FROM VS.ONTIME_ALL WHERE
(C5 >= to_date('2009-05-01', 'YYYY-MM-DD') AND C5 < to_date('2009-05-31', 'YYYY-MM-DD')+1)

turns into

SELECT COUNT(*) FROM DWH.ONTIME_ALL WHERE C5 BETWEEN DATE '2009-05-01' AND DATE '2009-06-01'

on the Oracle side.
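The original filter is half-open (C5 strictly less than 2009-06-01), while BETWEEN includes its upper bound, so rows with C5 exactly at 2009-06-01 00:00:00 are counted incorrectly. An equivalent pushdown would have to exclude the upper bound, for example:

SELECT COUNT(*) FROM DWH.ONTIME_ALL WHERE C5 >= DATE '2009-05-01' AND C5 < DATE '2009-06-01'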

GitHub virtual schema

Story

As an administrator of many GitHub repositories,
I want to regularly save repository metadata in my Exasol database
so that I can run analytics on the collected data.

Acceptance Criteria

Since this will go into the first version of the virtual schema dialect, we will focus on some basic metadata

  • The virtual schema projects my repositories to a database table (see the sketch after this list).
  • The virtual schema projects the following repository metadata:
    • Name
    • Description
    • Visibility (private / public)
    • Default branch
    • Owner
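As a sketch of what querying such a projection could look like once implemented; the schema, table, and column names below are purely illustrative and not a finished design:

SELECT NAME, DESCRIPTION, VISIBILITY, DEFAULT_BRANCH, OWNER
  FROM GITHUB_VS.REPOSITORIES
 WHERE VISIBILITY = 'private';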

API

We will use the following Java API (Licenses Apache 2.0 and MIT):
http://github-api.kohsuke.org

Builder for SQL dialect capabilities

Situation

Currently the Capabilities object is mutable and has setters that look similar to builder methods. Turning the implementation into an immutable object with a dedicated builder is cleaner.

Acceptance Criteria

  • Capabilities.Builder exists
  • Capabilities objects can only be created through the builder
  • All occurrences using the current initialization are updated to use the builder

Password is exposed in EXPLAIN VIRTUAL for ORA connections

If a connection is used in a Virtual Schema that stores username and password, this information should not be exposed in the resulting pushdown SQL. This works fine for JDBC connections, but not for the special ORA connections:

create connection ora_connection to '(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=myhost)(PORT=1521)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=orautf8)))' user 'foo' identified by 'bar';

create virtual schema oravs using adapter.jdbc_adapter
with connection_name = 'ora_jdbc_connection' 
sql_dialect='ORACLE' schema_name = 'LOADER' 
IMPORT_FROM_ORA = 'true'
ORA_CONNECTION_NAME = 'ora_connection';

Executing explain virtual select x from oravs.t; returns the pushdown:
IMPORT FROM ORA AT ora_connection USER 'foo' IDENTIFIED BY 'bar' STATEMENT 'SELECT * FROM "LS"."T"'

Port tests to JUnit5

Situation

JUnit 5 is the successor of the JUnit 4 framework we are using at the moment. It has a better design, clearer life cycle handling, and a uniform extension mechanism. In contrast, JUnit 4 has two (Runners and Rules), and they are not fully compatible with each other.

Acceptance Criteria

  • Test suite is ported to JUnit 5
  • Rules are replaced by extensions (six occurrences at the time of this writing)
  • Runners are replaced by extensions (zero occurrences at the moment)
  • POMs are updated
  • Documentation is updated

Refactor metadata reading

Acceptance criteria

  • Reading remote metadata from JDBC is refactored
  • Reading remote metadata is covered by unit tests.

Remove duplicated SqlTestUtil class

Problem

SqlTestUtil exists in two places: the test folder of this project and the test folder of the virtual-schema-common project.
We need to remove this duplication.

Acceptance criteria

  1. Find out if this is really test code (quick scan indicates it is)
  2. If it is test code, move it to test
  3. Find out if it needs to be in a static helper class or belongs somewhere else.
  4. If it stays in a static helper class:
    • Give that class a proper name (util is not acceptable)
    • Hide the constructor
    • Make the class static

Read Redshift Spectrum metadata

Situation

Amazon's AWS Redshift Spectrum allows accessing tableau files located on S3 through Redshift. Unfortunately the projections that are called "external tables" in Redshift's terminology are not visible through the regular metadata as provided by the Redshift JDBC driver.

Currently it looks like we have to merge the regular table metadata obtained through the JDBC driver with information extracted from special tables that store the structure of the "external tables".

Acceptance Criteria

  1. The Redshift Virtual Schema sees the metadata for both regular tables and "external tables"

Check and fix the documentation if necessary

Problem

  1. There are probably outdated places in the documentation after removing the common part from the project.
    We need to review and fix them.

  2. Carefully document AdapterNotes: explain how this class works.

Remove or refactor `com.exasol.utils.SqlTestUtil`

Situation

I stumbled over the static helper class SqlTestUtil. Apart from the fact that the name is poorly chosen, the constructor is not hidden, the class is not final, and it resides in the main source folder instead of the test folder.

Acceptance criteria

  1. Find out if this is really test code (quick scan indicates it is)
  2. If it is test code, move it to test
  3. Find out if it needs to be in a static helper class or belongs somewhere else.
  4. If it stays in a static helper class:
    • Give that class a proper name (util is not acceptable)
    • Hide the constructor
    • Make the class static

Create an errors handbook

Problem:

There is no place where users can look up information when they encounter an error message.

Proposed solution:

Create a handbook with possible errors, their descriptions and ways to handle these errors.

Remove `SqlDialect.handleException(...)`

Situation

This interface method is only used in the Teradata dialect. There is a trivial default implementation in AbstractSqlDialect.

Looking at the Teradata example, I see that what this code intends to do is actually something different: the dialect should be able to decide not to iterate over the columns of invalid views.

Acceptance Criteria

  1. Removed SqlDialect.handleException(...) and all invocations.
  2. Introduce a mechanism to decide per Dialect how to deal with column metadata that can't be accessed (like in the Teradata case)

`DROP VIRTUAL SCHEMA` must work even if debug output is misconfigured

Situation

In com.exasol.adapter.jdbc.JdbcAdapter.tryAttachToOutputService(SchemaMetadataInfo) an exception is thrown if the debug address cannot be parsed.
While that provides the user with a nice error message during creation of the VS, it also prevents dropping from working if a wrong VS definition was used initially.
While it is highly unlikely that this happens, dropping the schema must still always work, because it is the "nuclear option" in case everything else fails.

Acceptance Criteria

  1. DROP VIRTUAL SCHEMA works -- regardless of any debug settings

Hive String columns are VARCHAR(255) in the Virtual Schema

A String column in a Hive Table has type VARCHAR(255) in the Virtual Schema. The Hive Dialect just uses the JDBC metadata from Hive. It looks like Hive claims that the length of a String column is 255, although it can hold up to 2GB of data.

Support postgres unquoted identifier handling

Problem

PostgreSQL is incompatible with the SQL standard when it comes to identifiers:

The folding of unquoted names to lower case in PostgreSQL is incompatible with the SQL standard, which says that unquoted names should be folded to upper case. Thus, foo should be equivalent to "FOO" not "foo" according to the standard.

(https://www.postgresql.org/docs/current/sql-syntax-lexical.html)

EXASOL is SQL standard compliant: all unquoted identifiers are folded to upper case. Since the adapter quotes all identifiers (since v1.1.1), there is an upper-case / lower-case mismatch that leads to an error.

Proposed solution

The postgres adapter dialect converts all identifiers to lower case. This way the virtual schema integration between exasol and postgres will work seamlessly if you don't use quoted upper case identifiers in postgres.
If you have tables with upper case characters in postgres, creating or refreshing the virtual schema will result in an error. You can still create the schema if you set a FORCE_CREATE property when creating the virtual schema.

Inaccuracies in the Virtual Schema API Documentation

I found a few inaccuracies in the Virtual Schema API Documentation while using it.

  1. Missing literal_interval json example here: https://github.com/exasol/virtual-schemas/tree/master/doc#literal

  2. I discovered that "type": "predicate_equals" doesn't really work in our implementation, but "type": "predicate_equal" works fine. We need to fix the documentation or the implementation. You can find it here: https://github.com/exasol/virtual-schemas/tree/master/doc#predicates

  3. The same for "predicate_notequals".

  4. The same for "predicate_lessequals".

  5. Missing predicate_is_null and predicate_is_not_null here: https://github.com/exasol/virtual-schemas/tree/master/doc#predicates

Oracle dialect uses 'true' for any column

For some queries, EXASOL requests any column. The current oracle dialect uses "TRUE" in the select list. However, Oracle does not support this.

Example:

SELECT 1 FROM VS.T T1 LEFT JOIN SYS.DUAL D1 ON 1=1;

Error:
Oracle tool failed with error code '904' and message 'ORA-00904: "TRUE": invalid identifier'.

Document dependencies

Document dependencies, categorized by module (common, dialect) and life cycle phase (build, run time, test):

  • Dependency name
  • Purpose / Description
  • License
  • Link to homepage (where you can find the sources)

In MSSQL, own schemas not seen, only default ones

I created the following adapter:

CREATE OR REPLACE JAVA ADAPTER SCRIPT adapter.jdbc_adapter_mssql AS
  // This is the class implementing the callback method of the adapter script
  %scriptclass com.exasol.adapter.jdbc.JdbcAdapter;

  // This will add the adapter jar to the classpath so that it can be used inside the adapter script
  // Replace the names of the bucketfs and the bucket with the ones you used.
  %jar /buckets/bucketfs1/bucket_adapter/virtualschema-jdbc-adapter-dist-1.6.0.jar;

  // You have to add all files of the data source jdbc driver here (e.g. Hive JDBC driver files)
  %jar /buckets/bucketfs1/bucket_adapter/mssql-jdbc-7.2.1.jre8.jar;
/

Then I created a connection using SCHEMA_NAME = 'dbo'. It works fine.
However, when I set SCHEMA_NAME = 'export', I get the following error message:

SQL Error [22002]: VM error: 
com.exasol.adapter.AdapterException: Schema export does not exist. Available schemas: db_accessadmin, db_backupoperator, db_datareader, db_datawriter, db_ddladmin, db_denydatareader, db_denydatawriter, db_owner, db_securityadmin, dbo, guest, INFORMATION_SCHEMA, sys
Stack trace:
com.exasol.adapter.jdbc.JdbcMetadataReader.findSchema(JdbcMetadataReader.java:221)
com.exasol.adapter.jdbc.JdbcMetadataReader.readRemoteMetadata(JdbcMetadataReader.java:46)
com.exasol.adapter.jdbc.JdbcAdapter.readMetadata(JdbcAdapter.java:135)
com.exasol.adapter.jdbc.JdbcAdapter.readMetadata(JdbcAdapter.java:125)
com.exasol.adapter.jdbc.JdbcAdapter.handleCreateVirtualSchema(JdbcAdapter.java:118)
com.exasol.adapter.jdbc.JdbcAdapter.adapterCall(JdbcAdapter.java:44)

It looks like the adapter can only access the default MSSQL schemas like dbo, guest, sys, db_..., but not other created schemas.

Oracle dialect handles timestamps inconsistently

The Oracle dialect handles timestamps differently depending on whether you use a projection or not:

select * from docker_oravs.tstest; 
-->
TS                   TSTZ                TSLTZ
2013-09-11 12:55:05	2013-09-11 12:55:05	2013-09-11 19:55:05
select TS from docker_oravs.tstest; 
-->
TS                                  
11-SEP-13 12.55.05.123456 PM

The reason is that the projection adds TO_CHAR for timestamps and hence applies the NLS_TIMESTAMP_FORMAT.

Get Interval precision from system table in Exasol SQL dialect

Situation

Currently the precision of INTERVAL data types is hard-coded in the Exasol SQL dialect. The necessary information is not available via JDBC.

Acceptance Criteria

  • Ask the JDBC driver developers if it is possible to provide that precision via JDBC.
  • If yes: ticket for JDBC driver is created
  • If no: obtain precision from system tables

Integration Tests: Automatically Close Resultsets

Background:
Currently, the integration test framework allows running queries on virtual tables using executeQuery(); however, the returned ResultSets are never closed. Having too many open ResultSets probably causes problems at some point (they already caused problems with the Teradata JDBC driver when the JdbcMetadataReader did not close its ResultSets, which is now fixed).

Proposed Improvement:
The AbstractIntegrationTest framework should remember the returned ResultSets and close them all automatically in some kind of tearDown method at the end of a test. Alternatively, we should find another way to make sure that ResultSets are closed automatically as quickly as possible.

Error messages are not helpful

Error messages are not helpful with the current JDBC adapter (version 1.1.0). For example, if the connection I am using in a virtual schema is broken, I get this error message:

[Code: 0, SQL State: 22002]  VM error: 
java.lang.Exception: Unexpected error in adapter for following request: {
	"schemaMetadataInfo" : 
	{
		"name" : "VIRTUAL_EXASOL",
		"properties" : 
		{
			"CONNECTION_NAME" : "EXASOLV6",
			"EXA_CONNECTION_STRING" : "10.48.106.23:42627",
			"IMPORT_FROM_EXA" : "true",
			"SCHEMA_NAME" : "TEST_SCHEMA",
			"SQL_DIALECT" : "EXASOL"
		}
	},
	"type" : "createVirtualSchema"
}
Response: 
Stack trace:
com.exasol.adapter.jdbc.JdbcAdapter.adapterCall(JdbcAdapter.java:70)
 (Session: 1617565938970599020)

This is totally useless and unclear without looking at the log files.

In version 1.0.1 I get a backtrace, which makes the error message useful:

[Code: 0, SQL State: 22002]  VM error: 
java.lang.Exception: Unexpected error in adapter: Connection refused (Connection refused)
Stacktrace: com.exasol.jdbc.ConnectFailed: Connection refused (Connection refused)
	at com.exasol.jdbc.AbstractEXAConnection.setupConnection(AbstractEXAConnection.java:1295)
	at com.exasol.jdbc.AbstractEXAConnection.Connect(AbstractEXAConnection.java:1177)
	at com.exasol.jdbc.AbstractEXAConnection.<init>(AbstractEXAConnection.java:435)
	at com.exasol.jdbc.EXAConnection.<init>(EXAConnection.java:38)
	at com.exasol.jdbc.EXADriver.connect(EXADriver.java:159)
	at java.sql.DriverManager.getConnection(DriverManager.java:664)
	at java.sql.DriverManager.getConnection(DriverManager.java:208)
	at com.exasol.adapter.jdbc.JdbcMetadataReader.establishConnection(JdbcMetadataReader.java:88)
	at com.exasol.adapter.jdbc.JdbcMetadataReader.readRemoteMetadata(JdbcMetadataReader.java:33)
	at com.exasol.adapter.jdbc.JdbcAdapter.readMetadata(JdbcAdapter.java:110)
	at com.exasol.adapter.jdbc.JdbcAdapter.readMetadata(JdbcAdapter.java:102)
	at com.exasol.adapter.jdbc.JdbcAdapter.handleCreateVirtualSchema(JdbcAdapter.java:96)
	at com.exasol.adapter.jdbc.JdbcAdapter.adapterCall(JdbcAdapter.java:62)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.exasol.ExaWrapper.runSingleCall(ExaWrapper.java:95)

For following request: {
	"schemaMetadataInfo" : 
	{
		"name" : "VIRTUAL_EXASOL",
		"properties" : 
		{
			"CONNECTION_NAME" : "EXASOLV6",
			"EXA_CONNECTION_STRING" : "10.48.106.23:42627",
			"IMPORT_FROM_EXA" : "true",
			"SCHEMA_NAME" : "TEST_SCHEMA",
			"SQL_DIALECT" : "EXASOL"
		}
	},
	"type" : "createVirtualSchema"
}
Response: 
Stack trace:
com.exasol.adapter.jdbc.JdbcAdapter.adapterCall(JdbcAdapter.java:90)
 (Session: 1617565938970599020)

Support for native import from Oracle

Background:
Virtual Schemas for Oracle data sources currently use IMPORT FROM JDBC internally to retrieve the data. This limits performance and offers potential for improvement. Exasol has a built-in IMPORT FROM ORA command to read data from an Oracle database, but this command is not yet used. It has a performance advantage over the JDBC driver (it is implemented in C++ and can load in parallel) while providing the same features.

Proposed improvement:
To improve the current performance situation, a new IMPORT_FROM_ORA option will be introduced. When activated, the virtual schema will use the IMPORT FROM ORA command to fetch data from the Oracle database. The JDBC connection will still be required to fetch metadata (e.g., schemas, table columns, etc.). Additionally, an Oracle connection string will be needed, which is used by IMPORT FROM ORA.

This feature is inspired by the already existing IMPORT_FROM_EXA property, which allows using the faster IMPORT FROM EXA for Exasol data sources.
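Building on the properties already shown in the EXPLAIN VIRTUAL issue above, a Virtual Schema using the native Oracle import could be created roughly like this; the connection names and schema are placeholders:

CREATE VIRTUAL SCHEMA oravs USING adapter.jdbc_adapter WITH
  SQL_DIALECT         = 'ORACLE'
  CONNECTION_NAME     = 'ORA_JDBC_CONNECTION'  -- JDBC connection, still used to read metadata
  SCHEMA_NAME         = 'LOADER'
  IMPORT_FROM_ORA     = 'true'                 -- fetch data via IMPORT FROM ORA instead of JDBC
  ORA_CONNECTION_NAME = 'ORA_CONNECTION';      -- native Oracle connection used by the import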

Fix Eclipse warnings

Situation

As of version 1.1.0 of the Virtual Schemas, Eclipse shows 24 warnings for the code.

  • Map is a raw type. References to generic type Map<K,V> should be parameterized (5x)
  • Map.Entry is a raw type. References to generic type Map.Entry<K,V> should be parameterized
  • The import com.exasol.adapter.sql.AggregateFunction is never used (3x)
  • The import com.exasol.adapter.sql.Predicate is never used (2x)
  • The import java.util.ArrayList is never used
  • The import java.util.List is never used
  • The method getException(String) from the type ColumnAdapterNotes is never used locally
  • The serializable class AdapterException does not declare a static final serialVersionUID field of type long
  • The serializable class InvalidPropertyException does not declare a static final serialVersionUID field of type long
  • The serializable class MetadataException does not declare a static final serialVersionUID field of type long
  • The value of the local variable generator is not used (2x)
  • The value of the local variable node is not used (1x)
  • The value of the local variable result is not used (3x)
  • Type safety: Unchecked cast from Object to T (1x)

Acceptance Criteria

  1. All Eclipse warnings are fixed

Deprecate use of properties CONNECTION_STRING, USERNAME and PASSWORD

We will deprecate the use of the properties CONNECTION_STRING, USERNAME and PASSWORD.

Background

When these properties are used, the username and password are exposed through the virtual schema requests and responses. It is highly recommended to encapsulate this data in a dedicated connection object instead.
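A sketch of the recommended alternative, which keeps the credentials inside a named connection object instead of the Virtual Schema properties; all names and the JDBC URL are placeholders:

CREATE CONNECTION my_jdbc_connection
  TO 'jdbc:postgresql://remotehost:5432/mydb'
  USER 'app_user' IDENTIFIED BY 'secret';

CREATE VIRTUAL SCHEMA my_vs USING adapter.jdbc_adapter WITH
  SQL_DIALECT     = 'POSTGRESQL'
  CONNECTION_NAME = 'MY_JDBC_CONNECTION'  -- replaces CONNECTION_STRING, USERNAME and PASSWORD
  SCHEMA_NAME     = 'public';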

Todos

  • Remove properties from documentation.
  • Add a warning and explanation to documentation.

Future plans

Throw an error if these properties are used and add the option to ignore this error. This is a breaking change and will be introduced at a later time.

Make type conversions for ORACLE NUMBER type configurable

Current status
In order to use data types of remote systems that are not implemented in EXASOL, these types have to be cast. Currently these cast rules are hard-coded for each dialect.

Proposed change
I would like to make these cast rules configurable at virtual schema creation time, so that the predefined rules can be altered and extended. This way new types (for example from a database extension or module) can be integrated easily. Furthermore, some cast rules are ambiguous (is an Oracle NUMBER best cast to VARCHAR or DECIMAL?) and could then easily be adapted to the particular use case.
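A purely hypothetical sketch of what such a configuration could look like at creation time; the property name and value format below are invented for illustration and do not exist in the adapter:

CREATE VIRTUAL SCHEMA oravs USING adapter.jdbc_adapter WITH
  SQL_DIALECT         = 'ORACLE'
  CONNECTION_NAME     = 'ORA_JDBC_CONNECTION'
  SCHEMA_NAME         = 'LOADER'
  NUMBER_TYPE_MAPPING = 'DECIMAL(36,10)';  -- invented property: map Oracle NUMBER to this Exasol type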

[04000] More than one identity column returned by adapter for table ALL_DATATYPES

In some databases, such as PostgreSQL, it is possible to have more than one identity/autoincrement column per table.
As this is not allowed in EXASOL, the exception

[04000] More than one identity column returned by adapter for table

is thrown.

To solve this rare situation, it would be great to have a global property like IGNORE_AUTOINCREMENT_CONSTRAINTS=true in the adapter definition.
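Following the proposal, the property would be set like any other adapter property. This is only a sketch of the suggested interface, not an existing property:

CREATE VIRTUAL SCHEMA pg_vs USING adapter.jdbc_adapter WITH
  SQL_DIALECT     = 'POSTGRESQL'
  CONNECTION_NAME = 'PG_CONNECTION'
  SCHEMA_NAME     = 'public'
  IGNORE_AUTOINCREMENT_CONSTRAINTS = 'true';  -- proposed property: skip the identity-column check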

Top-level UML model

As a contributor,
I want to see a UML model of the Virtual Schema's top-level architecture
so that I can understand it more quickly and easily.
