zapr-athena-client's Introduction

ZAPR AWS Athena Client

ZAPR AWS Athena Client is a Python library for running Presto queries on AWS Athena.

At Zapr we have the largest repository of offline media consumption data, and we try to answer some of the hardest questions of brands, broadcasters and marketers on top of this data. To make this happen we have to churn through terabytes of data in a somewhat interactive manner, and AWS Athena helps the team achieve this. We use this client as middleware to submit queries to Athena, because Athena has a few shortcomings that we have tried to solve through this client. Athena lacks the following:

1. Submitting multiple queries at a time.
2. Insert overwrite support.
3. Data deletion on drop table: dropping a table removes only the schema, not the data.

Another benefit of this client is that Athena can be integrated easily into all our existing data pipelines built on Oozie and Airflow.

Supported Features

  • Submit multiple queries from a single file.
  • Insert overwrite.
  • Drop table (drops the table and deletes the data as well).
  • Submit queries through an AWS Athena workgroup, so that the cost of each query can be tracked.
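
To illustrate the first feature, here is a minimal sketch of how a query file could be split into individual statements before each one is submitted. The function name `split_queries` is hypothetical and is not taken from the client's source. The sketch assumes statements are separated by semicolons and only guards against semicolons inside single-quoted strings. It does not handle SQL comments.

```python
from typing import List


def split_queries(sql_text: str) -> List[str]:
    """Split SQL text into statements on semicolons outside single quotes.

    Hypothetical helper for illustration; the real client may split differently.
    """
    statements: List[str] = []
    current: List[str] = []
    in_quote = False
    for ch in sql_text:
        if ch == "'":
            # Toggle on every quote; an escaped quote ('') toggles twice.
            in_quote = not in_quote
        if ch == ";" and not in_quote:
            stmt = "".join(current).strip()
            if stmt:
                statements.append(stmt)
            current = []
        else:
            current.append(ch)
    # Keep a trailing statement that has no terminating semicolon.
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements
```

Each returned statement could then be submitted to Athena in order, one at a time.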

Quick Start

Prerequisite

  • boto3
  • configparser

Usage

Syntax

python athena_client.py config_file_location workgroup_name query_file_location input_macro1 input_macro2 ...

Install dependencies

pip install -r requirements.txt

Example - 1

python athena_client.py config.ini workgroup_testing_team sample-query-file.sql start_date=2020-09-25 end_date=2020-09-25

Example - 2

python athena_client.py s3://sampe-bucket/sample-prefix/project-1/config.ini workgroup_testing_team s3://sampe-bucket/sample-prefix/project-1/sample-query-file.sql start_date=2020-09-25 end_date=2020-09-25

Via PIP

pip install zapr-athena-client
zapr-athena-client config.ini workgroup_testing_team sample-query-file.sql start_date=2020-09-25 end_date=2020-09-25
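
The commands above take a workgroup name as their second argument. As a rough sketch of what submitting a query under a workgroup can look like with boto3, the function below calls `start_query_execution` with the `WorkGroup` parameter and polls `get_query_execution` until the query reaches a terminal state. The function name `run_in_workgroup` and the polling interval are assumptions made for illustration, not the client's actual implementation. The sketch also assumes the workgroup already has a query result location configured.

```python
import time


def run_in_workgroup(athena, query: str, workgroup: str,
                     poll_seconds: float = 2.0) -> str:
    """Submit one query under a workgroup and wait for a terminal state.

    `athena` is expected to be a boto3 Athena client, for example
    boto3.client("athena"). Hypothetical helper for illustration only.
    """
    response = athena.start_query_execution(QueryString=query,
                                            WorkGroup=workgroup)
    execution_id = response["QueryExecutionId"]
    while True:
        status = athena.get_query_execution(QueryExecutionId=execution_id)
        state = status["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        # Still QUEUED or RUNNING: wait before polling again.
        time.sleep(poll_seconds)
```

Because the workgroup is attached to every execution, query usage can be tracked per workgroup on the AWS side.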

Sample Query

create table sample_db.${table_prefix}_username
WITH (external_location = 's3://sample_db/${table_prefix}_username/',format = 'ORC') as
    select username
    from raw_db.users
    where date between '${start_date}' and '${end_date}';
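
The `${...}` placeholders in the sample query are filled from the `key=value` macros passed on the command line. A minimal sketch of that substitution, assuming Python's standard `string.Template` (which uses the same `${name}` syntax), is shown below. The helper names `parse_macros` and `render_query` are hypothetical, and the client may implement the substitution differently.

```python
from string import Template
from typing import Dict, List


def parse_macros(args: List[str]) -> Dict[str, str]:
    """Turn ['start_date=2020-09-25', ...] into {'start_date': '2020-09-25', ...}."""
    macros: Dict[str, str] = {}
    for arg in args:
        key, sep, value = arg.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {arg!r}")
        macros[key] = value
    return macros


def render_query(query: str, macros: Dict[str, str]) -> str:
    """Replace ${name} placeholders; raises KeyError if a macro is missing."""
    return Template(query).substitute(macros)
```

Using `substitute` rather than `safe_substitute` makes a missing macro fail loudly instead of sending a query with an unresolved placeholder to Athena.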

Disable Insert Overwrite and drop data

By default, this Athena client supports insert overwrite and deletes the underlying data when a drop table query is executed. Add the following configurations to disable these features.

ENABLE_INSERT_OVERWRITE = False
ENABLE_EXTERNAL_TABLE_DROP = False
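
As a sketch of how such flags could be read with `configparser` (listed in the prerequisites), the example below treats both features as enabled unless the configuration says otherwise. The section name `athena` and the function `load_feature_flags` are assumptions made for illustration. The source does not show which section of the config file these keys belong to.

```python
import configparser

# Hypothetical config content; the [athena] section name is an assumption.
SAMPLE_CONFIG = """
[athena]
ENABLE_INSERT_OVERWRITE = False
"""


def load_feature_flags(config_text: str, section: str = "athena") -> dict:
    """Read the two feature flags, defaulting to True when a key is absent."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return {
        "insert_overwrite": parser.getboolean(
            section, "ENABLE_INSERT_OVERWRITE", fallback=True),
        "external_table_drop": parser.getboolean(
            section, "ENABLE_EXTERNAL_TABLE_DROP", fallback=True),
    }


flags = load_feature_flags(SAMPLE_CONFIG)
```

With the sample content above, insert overwrite is disabled and external table drop keeps its default of enabled.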

Contact

For feature requests or bugs, please raise an issue in the issues section.

For anything else, get in touch with us at [email protected]

zapr-athena-client's Issues

Drop table if not exists

AWS Athena throws an exception if we drop a table that does not exist in Athena.
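
One possible way to avoid this exception is to use Athena's `DROP TABLE IF EXISTS` form. The sketch below rewrites a plain drop statement into that form before it is submitted. The helper `with_if_exists` is hypothetical and is not part of the client. It only handles simple `DROP TABLE name` statements.

```python
import re

# Matches a leading "drop table" that is not already followed by "if exists".
_DROP_TABLE = re.compile(
    r"^\s*drop\s+table\b(?!\s+if\s+exists\b)\s*", re.IGNORECASE)


def with_if_exists(statement: str) -> str:
    """Rewrite 'DROP TABLE x' as 'DROP TABLE IF EXISTS x'; leave others as-is."""
    return _DROP_TABLE.sub("DROP TABLE IF EXISTS ", statement, count=1)
```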
