Git Product home page Git Product logo

athena-cli's Introduction

Athena CLI

Build Status

Presto-like CLI tool for AWS Athena. The alternative is using the AWS CLI Athena sub-commands.

asciicast

Requirements

A recent version of the aws CLI must be available on the PATH.

Installation

To install using pip run:

$ pip install athena-cli

Or, clone the GitHub repo and run:

$ python setup.py install

Configuration

Only required configuration is AWS credentials.

Usage

$ athena --help
usage: athena [--debug] [--execute <statement>] [--output-format <format>] [--schema <schema>]
              [--profile <profile>] [--region <region>] [--s3-bucket <bucket>]
              [--server-side-encryption] [--version]

Athena interactive console

optional arguments:
  -h, --help            show this help message and exit
  --debug               enable debug mode
  --execute STATEMENT   execute statement in batch mode
  --output-format FORMAT
                        output format for batch mode [ALIGNED, VERTICAL, CSV,
                        TSV, CSV_HEADER, TSV_HEADER, NULL]
  --schema SCHEMA, --database SCHEMA, --db SCHEMA
                        default schema
  --profile PROFILE     AWS profile
  --region REGION       AWS region
  --s3-bucket BUCKET, --bucket BUCKET
                        AWS S3 bucket for query results
  --server-side-encryption, --encryption
                        Use server-side-encryption for query results
  --version             show version info and exit
athena> help

Supported commands:
QUIT
SELECT
ALTER DATABASE <schema>
ALTER TABLE <table>
CREATE DATABASE <schema>
CREATE TABLE <table>
DESCRIBE <table>
DROP DATABASE <schema>
DROP TABLE <table>
MSCK REPAIR TABLE <table>
SHOW COLUMNS FROM <table>
SHOW CREATE TABLE <table>
SHOW DATABASES [LIKE <pattern>]
SHOW PARTITIONS <table>
SHOW TABLES [IN <schema>] [<pattern>]
SHOW TBLPROPERTIES <table>
USE [<catalog>.]<schema>
VALUES row [, ...]

See http://docs.aws.amazon.com/athena/latest/ug/language-reference.html

Example

athena> use sampledb;
athena:sampledb> show tables;
 tab_name
------------
 elb_logs
(1 rows)

Query deb156b5-293e-472d-8897-5ee195b06b11, SUCCEEDED
https://eu-west-1.console.aws.amazon.com/athena/home?force&region=eu-west-1#query/history/deb156b5-293e-472d-8897-5ee195b06b11
Time: 0:00:00, CPU Time: 474ms total, Data Scanned: 0.00B, Cost: $0.00

Troubleshooting

Use the --debug option when launching the athena CLI to get AWS debug output:

$ athena --debug
2017-07-21 10:10:45,477 botocore.credentials [DEBUG] Looking for credentials via: env
2017-07-21 10:10:45,478 botocore.credentials [DEBUG] Looking for credentials via: assume-role
2017-07-21 10:10:45,478 botocore.credentials [DEBUG] Looking for credentials via: shared-credentials-file
2017-07-21 10:10:45,479 botocore.credentials [INFO] Found credentials in shared credentials file: ~/.aws/credentials
...

Turn on debug at the athena> prompt by typing:

athena> set debug true
debug - was: False
now: True

The below error means the version of aws CLI is not recent enough. Upgrade to version 1.10.18 or higher:

Command 'aws sts get-caller-identity --output text --query 'Account' --profile default' returned non-zero exit status 2

Command history is written to ~/.athena_history.

References

License

Athena CLI
Copyright 2017-2018 Guardian News & Media

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

athena-cli's People

Contributors

ianburrell avatar jfsoul avatar kant avatar kenoir avatar lumost avatar satterly avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

athena-cli's Issues

Downloading results is very slow

It seems that when running a query in batch mode, the results are downloaded via the API in small chunks, which is very slow. For instance, using athena-cli to --execute a query that returned 19 MB (50000 rows) took 81 sec (about 250k / sec), but the actual Athena finished in 3 seconds and aws s3 cp downloads the results in 1 second.

Any reason you don't just use S3 to fetch the results?

Python 3.x region_name error

bash-3.2$ athena --profile personal --db sampledb
Traceback (most recent call last):
  File "/usr/local/bin/athena", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/site-packages/athena_cli.py", line 373, in main
    shell = AthenaShell(profile, region, bucket, db=args.schema, debug=args.debug)
  File "/usr/local/lib/python3.6/site-packages/athena_cli.py", line 85, in __init__
    self.athena = Athena(profile, region, bucket, debug)
  File "/usr/local/lib/python3.6/site-packages/athena_cli.py", line 229, in __init__
    self.athena = session.client('athena')
  File "/usr/local/lib/python3.6/site-packages/boto3/session.py", line 263, in client
    aws_session_token=aws_session_token, config=config)
  File "/usr/local/lib/python3.6/site-packages/botocore/session.py", line 836, in create_client
    client_config=config, api_version=api_version)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 71, in create_client
    verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 281, in _get_client_args
    verify, credentials, scoped_config, client_config, endpoint_bridge)
  File "/usr/local/lib/python3.6/site-packages/botocore/args.py", line 45, in get_client_args
    endpoint_url, is_secure, scoped_config)
  File "/usr/local/lib/python3.6/site-packages/botocore/args.py", line 111, in compute_client_args
    service_name, region_name, endpoint_url, is_secure)
  File "/usr/local/lib/python3.6/site-packages/botocore/client.py", line 354, in resolve
    service_name, region_name)
  File "/usr/local/lib/python3.6/site-packages/botocore/regions.py", line 122, in construct_endpoint
    partition, service_name, region_name)
  File "/usr/local/lib/python3.6/site-packages/botocore/regions.py", line 141, in _endpoint_for_partition
    if self._region_match(partition, region_name):
  File "/usr/local/lib/python3.6/site-packages/botocore/regions.py", line 159, in _region_match
    return re.compile(partition['regionRegex']).match(region_name)
TypeError: cannot use a string pattern on a bytes-like object

Headers are not being properly removed to do being encoded to bytes

I believe this commit: e88e32f broke the auto-header removal in yield_rows. The issue seems to be that the column names are encoded to 'utf8' in the headers array. This means they'll be a byte string like b'hostname'. But the data columns aren't byte strings, they're unicode strings like u'hostname'. At least in python3, these two values will not compare equal, so yield_rows thinks the first row of data is not the header row and prints it out.

AttributeError with recent versions of cmd2

Thanks for such a great tool.

It seems recent versions of cmd2 have broken compatibility again:

$ athena --schema default
athena:default> show databases;
EXCEPTION of type 'AttributeError' occurred with message: ''Statement' object has no attribute 'full_parsed_statement''
To enable full traceback, run the following command:  'set debug true'
athena:default>
$ pipenv graph
athena-cli==0.1.8
  - boto3 [required: Any, installed: 1.7.78]
    - botocore [required: <1.11.0,>=1.10.78, installed: 1.10.78]
      - docutils [required: >=0.10, installed: 0.14]
      - jmespath [required: >=0.7.1,<1.0.0, installed: 0.9.3]
      - python-dateutil [required: >=2.1,<3.0.0, installed: 2.7.3]
        - six [required: >=1.5, installed: 1.11.0]
    - jmespath [required: >=0.7.1,<1.0.0, installed: 0.9.3]
    - s3transfer [required: <0.2.0,>=0.1.10, installed: 0.1.13]
      - botocore [required: >=1.3.0,<2.0.0, installed: 1.10.78]
        - docutils [required: >=0.10, installed: 0.14]
        - jmespath [required: >=0.7.1,<1.0.0, installed: 0.9.3]
        - python-dateutil [required: >=2.1,<3.0.0, installed: 2.7.3]
          - six [required: >=1.5, installed: 1.11.0]
  - cmd2 [required: Any, installed: 0.9.3]
    - colorama [required: Any, installed: 0.3.9]
    - pyperclip [required: >=1.5.27, installed: 1.6.4]
    - wcwidth [required: Any, installed: 0.1.7]
  - tabulate [required: >=0.8.1, installed: 0.8.2]
pyparsing==2.2.0

If I downgrade cmd2 to 0.8.0, it works.

Output formats are not well-formatted

First, the argument to --output-format does not seem to be validated: --output-format foo will still work and give the default format. Second, it seems the argument is case-sensitive; CSV works but csv silently gives the default. It would be nice if I could use lower case.

Third, it seems there may be some byte vs. string / encoding issues, as many strings come out as the Python repr for binary data (b'...').

Here is the Athena ELB log example database:

athena --execute "SELECT * FROM "sampledb"."elb_logs" limit 3;"

 b'request_timestamp'        | b'elb_name'   | b'request_ip'   | b'request_port'   | b'backend_ip'   | b'backend_port'   | b'request_processing_time'   | b'backend_processing_time'   | b'client_response_time'   | b'elb_response_code'   | b'backend_response_code'   | b'received_bytes'   | b'sent_bytes'   | b'request_verb'   | b'url'                            | b'protocol'   | b'user_agent'                                                                                                       | b'ssl_cipher'      | b'ssl_protocol'
-----------------------------+---------------+-----------------+-------------------+-----------------+-------------------+------------------------------+------------------------------+---------------------------+------------------------+----------------------------+---------------------+-----------------+-------------------+-----------------------------------+---------------+---------------------------------------------------------------------------------------------------------------------+--------------------+-------------------
 request_timestamp           | elb_name      | request_ip      | request_port      | backend_ip      | backend_port      | request_processing_time      | backend_processing_time      | client_response_time      | elb_response_code      | backend_response_code      | received_bytes      | sent_bytes      | request_verb      | url                               | protocol      | user_agent                                                                                                          | ssl_cipher         | ssl_protocol
 2015-01-05T20:00:01.206255Z | elb_demo_002  | 240.220.175.143 | 26144             | 172.37.79.92    | 443               | 0.00166                      | 8.9E-4                       | 0.00167                   | 200                    | 200                        | 0                   | 1150            | GET               | http://www.example.com/images/989 | HTTP/1.1      | "Mozilla/5.0 (X11; CrOS x86_64 8172.45.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.64 Safari/537.36" | -                  | -
 2015-01-05T20:00:01.612598Z | elb_demo_008  | 244.189.63.245  | 25515             | 172.42.219.87   | 443               | 5.05E-4                      | 0.001697                     | 0.001113                  | 200                    | 200                        | 0                   | 3123            | GET               | http://www.example.com/jobs/862   | HTTP/1.1      | "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36"     | -                  | -
 2015-01-05T20:00:02.793335Z | elb_demo_008  | 249.110.119.93  | 26779             | 172.45.67.52    | 8888              | 0.001881                     | 0.001082                     | 0.001147                  | 500                    | 500                        | 0                   | 1978            | GET               | https://www.example.com/jobs/118  | HTTP/1.1      | "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36"     | DHE-RSA-AES128-SHA | TLSv1.2

athena --execute "SELECT * FROM "sampledb"."elb_logs" limit 3;" --output-format CSV

"b'request_timestamp'","b'elb_name'","b'request_ip'","b'request_port'","b'backend_ip'","b'backend_port'","b'request_processing_time'","b'backend_processing_time'","b'client_response_time'","b'elb_response_code'","b'backend_response_code'","b'received_bytes'","b'sent_bytes'","b'request_verb'","b'url'","b'protocol'","b'user_agent'","b'ssl_cipher'","b'ssl_protocol'"
"b'2015-01-06T16:00:00.516940Z'","b'elb_demo_003'","b'241.93.49.158'","b'1459'","b'172.34.104.223'","b'443'","b'7.13E-4'","b'4.53E-4'","b'0.001112'","b'200'","b'200'","b'0'","b'744'","b'GET'","b'http://www.example.com/articles/194'","b'HTTP/1.1'","b'""Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36""'","b'-'","b'-'"
"b'2015-01-06T16:00:00.902953Z'","b'elb_demo_008'","b'240.124.85.193'","b'17382'","b'172.50.75.155'","b'443'","b'7.1E-4'","b'6.09E-4'","b'0.001432'","b'200'","b'200'","b'0'","b'329'","b'GET'","b'http://www.example.com/images/813'","b'HTTP/1.1'","b'""Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36""'","b'-'","b'-'"
"b'2015-01-06T16:00:01.206255Z'","b'elb_demo_005'","b'248.193.50.89'","b'32739'","b'172.32.214.237'","b'80'","b'3.2E-4'","b'4.37E-4'","b'0.001608'","b'200'","b'200'","b'0'","b'4740'","b'GET'","b'http://www.example.com/images/692'","b'HTTP/1.1'","b'""Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Safari/602.1.50""'","b'-'","b'-'"

It's not clear what --output-format CSV_HEADER supposed to do, but it's adding a duplicate header:

"b'request_timestamp'","b'elb_name'","b'request_ip'","b'request_port'","b'backend_ip'","b'backend_port'","b'request_processing_time'","b'backend_processing_time'","b'client_response_time'","b'elb_response_code'","b'backend_response_code'","b'received_bytes'","b'sent_bytes'","b'request_verb'","b'url'","b'protocol'","b'user_agent'","b'ssl_cipher'","b'ssl_protocol'"
"b'request_timestamp'","b'elb_name'","b'request_ip'","b'request_port'","b'backend_ip'","b'backend_port'","b'request_processing_time'","b'backend_processing_time'","b'client_response_time'","b'elb_response_code'","b'backend_response_code'","b'received_bytes'","b'sent_bytes'","b'request_verb'","b'url'","b'protocol'","b'user_agent'","b'ssl_cipher'","b'ssl_protocol'"
"b'2015-01-01T08:00:00.516940Z'","b'elb_demo_009'","b'240.136.98.149'","b'25858'","b'172.51.67.62'","b'8888'","b'9.99E-4'","b'8.11E-4'","b'0.001561'","b'200'","b'200'","b'0'","b'428'","b'GET'","b'https://www.example.com/articles/746'","b'HTTP/1.1'","b'""Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/602.1.50 (KHTML, like Gecko) Version/10.0 Safari/602.1.50""'","b'DHE-RSA-AES128-SHA'","b'TLSv1.2'"
"b'2015-01-01T08:00:00.902953Z'","b'elb_demo_008'","b'244.46.184.108'","b'27758'","b'172.31.168.31'","b'443'","b'6.39E-4'","b'0.001471'","b'3.73E-4'","b'200'","b'200'","b'0'","b'4231'","b'GET'","b'https://www.example.com/jobs/688'","b'HTTP/1.1'","b'""Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1""'","b'DHE-RSA-AES128-SHA'","b'TLSv1.2'"
"b'2015-01-01T08:00:01.206255Z'","b'elb_demo_008'","b'240.120.203.212'","b'26378'","b'172.37.170.107'","b'8888'","b'0.001174'","b'4.97E-4'","b'4.89E-4'","b'200'","b'200'","b'0'","b'2075'","b'GET'","b'http://www.example.com/articles/290'","b'HTTP/1.1'","b'""Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246""'","b'-'","b'-'"

--output-format TSV_HEADER is also odd:

b'request_timestamp'            b'elb_name'     b'request_ip'   b'request_port'         b'backend_ip'   b'backend_port'        b'request_processing_time'      b'backend_processing_time'      b'client_response_time'       b'elb_response_code'     b'backend_response_code'        b'received_bytes'       b'sent_bytes'   b'request_verb'        b'url'                                  b'protocol'     b'user_agent'                                                                                                                          b'ssl_cipher'         b'ssl_protocol'
request_timestamp               elb_name        request_ip      request_port            backend_ip      backend_port           request_processing_time         backend_processing_time         client_response_time          elb_response_code        backend_response_code           received_bytes          sent_bytes      request_verb           url                                     protocol        user_agent                                                                                                                             ssl_cipher            ssl_protocol
2015-01-06T08:00:00.516940Z     elb_demo_009    241.185.58.78   30989                   172.42.85.244   443                    1.66E-4                         0.001622                        4.72E-4                       200                      200                             0                       3124            GET                    http://www.example.com/jobs/692         HTTP/1.1        "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1"                                                       -                     -
2015-01-06T08:00:00.902953Z     elb_demo_002    241.61.132.60   6917                    172.51.77.197   80                     9.66E-4                         1.35E-4                         1.63E-4                       200                      200                             0                       1676            GET                    https://www.example.com/jobs/941        HTTP/1.1        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246"      DHE-RSA-AES128-SHA    TLSv1.2
2015-01-06T08:00:01.206255Z     elb_demo_007    244.176.139.111 21236                   172.55.53.108   443                    2.56E-4                         0.0011                          0.001404                      200                      200                             0                       4974            GET                    https://www.example.com/images/809      HTTP/1.1        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246"      DHE-RSA-AES128-SHA    TLSv1.2

Error when starting up

> athena --profile aws_profile --region eu-central-1 --schema database --s3-bucket s3://bucket/adhoc/

An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:10: mismatched 
input '<EOF>'. Expecting: 'ALTER', 'ANALYZE', 'CALL', 'COMMIT', 'CREATE', 'DEALLOCATE', 'DELETE', 'DESC', 
'DESCRIBE', 'DROP', 'EXECUTE', 'EXPLAIN', 'GRANT', 'INSERT', 'PREPARE', 'RESET', 'REVOKE', 'ROLLBACK', 'SET', 
'SHOW', 'START', 'UNLOAD', 'UPDATE', 'USE', <query>

CSV writer is crashing with UTF-8 strings

When trying to use athena-cli in batch mode with CSV output, a exception is raised because of UTF-8 characters in the results:

Traceback (most recent call last):
  File "/usr/local/bin/athena", line 11, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/site-packages/athena_cli.py", line 402, in main
    batch.execute(statement=args.execute)
  File "/usr/local/lib/python2.7/site-packages/athena_cli.py", line 58, in execute
    csv_writer.writerows([row for row in self.athena.yield_rows(results, headers)])
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 13: ordinal not in range(128)

The problem seems to lie in the native csv library of Python 2.x which does not support UTF-8 strings by default.

Error on Redhat 7.4

I installed athenacli using pip3 and when i run athena i get the following error
athena
Traceback (most recent call last):
File "/usr/bin/athena", line 7, in
from athena_cli import main
File "/usr/lib/python3.4/site-packages/athena_cli.py", line 69, in
del cmd.Cmd.do_show # "show" is an Athena command
AttributeError: do_show

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.