Comments (6)
If the query contains %
characters it will be doubly escaped. 😭
Your modification seems to be good. 👍
from pyathena.
If you set the _double_percents
option of IdentifierPreparer to False
, an error will occur if the %
character and the query parameter are contained.
import sqlalchemy
from urllib.parse import quote_plus
from sqlalchemy.engine import create_engine
def main():
conn_str = 'awsathena+rest://:@athena.{region_name}.amazonaws.com:443/' \
'{schema_name}?s3_staging_dir={s3_staging_dir}'
engine = create_engine(conn_str.format(
region_name='us-west-2',
schema_name='default',
s3_staging_dir=quote_plus('s3://BUCKET/path/to/')))
session = engine.connect()
query = sqlalchemy.sql.text("select date_parse('20191030', '%Y%m%d')")
print(session.execute(query).fetchall()) # return [(datetime.datetime(2019, 10, 30, 0, 0),)]
query = sqlalchemy.sql.text("select date_parse('20191030', '%Y%m%d'), :word")
print(session.execute(query, word='cat').fetchall()) # raise exception
if __name__ == '__main__':
main()
Traceback (most recent call last):
File "/Users/laughingman7743/github/PyAthena/sqla_query_with_parameter.py", line 28, in <module>
main()
File "/Users/laughingman7743/github/PyAthena/sqla_query_with_parameter.py", line 24, in main
print(session.execute(query, word='cat').fetchall())
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 948, in execute
return meth(self, multiparams, params)
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
compiled_sql, distilled_params
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
context)
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1416, in _handle_dbapi_exception
util.reraise(*exc_info)
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/util/compat.py", line 249, in reraise
raise value
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/engine/base.py", line 1193, in _execute_context
context)
File "/Users/laughingman7743/.virtualenvs/PyAthena-lwXSqVhO/lib/python3.6/site-packages/sqlalchemy/engine/default.py", line 509, in do_execute
cursor.execute(statement, parameters)
File "/Users/laughingman7743/github/PyAthena/pyathena/util.py", line 28, in _wrapper
return wrapped(*args, **kwargs)
File "/Users/laughingman7743/github/PyAthena/pyathena/cursor.py", line 38, in execute
self._query_id = self._execute(operation, parameters)
File "/Users/laughingman7743/github/PyAthena/pyathena/common.py", line 154, in _execute
query = self._formatter.format(operation, parameters)
File "/Users/laughingman7743/github/PyAthena/pyathena/formatter.py", line 112, in format
return (operation % kwargs).strip() if kwargs else operation.strip()
ValueError: unsupported format character 'Y' (0x59) at index 32
If formatting is done regardless of the query parameters as follows, the _double_percents
option seems to be OK with default True
setting.
https://github.com/laughingman7743/PyAthena/blob/master/pyathena/formatter.py#L112
return (operation % kwargs).strip()
from pyathena.
Argh! Of course that was too easy to work!
My use case is that I want to be able to copy sql statements that work in the Athena console and paste them directly into a Jupyter notebook, where the %%sql magic can execute them against Athena unmodified. Users can't be bothered to reformat the text with double percents.
Could it be that the real problem is that the visit_textclause() function in the default SQLCompiler class from sqlalchemy/sql/compiler.py calls self.post_process_text(textclause.text) -- which doubles percents -- REGARDLESS of whether there are any bind parameters or not? This behavior can be overridden and corrected by your AthenaCompiler class in pyathena/sqlalchemy_athena.py.
With the following changes only (the previous change I suggested is not needed):
from sqlalchemy.sql.compiler import BIND_PARAMS_ESC, BIND_PARAMS
class AthenaCompiler(SQLCompiler):
"""PrestoCompiler
https://github.com/dropbox/PyHive/blob/master/pyhive/sqlalchemy_presto.py"""
def visit_char_length_func(self, fn, **kw):
return 'length{0}'.format(self.function_argspec(fn, **kw))
def visit_textclause(self, textclause, **kw):
def do_bindparam(m):
name = m.group(1)
if name in textclause._bindparams:
return self.process(textclause._bindparams[name], **kw)
else:
return self.bindparam_string(name, **kw)
if not self.stack:
self.isplaintext = True
if len(textclause._bindparams) == 0:
return textclause.text
else:
# un-escape any \:params
return BIND_PARAMS_ESC.sub(
lambda m: m.group(1),
BIND_PARAMS.sub(
do_bindparam,
self.post_process_text(textclause.text))
)
I now get this:
>>> import sqlalchemy
>>> engine = sqlalchemy.create_engine(....)
>>> session = engine.connect()
>>> session.execute(sqlalchemy.text("select date_parse('20191030', '%Y%m%d' )")).fetchall()
[(datetime.datetime(2019, 10, 30, 0, 0),)]
>>> session.execute(sqlalchemy.text("select date_parse('20191030', '%Y%m%d' ), 'cat'")).fetchall()
[(datetime.datetime(2019, 10, 30, 0, 0), 'cat')]
>>> session.execute(sqlalchemy.text("select date_parse('20191030', '%Y%m%d' ), :word"), {'word':"cat"}).fetchall()
[(datetime.datetime(2019, 10, 30, 0, 0), 'cat')]
Thank you for your time in looking at this.
from pyathena.
It is the following method to override.
https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/sql/compiler.py#L759
It seems just to add the following conditions.
if len(textclause._bindparams) == 0:
return textclause.text
It feels good. 👍
I will add some test cases and check if there is no problem.
from pyathena.
I fixed it with the following pull request. And a few test cases are added.
#57
Please check it.
from pyathena.
Thank you so much! It works well for me!
from pyathena.
Related Issues (20)
- Sqlalchemy Create ICEBERG table wrong Partition clause HOT 3
- Python 3.10: No module named 'distutils.util' HOT 8
- PandasCursor interfering with pandas.DataFrame writing to s3 HOT 8
- Cannot specify Athena workgroup when using sqlalchemy HOT 2
- Add LIMIT as argument for execute HOT 3
- Support `json_serializer` and `json_deserializer` engine/dialect paremeters HOT 1
- SyntaxError when importing AthenaDialect HOT 5
- pyathena hijacks pandas s3fs HOT 2
- result_reuse_enable and result_reuse_minutes HOT 2
- Athena Reuse Query Results HOT 2
- Supporting FILTER and other similar operations HOT 2
- Support for Python 3.12
- Implement all fsspec specs in the s3 file system HOT 4
- Mypy Error When using Connection.cursor method to instantiate cursor HOT 2
- Add custom filesystem object to arrow engine HOT 2
- Compatibility issue with SQLAlchemy<1.4 HOT 2
- `UUID` in a query gets garbled HOT 3
- Add support for Spark calculations HOT 8
- Add Endpoint_URL param to SQLAlchemy HOT 2
- SQLAlchemy dialect uses deprecated dbapi() method HOT 1
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pyathena.