
Comments (4)

laughingman7743 commented on May 24, 2024

Is the value of execution_time_in_millis retrieved from the cursor object the same as the value shown in Athena's web console?
It may be that PyAthena is slow at fetching the data. If the result set is large, PandasCursor is faster.
https://github.com/laughingman7743/PyAthena#pandascursor

For a comparison of Cursor and PandasCursor, see the following gist.
https://gist.github.com/laughingman7743/2e4d83ca4e394dc645e9ea9a45fe78ba


ivssh commented on May 24, 2024

@laughingman7743 Yes, the value of execution_time_in_millis is the same as the value shown in Athena's web console.

I edited the execute method of the Cursor class as shown below to log the execution times. My conclusion of a 2-5 second delay is based on this.

    # Instrumented copy of Cursor.execute.
    # Needs `import logging` and `import time` at module level.
    @synchronized
    def execute(self, operation, parameters=None):
        start = time.time()
        self._reset_state()
        self._query_id = self._execute(operation, parameters)
        # Block until Athena reports a terminal state for the query.
        query_execution = self._poll(self._query_id)
        diff1 = time.time() - start
        logging.info("Time taken for polling to finish: %s", diff1)
        if query_execution.state == AthenaQueryExecution.STATE_SUCCEEDED:
            self._result_set = AthenaResultSet(
                self._connection, self._converter, query_execution, self.arraysize,
                self.retry_exceptions, self.retry_attempt, self.retry_multiplier,
                self.retry_max_delay, self.retry_exponential_base)
            # Measured from the same start, so this includes the polling time.
            diff2 = time.time() - start
            logging.info("Time taken for returning the result %s", diff2)
        else:
            raise OperationalError(query_execution.state_change_reason)


laughingman7743 commented on May 24, 2024

PyAthena only calls the boto3 API internally (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/athena.html). Is there also a delay when you execute the query directly with boto3?
Since PyAthena fetches 1000 rows when creating a ResultSet object, I think that step makes it somewhat slow.
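
To illustrate the paging cost mentioned above, here is a minimal sketch of walking a result set page by page. It assumes `client` is a boto3 Athena client (or any object with the same `get_query_results` method); `iter_result_pages` is a hypothetical helper name and is not part of PyAthena. Each loop iteration is one network round trip, so a large result set costs roughly one call per 1000 rows.

```python
def iter_result_pages(client, query_execution_id, page_size=1000):
    """Yield the rows of one GetQueryResults call at a time.

    `client` is assumed to behave like boto3.client("athena").
    Athena caps MaxResults at 1000 rows per call.
    """
    kwargs = {"QueryExecutionId": query_execution_id, "MaxResults": page_size}
    while True:
        response = client.get_query_results(**kwargs)
        yield response["ResultSet"]["Rows"]
        token = response.get("NextToken")
        if not token:
            # No NextToken means this was the last page.
            break
        kwargs["NextToken"] = token
```

Counting the pages, for example with `sum(1 for _ in iter_result_pages(client, query_id))`, gives a rough idea of how many round trips a full fetch needs.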

How about trying the new JDBC driver? It appears to stream the result set.
https://docs.aws.amazon.com/athena/latest/ug/release-note-2018-08-16.html


ivssh commented on May 24, 2024

There is a delay even when executing directly with boto3. Please refer to this gist.
https://gist.github.com/abhilash1994/3007d93352edea1dfbae04e132da41b7
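
For readers who want to reproduce this kind of measurement, a rough sketch follows. It is not the content of the gist above. It assumes `client` is a boto3 Athena client; `time_query`, `output_location`, and `poll_interval` are hypothetical names chosen for illustration. The function separates wall-clock time spent waiting for the query from time spent fetching the first page, and reports Athena's own engine time next to them.

```python
import time


def time_query(client, sql, output_location, poll_interval=1.0):
    """Return rough timings (in seconds) for one Athena query.

    `client` is assumed to behave like boto3.client("athena").
    """
    start = time.perf_counter()
    query_id = client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = client.get_query_execution(
            QueryExecutionId=query_id
        )["QueryExecution"]
        state = execution["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_interval)
    polled = time.perf_counter()

    if state != "SUCCEEDED":
        reason = execution["Status"].get("StateChangeReason", state)
        raise RuntimeError(reason)

    # Fetch only the first page (at most 1000 rows).
    client.get_query_results(QueryExecutionId=query_id, MaxResults=1000)
    fetched = time.perf_counter()

    engine_ms = execution.get("Statistics", {}).get("EngineExecutionTimeInMillis")
    return {
        "engine_seconds": None if engine_ms is None else engine_ms / 1000.0,
        "poll_seconds": polled - start,
        "first_page_seconds": fetched - polled,
    }
```

Note that `poll_seconds` can exceed `engine_seconds` by up to one `poll_interval` plus request latency, so part of an observed delay may come from the polling loop itself rather than from Athena.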
