
fido's People

Contributors: analogue, asottile, benbariteau, bplotnick, clarecat, danielpops, frei0, jefftree, jnb, jolynch, lucagiovagnoli, macisamuele, msindwan, prat0318, sjaensch


fido's Issues

Support for adding callbacks

I sometimes find myself wanting to use the callback approach instead of calling future.wait(). Would it be possible to add this feature to fido?
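
For illustration, here is a minimal sketch of what a callback-style wrapper could look like today, built on top of the existing wait() API. run_with_callback is a hypothetical helper, not part of fido, and the URL is made up:

import threading

import fido


def run_with_callback(eventual_result, callback, timeout=None):
    # Call callback(response) from a helper thread once the result is ready.
    def _waiter():
        callback(eventual_result.wait(timeout=timeout))

    thread = threading.Thread(target=_waiter)
    thread.daemon = True
    thread.start()
    return thread


def on_response(response):
    print(response.body)


run_with_callback(fido.fetch('http://example.com/'), on_response, timeout=10)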

fido does not support underscores in domain name

>>> future = fido.fetch('https://i_have_underscores.com/path')
>>> print(future.wait(timeout=2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nail/home/siruitan/pg/yelp-main/virtualenv_run/lib/python2.7/site-packages/crochet/_eventloop.py", line 231, in wait
    result.raiseException()
  File "<string>", line 2, in raiseException
idna.core.InvalidCodepoint: Codepoint U+005F at position 5 of u'fake_platform_partner-testopia' not allowed

Please help; this issue breaks a non-trivial number of our integration tests.
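
For reference, the failure is reproducible with the idna library alone, outside of fido and twisted (the hostname below is made up):

>>> import idna
>>> idna.encode('i_have_underscores.com')  # raises idna.core.InvalidCodepoint for the underscore (U+005F)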

FidoClient NoCurrentExceptionError - CRITICAL :: twisted - Unhandled error in Deferred

We've compared the bravado/swaggerpy synchronous client against fido, with poor results.

Even though requests seem to go through, we've seen roughly 15% slower timings and messages like these in our logs:

[3521917] 2016-04-06 08:00:06,261 :: CRITICAL :: twisted - Unhandled error in Deferred: [3521917] 2016-04-06 08:00:06,265 :: CRITICAL :: twisted -

In the long run the unhandled exceptions seem to accumulate until the system breaks. My guess is that this happens only when fork is used together with the fido client, whose underlying Twisted implementation does not like forking (see also issue Yelp/bravado#179).

After some debugging the exception type is NoCurrentExceptionError.
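
For what it's worth, here is a minimal sketch of the suspected failure mode (the reactor thread started by the first fetch does not survive os.fork(), so requests made in the child misbehave). The URL is illustrative, and this is a guess rather than a confirmed diagnosis:

import os

import fido

fido.fetch('http://example.com/').wait(timeout=5)   # first fetch starts crochet's reactor thread

pid = os.fork()
if pid == 0:
    # child process: the reactor thread was not copied by fork(), so this
    # request can hang or surface as unhandled errors in Deferreds
    fido.fetch('http://example.com/').wait(timeout=5)
    os._exit(0)
os.waitpid(pid, 0)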

Throws TimeoutError if header value is not given as a list

In [22]: fido.fetch("http://google.com", headers={'Accept-Charset': 'utf-8'}).result(timeout=4)
---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)

/nail/home/agarwal/srv/swagger-py/<ipython console> in <module>()

/nail/home/agarwal/mypy/lib/python2.6/site-packages/concurrent/futures/_base.pyc in result(self, timeout)
    407                 return self.__get_result()
    408             else:
--> 409                 raise TimeoutError()
    410
    411     def exception_info(self, timeout=None):

TimeoutError:
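
Wrapping the header value in a list appears to avoid the hang, assuming fido passes the mapping straight through to twisted's Headers, which expects a list of values per header name:

fido.fetch(
    "http://google.com",
    headers={'Accept-Charset': ['utf-8']},   # list-wrapped value instead of a bare string
).result(timeout=4)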

Noisy logging on server timeout

I'm seeing a lot of these:

Unhandled error in EventualResult
Traceback (most recent call last):
  File "/nail/home/bariteau/pg/repl-delay-update-daemon/venv/lib/python3.6/site-packages/twisted/internet/defer.py", line 567, in _startRunCallbacks
    self._runCallbacks()
  File "/nail/home/bariteau/pg/repl-delay-update-daemon/venv/lib/python3.6/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/nail/home/bariteau/pg/repl-delay-update-daemon/venv/lib/python3.6/site-packages/twisted/internet/defer.py", line 500, in errback
    self._startRunCallbacks(fail)
  File "/nail/home/bariteau/pg/repl-delay-update-daemon/venv/lib/python3.6/site-packages/twisted/internet/defer.py", line 567, in _startRunCallbacks
    self._runCallbacks()
--- <exception caught here> ---
  File "/nail/home/bariteau/pg/repl-delay-update-daemon/venv/lib/python3.6/site-packages/twisted/internet/defer.py", line 653, in _runCallbacks
    current.result = callback(current.result, *args, **kw)
  File "/nail/home/bariteau/pg/repl-delay-update-daemon/venv/lib/python3.6/site-packages/fido/fido.py", line 234, in handle_timeout_errors
    "send the response".format(timeout=timeout)
fido.exceptions.HTTPTimeoutError: Connection was closed by fido because the server took more than timeout=None seconds to send the response

With code that looks like:

import fido
from crochet import TimeoutError

future = fido.fetch(
    'http://example.com/resource',
)
try:
    future.wait(timeout=1.0)
except TimeoutError:
    pass
finally:
    future.cancel()

While hitting a server that takes longer than 1 second to respond.

I don't expect this to time out every time, but it happens often enough that it's a problem.
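
One variant worth trying (an assumption, not documented behaviour) is to consume the failure after cancelling, so the EventualResult does not carry an unretrieved error into garbage collection:

import fido
from crochet import TimeoutError

future = fido.fetch('http://example.com/resource')
try:
    future.wait(timeout=1.0)
except TimeoutError:
    pass
finally:
    future.cancel()
    try:
        # drain whatever failure the cancellation produces so it is not logged
        # as "Unhandled error in EventualResult" when the object is collected
        future.wait(timeout=0.1)
    except Exception:
        pass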

Provide a way in the response to fetch the response completion timestamp.

It is sometimes necessary to find the time taken by the client call, but finding it for an async call is hard at the application level because the response might have arrived well before result() is inspected.

I propose adding another attribute to the Response constructor here:

self.finish_timestamp = time.time()

This can then be read by application logic, e.g.:

>>> client = SwaggerClient.from_url("http://localhost:12343/swagger.json", FidoClient())
>>> a = client.hours.getHours(biz_id=1)
>>> time.time()
1443127426.439981
>>> a.future.result().finish_timestamp
1443127423.185333

This shows that finish_timestamp was recorded before result() (the blocking call) was invoked.
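
For example, application code could then measure the true request duration regardless of when result() is called. finish_timestamp is the proposed attribute, not an existing fido API; start_timestamp and the URL are illustrative:

import time

import fido

start_timestamp = time.time()
future = fido.fetch('http://localhost:12343/resource')
response = future.result(timeout=10)
elapsed = response.finish_timestamp - start_timestamp   # proposed attribute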

Timeout should not be None by default?

In d379944 we changed the timeouts from an awesome default of less-than-infinity to None (blocking) in the name of the "principle of least surprise".

I propose we instead use the principle of "least likely to kill the website", which points to a tight default timeout as the best option. Clients that want to make long requests can opt into a longer timeout, and the vast majority of use cases can rely on a good, safe, sane default.
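
For reference, fido.fetch already accepts timeout and connect_timeout parameters (see the fetch signature quoted further down), so callers can opt into tight timeouts explicitly today; the values below are illustrative:

import fido

future = fido.fetch(
    'http://example.com/resource',
    timeout=1.0,           # overall request timeout in seconds
    connect_timeout=0.5,   # connection timeout in seconds
)
response = future.wait(timeout=2.0)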

Duplicate Content-Length in requests

Twisted adds a Content-Length header to HTTP requests in addition to the one that fido adds, which creates a malformed HTTP request for POST requests.

POST /test HTTP/1.1
Connection: close
Content-Length: 17
Content-Length: 17
Content-Type: application/json
Host: localhost:6543
User-Agent: Fido/2.1.0

{"param":"value"}

Crochet does not cancel the deferred properly

While developing PR #28 I encountered an issue with crochet not terminating the twisted deferred correctly when raising the TimeoutError exception. I don't think the fault is on our side, but we should decide how to handle it.

Let's take the following example code as a reference.

Code for reference

def test_eventual_result_timeout(server_url):

    # server incredibly slow
    eventual_result = fido.fetch(server_url + '100sec')

    # make sure no timeout error is thrown here but only on result retrieval
    assert eventual_result.original_failure() is None

    with pytest.raises(crochet.TimeoutError):
        eventual_result.wait(timeout=1.0)

    assert eventual_result.original_failure() is None
    print("Sleeping a bit to make sure a potential exception in the reactor "
          "thread due to connectionLost is trapped in eventual_result")
    time.sleep(5)
    assert eventual_result.original_failure() is None
    print("Finished sleeping. In a moment we will lose references to "
          "eventual_result and hence the deferred will be closed incorrectly, causing "
          "the CRITICAL message to be logged right after this message....")


@crochet.run_in_reactor
def fetch_inner(url, method, headers, body, timeout, connect_timeout):

    bodyProducer, headers = _build_body_producer(body, headers)

    deferred = Agent(reactor, connect_timeout).request(
        method=method,
        uri=url,
        headers=listify_headers(headers),
        bodyProducer=bodyProducer)

    def response_callback(response):
        finished = Deferred()
        response.deliverBody(HTTPBodyFetcher(response, finished))
        return finished

    deferred.addCallback(response_callback)

    return deferred


def fetch(
    url, method='GET', headers=None, body='', timeout=DEFAULT_TIMEOUT,
    connect_timeout=DEFAULT_CONNECT_TIMEOUT,
    content_type=DEFAULT_CONTENT_TYPE, user_agent=DEFAULT_USER_AGENT,
):

    crochet.setup()
    return fetch_inner(url, method, headers, body, timeout, connect_timeout)

Traceback
Everything goes fine (TimeoutError is correctly raised), but the twisted reactor thread fails because the connection is closed incorrectly:

Sleeping a bit to make sure a potential exception in the reactor thread due to
connectionLost is trapped in eventual_result

Finished sleeping. In a moment we will lose references to eventual_result and hence
the deferred will be closed incorrectly, causing the CRITICAL message to
be logged right after this message....

CRITICAL:twisted:Unhandled error in EventualResult
Traceback (most recent call last):
Failure: twisted.web._newclient.ResponseNeverReceived: [<twisted.python.failure.Failure twisted.internet.error.ConnectionLost: Connection to the other side was lost in a non-clean fashion: Connection lost.>]

All assertions are true, showing that no exception is trapped in the EventualResult. The exception happens exactly when eventual_result is garbage collected.

The problem

From Twisted documentation:

If a Deferred is garbage-collected with an unhandled error (i.e. it would call the next errback if there was one), then Twisted will write the error’s traceback to the log file. This means that you can typically get away with not adding errbacks and still get errors logged. Be careful though; if you keep a reference to the Deferred around, preventing it from being garbage-collected, then you may never see the error (and your callbacks will mysteriously seem to have never been called).

In this case the ConnectionLost exception is caused by the garbage collector disposing of the deferred when we lose all references to the EventualResult (and hence the deferred). In fact, nothing is logged as long as eventual_result is alive.

I think that raising TimeoutError here without disposing of the deferred is causing the Agent connection to be closed incorrectly and the exception to be thrown in the reactor thread after raising the TimeoutError.

Why does crochet not dispose of the deferred?
Crochet's choice is to allow multiple retries to collect results, so I guess they would not be OK with destroying the object after the first try. See the crochet docstrings.

Why not handle twisted.internet.error.ConnectionLost with a twisted errback?
The problem is a systematic, erroneous disposal of the deferred object that always causes the connection to be closed incorrectly in the same way. Say I were to handle this on our side like this:

    def handle_timeout_errors(error):
        if error.check(twisted.web.client.ResponseNeverReceived):
            if error.value.reasons[0].check(
                twisted.internet.error.ConnectionLost
            ):
                # we get here if crochet aborts the connection incorrectly
                # when the EventualResult timeout is due. Nothing to raise
                # because crochet will take care of raising a TimeoutError.
                return
        return error

    deferred.addErrback(handle_timeout_errors)

I would mistakenly catch all potential ConnectionLost issues, even ones due to other causes (for example, the server aborting the connection).

Possible approaches

  1. We document the error, let it happen, and let the logger write "CRITICAL" all over our logs. We know why it happens, so it would all be expected.
  2. Let's find a way to correctly destroy the deferred when losing references to it, before garbage collection. I've given this approach a try by changing crochet code and it seems to work (tested with our internal acceptance test suite):
if not self._result_set.is_set():
    self.cancel()
    raise TimeoutError()

I don't think crochet would like to change its entire default behavior on retries by destroying the object after the first try. Maybe an option could be something like this on stackoverflow (destroying the deferred ourselves).
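
For illustration, here is a sketch of the "destroy it ourselves" variant on the fido side: cancel the EventualResult when wait() times out so the pending deferred is not left for the garbage collector. Whether this fully suppresses the CRITICAL log is an assumption, and the endpoint is made up:

import crochet

import fido

eventual_result = fido.fetch('http://localhost:8080/100sec')
try:
    response = eventual_result.wait(timeout=1.0)
except crochet.TimeoutError:
    # cancelling fires the deferred's errback chain now, instead of leaving an
    # unhandled failure to be reported when the deferred is garbage collected
    eventual_result.cancel()
    raise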

Option 1 looks to me like the fast track. I'd go for option 2.

Is fido starting/stopping lots of http factory objects?

I see a lot of

2017-01-18 00:06:08,485 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6c10dd8>'
2017-01-18 00:06:08,724 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6985f60>'
2017-01-18 00:06:10,045 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6985f60>'
2017-01-18 00:06:10,311 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6b15c18>'
2017-01-18 00:06:12,041 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6b15c18>'
2017-01-18 00:06:12,270 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6966978>'
2017-01-18 00:06:13,667 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6966978>'
2017-01-18 00:06:14,246 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e69ef8d0>'
2017-01-18 00:06:15,627 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e69ef8d0>'
2017-01-18 00:06:15,854 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6b15400>'
2017-01-18 00:06:17,335 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e6b15400>'
2017-01-18 00:06:17,566 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e696b438>'
2017-01-18 00:06:18,976 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e696b438>'
2017-01-18 00:06:19,209 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e69ec9b0>'
2017-01-18 00:06:20,488 c2a3a87c4d08    23      twisted INFO    b'Stopping factory <twisted.web.client._HTTP11ClientFactory object at 0x7f61e69ec9b0>'
2017-01-18 00:06:20,997 c2a3a87c4d08    23      twisted INFO    b'Starting factory <twist

in our logs. My guess is that this is something fido is doing, and it's inefficient. I do know we have recently been getting TCPConnectionErrors frequently, and I'm wondering if it's because we keep opening/closing TCP connections instead of just reusing them.
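
For comparison, twisted's own way to reuse connections is a persistent HTTPConnectionPool handed to the Agent. Whether fido exposes a hook to configure this is an assumption, but the sketch below shows what would stop a new _HTTP11ClientFactory (and TCP connection) being created per request:

from twisted.internet import reactor
from twisted.web.client import Agent, HTTPConnectionPool

pool = HTTPConnectionPool(reactor, persistent=True)   # keep connections alive between requests
agent = Agent(reactor, pool=pool)
d = agent.request(b'GET', b'http://localhost:6543/test')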

pyOpenSSL and service_identity should be in an extras_require

Since fido doesn't actually import anything from pyOpenSSL and is meant to be installed as a library, we shouldn't depend on it in install_requires.

Instead we should use extras_require (see docs).

This way we don't have to depend on cryptography, which is a pain to compile.
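
A sketch of what the packaging change could look like; the extra's name ("ssl") is made up for illustration and the dependency lists are abbreviated:

from setuptools import setup

setup(
    name='fido',
    install_requires=[
        'crochet',
        'twisted',
    ],
    extras_require={
        # installed only with `pip install fido[ssl]`
        'ssl': ['pyOpenSSL', 'service_identity'],
    },
)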

Underscore in url causes CRITICAL:twisted:Unhandled error in EventualResult

In [9]: fido.fetch('https://fake_platform_partner-evenitz.dev.yelp.com/checkout_fulfillment/v2/check_availability', timeout=10, method='POST').result(timeout=10).body
CRITICAL:twisted:Unhandled error in EventualResult
Traceback (most recent call last):
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/internet/base.py", line 1194, in run
self.mainLoop()
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/internet/base.py", line 1203, in mainLoop
self.runUntilCurrent()
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/internet/base.py", line 798, in runUntilCurrent
f(_a, *_kw)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/crochet/_eventloop.py", line 416, in runs_in_reactor
d = maybeDeferred(function, _args, *_kwargs)
--- ---
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/internet/defer.py", line 150, in maybeDeferred
result = f(_args, *_kw)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/fido/fido.py", line 100, in fetch_inner
bodyProducer=bodyProducer)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/web/client.py", line 1559, in request
endpoint = self._getEndpoint(parsedURI)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/web/client.py", line 1543, in _getEndpoint
return self._endpointFactory.endpointForURI(uri)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/web/client.py", line 1419, in endpointForURI
tlsPolicy = self._policyForHTTPS.creatorForNetloc(uri.host, uri.port)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/web/client.py", line 876, in creatorForNetloc
trustRoot=self._trustRoot)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/internet/_sslverify.py", line 1233, in optionsForClientTLS
return ClientTLSOptions(hostname, certificateOptions.getContext())
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/internet/_sslverify.py", line 1105, in init
self._hostnameBytes = _idnaBytes(hostname)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/twisted/internet/_sslverify.py", line 87, in _idnaBytes
return idna.encode(text)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/idna/core.py", line 354, in encode
result.append(alabel(label))
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/idna/core.py", line 275, in alabel
check_label(label)
File "/nail/home/spatel/venv_crap/lib/python2.6/site-packages/idna/core.py", line 252, in check_label
raise InvalidCodepoint('Codepoint {0} at position {1} of {2} not allowed'.format(_unot(cp_value), pos+1, repr(label)))
idna.core.InvalidCodepoint: Codepoint U+005F at position 5 of u'fake_platform_partner-evenitz' not allowed
