
Comments (7)

martindurant commented on August 14, 2024

HTTPFileSystem might now return an HTTPStreamFile where previously it returned a raw file-like requests response object. I don't think this changes anything from dask's point of view, except that we don't even try the "let's see if this is smaller than a block" approach. A retry would have to cover the whole request, not each call to read. However, a retry on establishing the connection (here) would make sense.

from dask-examples.

martindurant commented on August 14, 2024

I'm not sure there's much we can do about broken connections; I can't see how they could be any fault of ours. Retries could be built into the HTTPFileSystem, but perhaps it's better to retry the whole task in such cases.


mrocklin commented on August 14, 2024

Is there a good reason to avoid retries in HTTPFileSystem?


martindurant commented on August 14, 2024

No, but a couple of things that make it tricky:

  • it is hard to decide which set of errors should lead to a retry. Perhaps we would have to retry everything
  • some things, like establishing the initial connection, are already retried by requests/urllib
  • if it's a timeout, then a set of retries might take a very long time to fail
  • in the fsspec implementation, there is a non-seekable fallback mode, used when the file size is unavailable, that gives you a requests file-like object rather than an HTTPFile. I don't think we can easily intercept its read methods for the purposes of catching errors.
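
To make the first and third points concrete, here is a minimal sketch of a retry wrapper around a whole operation, assuming the caller picks the retryable exceptions explicitly and caps the total time spent. Nothing here is fsspec or dask API: `retry_call` and all of its parameters are hypothetical names used only for illustration.

```python
import time


def retry_call(func, *, retry_on=(ConnectionError, TimeoutError),
               attempts=3, backoff=0.5, deadline=30.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call func() and retry the whole call on selected exceptions.

    Hypothetical helper: retry_on is the explicit set of retryable errors,
    attempts bounds the number of calls, and deadline (seconds) stops
    retrying once too much time has passed, so slow timeouts cannot pile up.
    """
    start = clock()
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except retry_on:
            out_of_time = clock() - start >= deadline
            if attempt == attempts or out_of_time:
                # Give up and let the caller (or a task-level retry) decide.
                raise
            # Exponential backoff: backoff, 2*backoff, 4*backoff, ...
            sleep(backoff * 2 ** (attempt - 1))
```

Exceptions outside `retry_on` propagate immediately, which is the "be explicit about what is retried" choice; retrying everything would mean passing `retry_on=(Exception,)`.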


martindurant commented on August 14, 2024

This SO answer might be the best way to do it globally: https://stackoverflow.com/a/15431343/3821154. It lets you be explicit about retries following a connection error, and the policy applies to all connections within a session.
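
A session-level retry policy of that kind can be sketched as below, assuming `requests` with its bundled `urllib3`. The function name `make_retrying_session` and the default values are illustrative assumptions, not something taken from fsspec or dask.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def make_retrying_session(total=3, backoff_factor=0.5):
    """Build a requests.Session whose connections retry on connect errors.

    Hypothetical helper; the numbers are placeholders, not recommendations.
    """
    retry = Retry(
        total=total,                    # overall cap on retries
        connect=total,                  # retries for connection-establishment errors
        read=0,                         # no retries for errors after the request was sent
        backoff_factor=backoff_factor,  # sleep between attempts grows with each retry
    )
    adapter = HTTPAdapter(max_retries=retry)
    session = requests.Session()
    # Mounting on both prefixes applies the policy to every URL the session opens.
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Anything that issues its requests through this session inherits the policy, so no individual read method has to be wrapped.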


ahirner commented on August 14, 2024

There has been quite a bit of refactoring of fsspec's HTTP implementation lately.

Are dask tests still flaky?
AFAICS, fsspec now returns an HTTPFile even if range requests are not possible. Does that mean a retry policy in fsspec makes more sense now @martindurant?


martindurant commented on August 14, 2024

(feel free to implement that in a PR, in case you have the time)
