
Comments (3)

romw commented on May 31, 2024

Commented by MikeMarsUK on 6 Nov 37316222 08:53 UTC
This would be extremely useful from CPDN's viewpoint. The vast majority of crashes are due to transient problems during the 3 months or so the model needs to run (Windows killing the workunit at shutdown, files locked by antivirus programmes, graphics drivers collapsing, general PC instability).

http://climateapps1.oucs.ox.ac.uk/beta/forum_thread.php?id=84&nowrap=true#506

One problem with current ad-hoc backup strategies arises on multicore and multiproject systems: everything has to be rolled back as a unit, even if only one result crashed.

For reference here is an autobackup tool which was written by RRodway (a participant on the BBC project):
http://bbc.cpdn.org/forum_thread.php?id=2748


romw commented on May 31, 2024

Commented by Didactylos on 17 May 37879861 17:46 UTC
This should be resolved by treating transient errors correctly, and restarting from checkpoints.

There's still a case for a backup feature, though, so I'm leaving this open. Let's reassess during 6.0 alpha testing, and decide whether BOINC's error handling still needs supplementing.


lfield commented on May 31, 2024

As far as I understand, this very old issue suggests retrying a crashed WU on the assumption that the cause was general PC instability. The system already does this, and it limits the number of times a WU can be retried. The only savings would be the bandwidth otherwise spent downloading the WU input files somewhere else, and a reduction in the number of errors reported. On the other hand, a WU might be run multiple times in an unstable environment. My view is that this optimization would add complexity for little real gain. It would be better to fix the underlying instability. The overall system is already designed to work well in such an environment.
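
For readers unfamiliar with the idea discussed in this thread, here is a minimal sketch of "retry from the last checkpoint, with a cap on retries". It is not BOINC code: `TransientError`, `run_with_retries`, and the in-memory `checkpoint` counter are hypothetical names chosen for illustration, and a real client would persist the checkpoint to disk.

```python
class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. a locked file)."""


def run_with_retries(step, total_steps, max_retries=3):
    """Run step(i) for each i in range(total_steps).

    After a transient failure, resume from the last completed step rather
    than from the beginning. Give up once max_retries is exceeded.
    Returns the number of retries that were needed.
    """
    checkpoint = 0  # index of the next step to run; stands in for saved state
    retries = 0
    while checkpoint < total_steps:
        try:
            step(checkpoint)
            checkpoint += 1  # record progress only after the step succeeds
        except TransientError:
            retries += 1
            if retries > max_retries:
                raise  # retry limit reached: report the error
    return retries
```

The cap reflects the trade-off raised above: without it, a work unit could be rerun indefinitely on an unstable machine.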

