
Comments (7)

maswan commented on August 26, 2024

Some of our user communities also think that a benefit of tape is that the files aren't immediately gone in case of an accident, but this is not really the case with Endit. So adding a removal latency might be good for that purpose too.

One suggestion: Aggregate removes to daily files, archive these in the proxynode with a consistent and pool-unique name, like trash/hostname.poolname.yyyymmdd. Then a single pool (chosen by the admin) has a task on the 1st of every month (cron?) that retrieves all the trash files and removes files from month n-2 and older.
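A minimal sketch of the naming and cutoff logic suggested above, assuming Python purely for illustration. The function names are hypothetical and not part of ENDIT, and the "month n-2 and older" rule is taken literally from the suggestion.

```python
from datetime import date
from typing import Tuple


def trash_file_name(hostname: str, poolname: str, day: date) -> str:
    """Build the pool-unique daily name: trash/hostname.poolname.yyyymmdd."""
    return f"trash/{hostname}.{poolname}.{day:%Y%m%d}"


def cutoff_month(today: date) -> Tuple[int, int]:
    """Return (year, month) for month n-2, where n is the month of `today`."""
    months = today.year * 12 + (today.month - 1) - 2
    return months // 12, months % 12 + 1


def is_due(name: str, today: date) -> bool:
    """True if the daily file is from month n-2 or older."""
    # The date stamp is whatever follows the last dot, so dots in the
    # hostname do not matter.
    stamp = name.rsplit(".", 1)[-1]
    year, month = int(stamp[:4]), int(stamp[4:6])
    return (year, month) <= cutoff_month(today)
```

With a task that runs on the 1st of September, files stamped July or earlier would be processed, while August files would wait another month.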

from endit.

ZNikke commented on August 26, 2024

I like the idea of storing the deletion queue in the TSM proxynode and processing it on a selected pool.

However, it'll likely be easier to implement configurable deletion holdoff time and any other knobs if we drive that in tsmdeleter.
Requiring users to configure an explicit deletion cronjob/timer feels unnecessary when we already have tsmdeleter running related tasks, and it will be one extra task to get right when upgrading/installing...

A related question though, if we're looking at a month-long deletion holdoff as a recommendation, is it worth the effort of adding volume-awareness at all?


maswan commented on August 26, 2024

I think we can skip the volume-awareness if we instead batch up a month or more of deletions.

The only thing I think really needs configuring is which of the pools should do the regularly scheduled task. Unless we want to either issue all deletes from all pools or implement some kind of voting/locking process.


ZNikke commented on August 26, 2024

Of course, storing small temporary files via the TSM node will need some additional setup on the TSM side to keep them on disk; we don't want these tiny files to need a tape mount...

I'll do some investigation on what we can detect from the tsm client side wrt setup, then we can discuss what a reasonable behaviour would be.


nsc-jens commented on August 26, 2024

Just collect all delete requests on the pools and then let the pools issue the deletes at a given, configurable time? Then there is no need to reroute delete requests or save them up somewhere, and only very minimal changes to the code are needed.


ZNikke commented on August 26, 2024

> Just collect all delete requests on the pools and then let the pools issue the deletes at a given, configurable time? Then there is no need to reroute delete requests or save them up somewhere, and only very minimal changes to the code are needed.

It would indeed be the simplest thing to do implementation-wise. If we don't need volume awareness (with its per-volume tracking of deletions in progress, etc.), there is little to gain from collecting all deletion requests in a single place.

However, one of the neat things about storing the deletion requests in TSM would be less long-lived state data on the pool/ENDIT instance. We would need to be very clear documentation-wise about what needs to be done when a pool/ENDIT instance is moved (usually as part of hw renewal) in order to not lose deletions. It could be as simple as stating that any files left in trash/ are unprocessed, and that you either have to let the old pool/ENDIT instance run until they've been processed, or that the admin must move those files to the new pool/ENDIT instance...
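As a rough illustration of the pool-local variant, the sketch below picks out the entries in trash/ that have waited longer than a configurable holdoff. It is written in Python for illustration only and makes two assumptions that are not confirmed in this thread: that each pending deletion is one file in trash/, and that the file's modification time marks when the deletion was requested. `due_requests` and `holdoff_seconds` are hypothetical names, not tsmdeleter options.

```python
import time
from pathlib import Path
from typing import List, Optional


def due_requests(trash_dir: Path, holdoff_seconds: float,
                 now: Optional[float] = None) -> List[Path]:
    """Return the files in trash_dir that are older than the holdoff.

    Assumption: one file per pending deletion, with the mtime marking
    when the deletion was requested.
    """
    now = time.time() if now is None else now
    due = [
        p for p in trash_dir.iterdir()
        if p.is_file() and now - p.stat().st_mtime >= holdoff_seconds
    ]
    return sorted(due)
```

Anything not returned stays in trash/ untouched. That also matches the migration note above: whatever is still in the directory is unprocessed and has to either be handled by the old instance or be moved to the new one.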


ZNikke commented on August 26, 2024

Fixed by e3d762f

