Comments (7)
Some of our user communities also think that a benefit of tape is that the files aren't immediately gone in case of an accident, but this is not really the case with Endit. So adding a removal latency might be good for that purpose too.
One suggestion: Aggregate removes to daily files, archive these in the proxynode with a consistent and pool-unique name, like trash/hostname.poolname.yyyymmdd. Then a single pool (chosen by the admin) has a task on the 1st of every month (cron?) that retrieves all the trash files and removes files from month n-2 and older.
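To make the naming and the "month n-2 and older" cutoff concrete, here is a rough Python sketch. It is not ENDIT code: the function names, the example host and pool names, and the `holdoff_months` parameter are made up for illustration, and it only covers the name/date logic, not the actual archive/retrieve via the proxynode.

```python
from datetime import date


def trash_name(hostname, poolname, day):
    """Build a pool-unique daily trash file name: trash/hostname.poolname.yyyymmdd."""
    return f"trash/{hostname}.{poolname}.{day:%Y%m%d}"


def _month_index(d):
    # Count months on a single axis so year boundaries need no special casing.
    return d.year * 12 + d.month - 1


def is_due(name, today, holdoff_months=2):
    """True if the trash file is from month n-2 or older relative to today."""
    # The date stamp is whatever follows the last dot, so hostnames
    # containing dots are fine.
    stamp = name.rsplit(".", 1)[1]
    day = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
    return _month_index(today) - _month_index(day) >= holdoff_months
```

With this, a run on 2024-03-01 would process everything up to and including January 2024 and leave February alone, so a deletion is held for at least one full calendar month.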
from endit.
I like the idea of storing the deletion queue in the TSM proxynode and processing it on a selected pool.
However, it'll likely be easier to implement a configurable deletion holdoff time and any other knobs if we drive that in tsmdeleter.
Requiring users to configure an explicit deletion cronjob/timer feels unnecessary when we already have tsmdeleter running related tasks, and it will be one extra task to get right when upgrading/installing...
A related question though: if we're looking at a month-long deletion holdoff as a recommendation, is it worth the effort of adding volume-awareness at all?
I think we can skip the volume-awareness if we instead batch up a month or more worth of deletions.
The only thing I think really needs configuring is which of the pools should do the regularly scheduled task. Unless we want to either issue all deletes from all pools or implement some kind of voting/locking process.
Of course, storing small temporary files via the TSM node will need some additional setup on the TSM side of things to store them on disk; we don't want these tiny files to need a tape mount...
I'll do some investigation on what we can detect from the tsm client side wrt setup, then we can discuss what a reasonable behaviour would be.
Just collect all delete requests on the pools and then let the pools issue the deletes at a given, configurable time? Then there is no need to reroute delete requests or save them up somewhere, and only minimal changes to the code are needed.
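As a rough illustration of the pool-local variant (a sketch only, not how tsmdeleter works today): assume each queued request is a `(file_id, requested_at)` pair with `requested_at` in epoch seconds. Both that layout and the `holdoff_seconds` knob are assumptions for the example.

```python
def split_due(requests, now, holdoff_seconds):
    """Partition queued delete requests into (due, pending).

    requests is an iterable of (file_id, requested_at) pairs, where
    requested_at and now are epoch seconds. A request is due once it
    has been queued for at least holdoff_seconds.
    """
    due, pending = [], []
    for file_id, requested_at in requests:
        if now - requested_at >= holdoff_seconds:
            due.append(file_id)
        else:
            # Keep the original timestamp so the holdoff is not restarted.
            pending.append((file_id, requested_at))
    return due, pending
```

The pool would issue deletes for the `due` list on its scheduled run and keep `pending` as its local state until the next one.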
> Just collect all delete requests on the pools and then let the pools issue the deletes at a given, configurable time? Then there is no need to reroute delete requests or save them up somewhere, and only minimal changes to the code are needed.
It would indeed be the simplest thing to do implementation-wise. If we don't need to do volume awareness (with the per-volume tracking of deletions-in-progress etc), there is little gain in collecting all deletion requests in a single place.
However, one of the neat things about storing the deletion requests in TSM would be less long-lived state data on the pool/ENDIT instance. We would need to be very clear documentation-wise about what needs to be done when a pool/ENDIT instance is moved (usually as part of hw renewal) in order to not lose deletions. It could be as simple as just stating that any files left in trash/ are unprocessed, and that you either have to let the old pool/ENDIT instance run until they've been processed, or that the admin must move those files to the new pool/ENDIT instance...
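For the documentation, the "anything left in trash/ is unprocessed" rule could be backed by a trivial pre-move check. A minimal sketch, assuming trash/ is a plain directory directly under the pool/ENDIT directory (that layout and the function name are assumptions, not something the code currently guarantees):

```python
from pathlib import Path


def unprocessed_trash(pool_dir):
    """Return the sorted names of files still waiting in pool_dir/trash."""
    trash = Path(pool_dir) / "trash"
    if not trash.is_dir():
        # No trash directory means nothing is pending.
        return []
    return sorted(p.name for p in trash.iterdir() if p.is_file())
```

An empty result would mean the old instance can be retired; otherwise the listed files either need to be processed there first or moved to the new instance.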
Fixed by e3d762f