Comments (15)
I'm OK with @rgaudin's proposal, so since the decision is on us, we will implement it:
- store original_schedule_name in requested_tasks and tasks
- expose the original_schedule_name to the API along the schedule_name, which will remain empty in youzimit. This way, we'd expose a truthful situation: there is no schedule_name (because there is no schedule) but there is an original one.
- adapt UI to a missing schedule_name and use the original one.
from zimfarm.
Tentative UI, feedback welcome:
- new_recipe: the schedule has been deleted, so it is no longer a link
- targetedjustice.com_5f8aa391: the schedule is still there
- this situation, where both cases are present at the same time, will usually never occur, since we either have all schedules still present (zimfarm) or no schedules present (youzimit)
from zimfarm.
Indeed, when we migrated from mongo to psql, we linked the [requested]tasks with the schedule that created them.
What was once a string recording the schedule name now points to the name of the linked-schedule.
This has no impact on Zimfarm because schedules are present, but on the youzim.it one, there's no link anymore as soon as you delete the schedule (which we do as soon as we request a task from the schedule).
We figured losing the schedule name was a minor inconvenience because this zimfarm UI is solely used for debugging and was only helping when browsing the pipeline.
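To make the mechanism concrete, here is a minimal sketch of how the name disappears once the link is the only source of it. It uses SQLite in memory instead of PostgreSQL to stay self-contained, and the table layout, the `ON DELETE SET NULL` behaviour and the function name `schedule_names_seen_by_ui` are assumptions for illustration, not the actual Zimfarm schema.

```python
import sqlite3


def schedule_names_seen_by_ui(delete_schedule: bool) -> list:
    """Return (task id, schedule name) rows, resolved through the link only."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE schedule (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Assumption: deleting a schedule nulls the link on its tasks.
        CREATE TABLE task (
            id INTEGER PRIMARY KEY,
            schedule_id INTEGER REFERENCES schedule(id) ON DELETE SET NULL
        );
        INSERT INTO schedule (id, name) VALUES (1, 'new_recipe');
        INSERT INTO task (id, schedule_id) VALUES (10, 1);
        """
    )
    if delete_schedule:
        # What youzim.it does right after requesting the task.
        conn.execute("DELETE FROM schedule WHERE id = 1")
    rows = conn.execute(
        "SELECT task.id, schedule.name FROM task "
        "LEFT JOIN schedule ON schedule.id = task.schedule_id "
        "ORDER BY task.id"
    ).fetchall()
    conn.close()
    return rows
```

With the schedule kept, the task row still resolves to its name; once the schedule is deleted, the join has nothing left to return for the name column.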
We could bring back that feature by either:
- storing the schedule_name in those tables and linking via it instead of the ID
- storing both the schedule_id and the schedule_name. We'd have to maintain consistency on renames.
- adding a blank schedule_name that only gets filled on schedule removal
@benoit74 what do you think?
from zimfarm.
Thanks for the explanation. I agree the schedule name is somewhat useless, especially since clicking on it always just gave a 404 anyways (since the schedule was deleted). On the other hand, it gave a way to discriminate tasks when looking at a long list.
from zimfarm.
A fourth solution ;-)
Add an original_schedule_name field on requested_task and task that is populated only on task creation and used in the API to compute schedule_name when the schedule is missing (i.e. has been deleted).
Benefits:
- more straightforward to populate this field at requested_task and task creation (rather than "not forgetting to set this at schedule deletion" or "maintain consistency")
- name with a lot of meaning
- no change on IDs / relational fields
- no need to maintain this value on schedule renaming: it is clearly the original name; on youzimit the schedule is immediately deleted so the name won't change, and on the "regular" farm we usually keep the schedule so we can continue to use it
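A rough sketch of what this fourth solution could look like, using plain dataclasses rather than the real ORM models. The class and function names (`Schedule`, `Task`, `create_task`, `api_schedule_name`) are hypothetical and only illustrate the "populate at creation, fall back when the schedule is gone" idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Schedule:
    name: str


@dataclass
class Task:
    # Set once at creation and never updated afterwards.
    original_schedule_name: str
    # Becomes None once the schedule has been deleted.
    schedule: Optional[Schedule] = None


def create_task(schedule: Schedule) -> Task:
    """Create a task and record the schedule name as it is right now."""
    return Task(original_schedule_name=schedule.name, schedule=schedule)


def api_schedule_name(task: Task) -> str:
    """Name reported by the API: the live one if linked, else the original."""
    if task.schedule is not None:
        return task.schedule.name
    return task.original_schedule_name
```

As long as the schedule exists, a rename is reflected through the link; the stored original name only matters after deletion, which is why it needs no maintenance.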
from zimfarm.
OK, works for me
from zimfarm.
Should we add another field, schedule_missing, in the task and requested_task API so that the UI knows about it and does not display a non-working link?
I can work on it on Thursday at the latest.
from zimfarm.
We could add the schedule_id, which is actual information that might be useful later.
The UI could react to it to toggle link display.
from zimfarm.
Makes sense.
Btw, I realized that when filtering by schedule name, it will now be necessary to also take into account the original_schedule_name data when there is no more schedule.
from zimfarm.
Another solution would be to delete the schedule somehow later (at the same time as the task?).
from zimfarm.
Indeed, this would probably make a lot of things easier.
We could:
- add a marked_for_deletion field on schedules
- refuse to delete a schedule if there are still associated tasks
- allow marking a schedule for deletion
- when deleting tasks, also delete their schedule marked for deletion if there are no more linked tasks
- bring back integrity constraints on the database (a task must have an associated schedule)
I don't know how this would interact with the fact that we need more auditability of actions performed on the zimfarm, which could also lead to the need to keep older schedule configurations to check how the schedule has been configured.
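The deferred-deletion steps listed above could be sketched as follows, with an in-memory stand-in for the database. Everything here (`Farm`, `ScheduleInUse`, the method names) is hypothetical and only meant to show the order of checks, not an actual Zimfarm implementation.

```python
from dataclasses import dataclass, field
from typing import Dict


class ScheduleInUse(Exception):
    """Raised when deleting a schedule that still has linked tasks."""


@dataclass
class Farm:
    # schedule name -> marked_for_deletion flag
    schedules: Dict[str, bool] = field(default_factory=dict)
    # task id -> name of the schedule that created it (always present)
    tasks: Dict[int, str] = field(default_factory=dict)

    def _has_tasks(self, name: str) -> bool:
        return name in self.tasks.values()

    def delete_schedule(self, name: str) -> None:
        # Refuse while tasks still reference the schedule.
        if self._has_tasks(name):
            raise ScheduleInUse(name)
        del self.schedules[name]

    def mark_for_deletion(self, name: str) -> None:
        self.schedules[name] = True

    def delete_task(self, task_id: int) -> None:
        name = self.tasks.pop(task_id)
        # Remove the schedule once its last task is gone, if it was marked.
        if self.schedules.get(name) and not self._has_tasks(name):
            del self.schedules[name]
```

In this sketch a task can never outlive its schedule, which is the integrity constraint the last bullet refers to.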
from zimfarm.
> Makes sense. Btw, I realized that when filtering by schedule name, it will now be necessary to also take into account the original_schedule_name data when there is no more schedule.
We don't want to do that.
The point of having something clearly named original_schedule_name is to avoid things like this. But indeed, as suggested above, that would have been just on the model and not on the API.
I suggest we expose the original_schedule_name to the API along the schedule_name, which will remain empty. This way, we'd expose a truthful situation: there is no schedule_name (because there is no schedule) but there is an original one.
The UI would adapt to a missing schedule_name and use the original one.
The filters thus need not be updated. We're not using them in the UI anyway, so it would not be visible.
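For illustration, the consumer side of this could look roughly like the sketch below: the API payload carries both fields, and the display logic picks the text and decides whether it can be a link. The payload shape and the helper name `schedule_label` are assumptions, and the real UI is not written in Python; this only shows the decision rule.

```python
from typing import Optional, Tuple, TypedDict


class TaskPayload(TypedDict):
    # Empty (None or "") when the schedule no longer exists.
    schedule_name: Optional[str]
    # Name of the schedule at the time the task was created.
    original_schedule_name: str


def schedule_label(payload: TaskPayload) -> Tuple[str, bool]:
    """Return (text to display, whether to render it as a link)."""
    name = payload["schedule_name"]
    if name:
        # The schedule still exists, so linking to it is safe.
        return name, True
    # No schedule anymore: show the original name as plain text.
    return payload["original_schedule_name"], False
```

Filtering stays untouched in this approach, since schedule_name keeps its strict meaning of "name of an existing schedule".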
> Another solution would be to delete the schedule somehow later (at the same time as the task?).
The thing is that we don't delete the tasks and we don't want to. In particular for youzim.it which is mostly unsupervised, those tasks are useful for statistics or inquiry.
Also, it makes more semantic sense to delete the schedule because its role is just to repeatedly create tasks. So if it's a unique task, schedule has no reason to stay.
from zimfarm.
This regression is annoying and makes it really difficult to find tasks at farm.youzim.it. I would appreciate it if this were fixed soon.
I have no strong opinion about the way to fix it. But from a high level I see that:
- The root recipe to make a task (we call it a schedule) is pretty much useless to keep over time for farm.youzim.it
- Unfortunately this use case has not been thought through 100% (hence this regression)
- Either we support it and make it right OR we stop creating an "exceptional" situation (and we stop deleting schedules).
- I prefer to keep things simple and create exceptions only if absolutely necessary. Here we can have millions of rows in the DB without problems.
@benoit74 @rgaudin Please make a decision and fix this regression.
from zimfarm.
TODO is probably the least useful view for this. DOING/DONE/FAILED would have both the not-linked schedule name as well as the linked task ID
from zimfarm.
Yes, I just wanted to ensure that we are all aligned on this "worst case" (I'm fine with it).
from zimfarm.