Comments (5)
So right now, all jobs on SLURM are indexed using `$SLURM_ARRAY_TASK_ID` and execute job-specific setup by sourcing `{staging_dir}/jobs/$SLURM_ARRAY_TASK_ID/setup.sh`. That means one of three things needs to happen to make custom job names work:
- In pipelines with custom names, canine can generate a `{staging_dir}/aliases` file, with one job alias per line. Jobs can read line `$SLURM_ARRAY_TASK_ID` to determine their name, then continue setup by running `{staging_dir}/jobs/$CANINE_JOB_ALIAS/setup.sh`.
- The jobs directory could jointly encode the task id and custom name (i.e. `{staging_dir}/jobs/0_foo/`) so that jobs can source `{staging_dir}/jobs/${SLURM_ARRAY_TASK_ID}_*/setup.sh`. This would allow jobs to jump straight to the correct directory, while still keeping them human-readable.
- In pipelines with custom names, canine could symlink `{staging_dir}/alias/{custom name}` to `{staging_dir}/jobs/{proper job id}`. That way, jobs can continue to launch as normal, and humans can inspect the workspace by browsing the alias directory.
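Option 1's lookup could be sketched as follows. This is a minimal illustration, not canine's actual implementation: the `CANINE_JOB_ALIAS` variable name and paths are assumptions, and the line indexing assumes task IDs map directly to line numbers (a real implementation would need to account for whether array IDs start at 0 or 1).

```shell
# Sketch of option 1: resolve a job's alias from a shared aliases file.
staging_dir=$(mktemp -d)                 # stand-in for {staging_dir}
printf 'foo\nbar\nbaz\n' > "$staging_dir/aliases"

SLURM_ARRAY_TASK_ID=2                    # normally set by SLURM itself
# Read line $SLURM_ARRAY_TASK_ID of the aliases file
CANINE_JOB_ALIAS=$(sed -n "${SLURM_ARRAY_TASK_ID}p" "$staging_dir/aliases")
echo "$CANINE_JOB_ALIAS"                 # bar
# Setup would then continue via:
#   source "$staging_dir/jobs/$CANINE_JOB_ALIAS/setup.sh"
```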
At the moment, I'm leaning towards option 2, because it seems like the simplest change to achieve the desired goal. It also avoids any uniqueness requirements, because the `outputs/` folder could also follow the same `id_alias` naming scheme.
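Option 2's glob-based lookup could look like this sketch. The directory names and the `CANINE_JOB_ALIAS` export are illustrative assumptions, not canine's real layout; the point is that the task-id prefix makes the glob unambiguous (e.g. `1_*` matches `1_bar` but not `10_baz`).

```shell
# Sketch of option 2: job directories named <task id>_<alias>.
staging_dir=$(mktemp -d)
mkdir -p "$staging_dir/jobs/0_foo" "$staging_dir/jobs/1_bar"
echo 'export CANINE_JOB_ALIAS=bar' > "$staging_dir/jobs/1_bar/setup.sh"

SLURM_ARRAY_TASK_ID=1                    # normally set by SLURM itself
# The glob expands to exactly one job directory for this task id
source "$staging_dir"/jobs/${SLURM_ARRAY_TASK_ID}_*/setup.sh
echo "$CANINE_JOB_ALIAS"                 # bar
```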
@hurrialice @julianhess what are your thoughts?
from canine.
I like the first one best - I just want a table to trace my jobs, and this doesn't really need to be reflected in the file structure.
If we will have a table of aliases, is it possible to combine it with #5?
A possible table format could be:
`<job_id> <custom_name> <job_status>`
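A table in that format could be assembled with standard tools; the ids, names, and statuses below are made up purely for illustration.

```shell
# Sketch of the proposed trace table: <job_id> <custom_name> <job_status>
table=$(paste <(printf '0\n1\n') \
              <(printf 'foo\nbar\n') \
              <(printf 'COMPLETED\nRUNNING\n'))
printf '%s\n' "$table"
```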
Okay, so it seems like overall, nobody really needs the `jobs/` directory to be labeled with entity names, so here's my compromise:
- `jobs/` stays numbered by the array task id
- Custom aliases are set within `setup.sh` like other canine variables
- The `output/` folder will use custom aliases (which requires that the aliases all be unique)
- The job alias will be included in the output dataframe from `Orchestrator.run_pipeline()`, a la #5
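The compromise above could be sketched like this. All names and paths here are assumptions for illustration, not canine's actual interface:

```shell
# Sketch of the compromise: jobs/ stays numbered, setup.sh exports the
# alias, and output/ is keyed by the (unique) alias.
staging_dir=$(mktemp -d)
mkdir -p "$staging_dir/jobs/0"
echo 'export CANINE_JOB_ALIAS=sample_A' > "$staging_dir/jobs/0/setup.sh"

SLURM_ARRAY_TASK_ID=0                    # normally set by SLURM itself
# Jobs launch exactly as before, numbered by the array task id
source "$staging_dir/jobs/$SLURM_ARRAY_TASK_ID/setup.sh"
# Outputs land under the alias rather than the numeric id
mkdir -p "$staging_dir/output/$CANINE_JOB_ALIAS"
ls "$staging_dir/output"                 # sample_A
```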
That is beautiful! 👏
closed in 63cc655